Towards Robust Image-in-Audio Deep Steganography

Jaume Ros, Margarita Geleta, Jordi Pons, Xavier Giro-i-Nieto

2023-03-09 | Image Reconstruction

Abstract

The field of steganography has seen a surge of interest thanks to recent advances in AI-powered techniques, particularly in multimodal setups that conceal a signal within a signal of a different nature. All steganographic methods aim for perceptual transparency, robustness, and large embedding capacity, goals that often conflict and that classical methods have struggled to reconcile. This paper extends an existing image-in-audio deep steganography method with a focus on improving its robustness. The proposed enhancements are modifications to the loss function, use of the Short-Time Fourier Transform (STFT), redundancy in the encoding process for error correction, and buffering of additional information in the pixel subconvolution operation. The results show that our approach outperforms the existing method in both robustness and perceptual transparency.
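The abstract names redundancy in the encoding process as one route to robustness. As a rough intuition for why replication helps, here is a minimal toy sketch in Python. It is not the authors' encoder: the function names `embed_replicated` and `recover_averaged` are hypothetical, the payload is simply added to a stand-in magnitude spectrogram, copies are tiled along the width axis as an illustrative choice, and the receiver is assumed to know the clean container, which a learned decoder would not need.

```python
import numpy as np


def embed_replicated(container_mag, payload, copies):
    """Add `copies` side-by-side replicas of `payload` to a magnitude array.

    Toy additive embedding (an assumption for illustration only).
    """
    h, w = payload.shape
    if container_mag.shape[0] < h or container_mag.shape[1] < w * copies:
        raise ValueError("container too small for the requested replication")
    stego = container_mag.copy()
    for k in range(copies):
        stego[:h, k * w:(k + 1) * w] += payload
    return stego


def recover_averaged(stego_mag, container_mag, payload_shape, copies):
    """Recover the payload by averaging all replicas of the residual."""
    h, w = payload_shape
    residual = stego_mag - container_mag
    replicas = [residual[:h, k * w:(k + 1) * w] for k in range(copies)]
    # Averaging independent noisy copies lowers the noise variance.
    return np.mean(replicas, axis=0)


def reconstruction_mse(copies, noise_std=0.1, seed=0):
    """Mean squared error of the recovered payload under additive noise."""
    rng = np.random.default_rng(seed)
    payload = rng.random((16, 16))
    container = rng.random((16, 64))
    noise = rng.normal(0.0, noise_std, size=container.shape)
    stego = embed_replicated(container, payload, copies) + noise
    recovered = recover_averaged(stego, container, payload.shape, copies)
    return float(np.mean((recovered - payload) ** 2))


if __name__ == "__main__":
    for copies in (1, 2, 4):
        print(copies, reconstruction_mse(copies))
```

Under independent additive noise, averaging more replicas should shrink the reconstruction error, at the cost of using more of the container for the same payload. That is the capacity-versus-robustness trade-off the abstract alludes to.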

Results

Task                  Dataset    Metric  Value  Model
Image Reconstruction  Audio Set  SSIM    0.88   Ours (STFT: magnitude, L1l=1) W-Replicate
Image Reconstruction  Audio Set  SSIM    0.86   Ours (STFT: magnitude, L1l=1) WS-Replicate
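The results are reported as SSIM (structural similarity) between the original and the revealed image, where 1.0 means identical images. The Python sketch below computes a simplified, single-window version of the SSIM formula over the whole image. Standard SSIM averages the same expression over local windows, so this is only an illustration of the metric and will not reproduce the values above. The function name `global_ssim` is hypothetical, and the constants 0.01 and 0.03 are the commonly used defaults.

```python
import numpy as np


def global_ssim(x, y, data_range=1.0):
    """Single-window SSIM over the whole image (a simplification).

    Standard SSIM applies this formula per local window and averages
    the results; here one window covers the entire image.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    if x.shape != y.shape:
        raise ValueError("images must have the same shape")

    # Stabilizing constants, scaled by the dynamic range of the pixels.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2

    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()

    luminance = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    structure = (2 * cov_xy + c2) / (var_x + var_y + c2)
    return float(luminance * structure)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    original = rng.random((32, 32))
    revealed = np.clip(original + rng.normal(0.0, 0.2, original.shape), 0.0, 1.0)
    print(global_ssim(original, original))
    print(global_ssim(original, revealed))
```

For comparable numbers, a windowed implementation from an established image-processing library would be the safer choice.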
