Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Mamba-ST: State Space Model for Efficient Style Transfer

Filippo Botti, Alex Ergasti, Leonardo Rossi, Tomaso Fontanini, Claudio Ferrari, Massimo Bertozzi, Andrea Prati

2024-09-16 · IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2025 · Style Transfer
Paper · PDF · Code (official)

Abstract

The goal of style transfer is to generate, given a content image and a style source, a new image that preserves the content while adopting the artistic representation of the style source. Most state-of-the-art architectures use transformers or diffusion-based models for this task, despite their heavy computational burden: transformers rely on self- and cross-attention layers with a large memory footprint, while diffusion models require long inference times. To overcome these limitations, this paper explores a novel design of Mamba, an emergent State-Space Model (SSM), called Mamba-ST, to perform style transfer. To do so, we adapt Mamba's linear equation to simulate the behavior of cross-attention layers, which combine two separate embeddings into a single output, while drastically reducing memory usage and time complexity. We modified Mamba's inner equations to accept inputs from, and combine, two separate data streams. To the best of our knowledge, this is the first attempt to adapt the equations of SSMs to a vision task like style transfer without requiring any additional module such as cross-attention or custom normalization layers. An extensive set of experiments demonstrates the superiority and efficiency of our method compared to transformers and diffusion models, with improved quality in terms of both ArtFID and FID metrics. Code is available at https://github.com/FilippoBotti/MambaST.
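The core idea in the abstract, feeding two data streams through one SSM recurrence so that one stream plays the role of the "keys/values" in cross-attention, can be illustrated with a toy sketch. This is not the paper's actual architecture (see the official repository for that); all projections, the fixed decay `A_bar`, and the function name `ssm_cross_fusion` are illustrative assumptions. The point is only the mechanism: the input and output matrices B and C of the discretized linear SSM are computed from the style tokens, while the scanned input x comes from the content tokens, giving a linear-time, constant-memory fusion of the two streams.

```python
import numpy as np

def ssm_cross_fusion(content, style, d_state=4, seed=0):
    """Toy sketch of cross-attention-like fusion via an SSM scan.

    State update of the discretized linear SSM:
        h[t] = A_bar * h[t-1] + outer(x[t], B[t]),   y[t] = h[t] @ C[t]
    Here B and C are derived from the *style* tokens while x[t] is the
    *content* token, so a single linear scan mixes the two streams with
    O(d * d_state) memory instead of an L x L attention map.
    """
    rng = np.random.default_rng(seed)
    L, d = content.shape
    # Hypothetical learned projections (random here, for illustration only).
    W_B = rng.standard_normal((d, d_state)) / np.sqrt(d)
    W_C = rng.standard_normal((d, d_state)) / np.sqrt(d)
    A_bar = 0.9  # fixed stable decay standing in for the discretized A

    B = style @ W_B          # (L, d_state): input gates from the style stream
    C = style @ W_C          # (L, d_state): output maps from the style stream
    h = np.zeros((d, d_state))
    out = np.empty_like(content)
    for t in range(L):
        # Inject the content token into the state, gated by style-derived B.
        h = A_bar * h + np.outer(content[t], B[t])
        out[t] = h @ C[t]    # style-conditioned readout
    return out
```

Note how the recurrence touches each token once, so cost grows linearly in sequence length L, which is the efficiency argument the abstract makes against quadratic self-/cross-attention.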

Results

Task            Dataset   Metric   Value   Model
Style Transfer  WikiArt   ArtFID   27.11   Mamba-ST

Related Papers

Transferring Styles for Reduced Texture Bias and Improved Robustness in Semantic Segmentation Networks (2025-07-14)
AnyI2V: Animating Any Conditional Image with Motion Control (2025-07-03)
Hita: Holistic Tokenizer for Autoregressive Image Generation (2025-07-03)
SA-LUT: Spatial Adaptive 4D Look-Up Table for Photorealistic Style Transfer (2025-06-16)
Fine-Grained Control over Music Generation with Activation Steering (2025-06-11)
Training-Free Identity Preservation in Stylized Image Generation Using Diffusion Models (2025-06-07)
Towards Better Disentanglement in Non-Autoregressive Zero-Shot Expressive Voice Conversion (2025-06-04)
SVGenius: Benchmarking LLMs in SVG Understanding, Editing and Generation (2025-06-03)