TasksSotADatasetsPapersMethodsSubmitAbout
Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Explore

Notable BenchmarksAll SotADatasetsPapersMethods

Community

Submit ResultsAbout

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Methods/Spatial Feature Transform

Spatial Feature Transform

Computer VisionIntroduced 20008 papers
Source Paper

Description

Spatial Feature Transform, or SFT, is a layer that generates affine transformation parameters for spatial-wise feature modulation, and was originally proposed within the context of image super-resolution. A Spatial Feature Transform (SFT) layer learns a mapping function M\mathcal{M}M that outputs a modulation parameter pair (γ,β)(\mathbf{\gamma}, \mathbf{\beta})(γ,β) based on some prior condition Ψ\PsiΨ. The learned parameter pair adaptively influences the outputs by applying an affine transformation spatially to each intermediate feature maps in an SR network. During testing, only a single forward pass is needed to generate the HR image given the LR input and segmentation probability maps.

More precisely, the prior Ψ\PsiΨ is modeled by a pair of affine transformation parameters (γ,β)(\mathbf{\gamma}, \mathbf{\beta})(γ,β) through a mapping function M:Ψ↦(γ,β)\mathcal{M}: \Psi \mapsto(\mathbf{\gamma}, \mathbf{\beta})M:Ψ↦(γ,β). Consequently,

y^=Gθ(x∣γ,β),(γ,β)=M(Ψ)\hat{\mathbf{y}}=G_{\mathbf{\theta}}(\mathbf{x} \mid \mathbf{\gamma}, \mathbf{\beta}), \quad(\mathbf{\gamma}, \mathbf{\beta})=\mathcal{M}(\Psi)y^​=Gθ​(x∣γ,β),(γ,β)=M(Ψ)

After obtaining (γ,β)(\mathbf{\gamma}, \mathbf{\beta})(γ,β) from conditions, the transformation is carried out by scaling and shifting feature maps of a specific layer:

SFT⁡(F∣γ,β)=γ⊙F+β\operatorname{SFT}(\mathbf{F} \mid \mathbf{\gamma}, \mathbf{\beta})=\mathbf{\gamma} \odot \mathbf{F}+\mathbf{\beta}SFT(F∣γ,β)=γ⊙F+β

where F\mathbf{F}F denotes the feature maps, whose dimension is the same as γ\gammaγ and β\mathbf{\beta}β, and ⊙\odot⊙ is referred to element-wise multiplication, i.e., Hadamard product. Since the spatial dimensions are preserved, the SFT layer not only performs feature-wise manipulation but also spatial-wise transformation.

Papers Using This Method

JAFAR: Jack up Any Feature at Any Resolution2025-06-10HeightLane: BEV Heightmap guided 3D Lane Detection2024-08-15Arc2Face: A Foundation Model for ID-Consistent Human Faces2024-03-18Boosting Cross-Quality Face Verification using Blind Face Restoration2023-08-15Variable-Rate Deep Image Compression through Spatially-Adaptive Feature Transform2021-08-21Towards Real-World Blind Face Restoration with Generative Facial Prior2021-01-11Blind Super-Resolution With Iterative Kernel Correction2019-04-06Recovering Realistic Texture in Image Super-resolution by Deep Spatial Feature Transform2018-04-09