Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Conditional Batch Normalization

General · Introduced 2000 · 145 papers
Source Paper

Description

Conditional Batch Normalization (CBN) is a class-conditional variant of batch normalization. The key idea is to predict the $\gamma$ and $\beta$ of the batch normalization from an embedding, e.g. a language embedding in VQA. CBN enables the linguistic embedding to manipulate entire feature maps by scaling them up or down, negating them, or shutting them off. CBN has also been used in GANs to allow class information to affect the batch normalization parameters.

Consider a single convolutional layer with batch normalization module $\text{BN}\left(F_{i,c,h,w} \mid \gamma_c, \beta_c\right)$ for which pretrained scalars $\gamma_c$ and $\beta_c$ are available. We would like to directly predict these affine scaling parameters from, e.g., a language embedding $\mathbf{e}_q$. At the start of training, these parameters must stay close to the pretrained values in order to recover the original ResNet model, as a poor initialization could significantly deteriorate performance. Unfortunately, it is difficult to initialize a network to output the pretrained $\gamma$ and $\beta$. For these reasons, the authors instead predict changes $\delta\gamma_c$ and $\delta\beta_c$ to the frozen original scalars, since it is straightforward to initialize a neural network to produce an output with zero mean and small variance.
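This zero-initialization trick can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation; the names `make_delta_mlp`, `embed_dim`, and `hidden_dim` are hypothetical, and the point is only that a zero-initialized output layer makes the predicted delta exactly zero at the start of training, so the effective parameters equal the pretrained $\gamma$ and $\beta$:

```python
import numpy as np

def make_delta_mlp(embed_dim, hidden_dim, num_channels, rng):
    """One-hidden-layer MLP whose output layer is zero-initialized,
    so the predicted per-channel delta is 0 at the start of training."""
    W1 = rng.normal(0.0, 0.01, size=(embed_dim, hidden_dim))
    b1 = np.zeros(hidden_dim)
    W2 = np.zeros((hidden_dim, num_channels))  # zero init -> zero output
    b2 = np.zeros(num_channels)

    def mlp(e_q):
        h = np.maximum(e_q @ W1 + b1, 0.0)  # ReLU hidden layer
        return h @ W2 + b2                  # one delta per channel

    return mlp

rng = np.random.default_rng(0)
delta_gamma = make_delta_mlp(embed_dim=16, hidden_dim=32, num_channels=8, rng=rng)
e_q = rng.normal(size=16)   # toy stand-in for a question embedding
deltas = delta_gamma(e_q)   # all zeros at initialization
```

Once training starts, gradients flow into `W2` and `b2` and the deltas move away from zero, while the frozen pretrained scalars remain untouched.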

The authors use a one-hidden-layer MLP to predict these deltas from the question embedding $\mathbf{e}_q$ for all feature maps within the layer:

$$\Delta\beta = \text{MLP}\left(\mathbf{e}_q\right)$$

$$\Delta\gamma = \text{MLP}\left(\mathbf{e}_q\right)$$

So, given a feature map with $C$ channels, these MLPs output a vector of size $C$. We then add these predictions to the $\beta$ and $\gamma$ parameters:

β^_c=β_c+Δβ_c\hat{\beta}\_{c} = \beta\_{c} + \Delta\beta\_{c}β^​_c=β_c+Δβ_c

γ^_c=γ_c+Δγ_c\hat{\gamma}\_{c} = \gamma\_{c} + \Delta\gamma\_{c}γ^​_c=γ_c+Δγ_c

Finally, the updated $\hat{\beta}$ and $\hat{\gamma}$ are used as parameters for the batch normalization: $\text{BN}\left(F_{i,c,h,w} \mid \hat{\gamma}_c, \hat{\beta}_c\right)$. The authors freeze all ResNet parameters, including $\gamma$ and $\beta$, during training. A ResNet consists of four stages of computation, each subdivided into several residual blocks; in each block, the authors apply CBN to the three convolutional layers.
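The full CBN computation can be sketched end to end. This is a minimal NumPy sketch under simplifying assumptions (training-mode batch statistics, no running averages); the function name and shapes are illustrative, not the authors' API. With zero deltas it reduces to ordinary batch normalization with the frozen pretrained scalars:

```python
import numpy as np

def conditional_batch_norm(F, gamma, beta, d_gamma, d_beta, eps=1e-5):
    """Batch-normalize F of shape (N, C, H, W) per channel, using the
    frozen pretrained gamma/beta shifted by the predicted deltas:
    gamma_hat = gamma + d_gamma, beta_hat = beta + d_beta."""
    mean = F.mean(axis=(0, 2, 3), keepdims=True)   # per-channel batch mean
    var = F.var(axis=(0, 2, 3), keepdims=True)     # per-channel batch variance
    F_norm = (F - mean) / np.sqrt(var + eps)
    gamma_hat = (gamma + d_gamma).reshape(1, -1, 1, 1)
    beta_hat = (beta + d_beta).reshape(1, -1, 1, 1)
    return gamma_hat * F_norm + beta_hat

rng = np.random.default_rng(1)
N, C, H, W = 4, 8, 5, 5
F = rng.normal(loc=2.0, scale=3.0, size=(N, C, H, W))
gamma, beta = np.ones(C), np.zeros(C)        # frozen pretrained scalars
d_gamma, d_beta = np.zeros(C), np.zeros(C)   # zero deltas -> plain BN
out = conditional_batch_norm(F, gamma, beta, d_gamma, d_beta)
```

In the conditional setting, `d_gamma` and `d_beta` would come from the delta-predicting MLPs applied to the language embedding, letting the condition rescale or shift each of the $C$ feature maps.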

Papers Using This Method

ParaGAN: A Scalable Distributed Training Framework for Generative Adversarial Networks (2024-11-06)
Unsupervised Panoptic Interpretation of Latent Spaces in GANs Using Space-Filling Vector Quantization (2024-10-27)
RATLIP: Generative Adversarial CLIP Text-to-Image Synthesis Based on Recurrent Affine Transformations (2024-05-13)
Data-driven Crop Growth Simulation on Time-varying Generated Images using Multi-conditional Generative Adversarial Networks (2023-12-06)
On quantifying and improving realism of images generated with diffusion (2023-09-26)
Precision-Recall Divergence Optimization for Generative Modeling with GANs and Normalizing Flows (2023-09-21)
A Strategic Framework for Optimal Decisions in Football 1-vs-1 Shot-Taking Situations: An Integrated Approach of Machine Learning, Theory-Based Modeling, and Game Theory (2023-07-27)
Pyrus Base: An Open Source Python Framework for the RoboCup 2D Soccer Simulation (2023-07-22)
Diffusion Models Beat GANs on Image Classification (2023-07-17)
Diversity is Strength: Mastering Football Full Game with Interactive Reinforcement Learning of Multiple AIs (2023-06-28)
Rosetta Neurons: Mining the Common Units in a Model Zoo (2023-06-15)
Toward more accurate and generalizable brain deformation estimators for traumatic brain injury detection with unsupervised domain adaptation (2023-06-08)
FOOCTTS: Generating Arabic Speech with Acoustic Environment for Football Commentator (2023-06-07)
Action valuation of on- and off-ball soccer players based on multi-agent deep reinforcement learning (2023-05-29)
Is Centralized Training with Decentralized Execution Framework Centralized Enough for MARL? (2023-05-27)
Adaptive action supervision in reinforcement learning from real-world multi-agent demonstrations (2023-05-22)
An Empirical Study on Google Research Football Multi-agent Scenarios (2023-05-16)
The MuSe 2023 Multimodal Sentiment Analysis Challenge: Mimicked Emotions, Cross-Cultural Humour, and Personalisation (2023-05-05)
SportsMOT: A Large Multi-Object Tracking Dataset in Multiple Sports Scenes (2023-04-11)
VARS: Video Assistant Referee System for Automated Soccer Decision Making from Multiple Views (2023-04-10)