Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Group Normalization

General · Introduced 2018 · 55 papers
Source Paper: Group Normalization (Wu & He, 2018)

Description

Group Normalization (GN) is a normalization layer that divides the channels into groups and normalizes the features within each group. GN does not exploit the batch dimension, so its computation is independent of batch size. When each group contains a single channel (G = C), GN is equivalent to Instance Normalization; with a single group (G = 1), it reduces to Layer Normalization.

As motivation for the method, many classical feature descriptors such as SIFT and HOG are group-wise representations that involve group-wise normalization. For example, a HOG vector is the concatenation of several spatial cells, where each cell is represented by a normalized orientation histogram.

Formally, Group Normalization is defined as:

$$\mu_{i} = \frac{1}{m}\sum_{k\in\mathcal{S}_{i}} x_{k}$$

$$\sigma_{i}^{2} = \frac{1}{m}\sum_{k\in\mathcal{S}_{i}} \left(x_{k} - \mu_{i}\right)^{2}$$

$$\hat{x}_{i} = \frac{x_{i} - \mu_{i}}{\sqrt{\sigma_{i}^{2} + \epsilon}}$$

Here $x$ is the feature computed by a layer and $i$ is an index. A Group Norm layer computes $\mu$ and $\sigma$ over a set $\mathcal{S}_{i}$ defined as:

$$\mathcal{S}_{i} = \left\{k \mid k_{N} = i_{N},\ \left\lfloor \frac{k_{C}}{C/G} \right\rfloor = \left\lfloor \frac{i_{C}}{C/G} \right\rfloor \right\}$$

Here $G$ is the number of groups, a pre-defined hyper-parameter ($G = 32$ by default), and $C/G$ is the number of channels per group. $\lfloor\cdot\rfloor$ is the floor operation, and the final condition means that the indices $i$ and $k$ lie in the same group of channels, assuming the channels of each group are stored sequentially along the $C$ axis.
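The equations above can be sketched in a few lines of NumPy. This is a minimal illustration, not the reference implementation: the function name `group_norm` is made up here, and the learned per-channel affine parameters $\gamma$ and $\beta$ that follow the normalization in practice are omitted for clarity.

```python
import numpy as np

def group_norm(x, G=32, eps=1e-5):
    """Group Normalization over an (N, C, H, W) tensor.

    Illustrative sketch of the equations above; the learned
    affine parameters (gamma, beta) are omitted.
    """
    N, C, H, W = x.shape
    assert C % G == 0, "C must be divisible by G"
    # Split the C channels into G groups of C/G channels each.
    g = x.reshape(N, G, C // G, H, W)
    # mu_i and sigma^2_i are computed per sample over each group's
    # channels and all spatial positions -- never over the batch axis.
    mu = g.mean(axis=(2, 3, 4), keepdims=True)
    var = g.var(axis=(2, 3, 4), keepdims=True)
    x_hat = (g - mu) / np.sqrt(var + eps)
    return x_hat.reshape(N, C, H, W)
```

Because no statistics are shared across the batch axis, normalizing a single sample gives the same result as normalizing it inside a larger batch, and setting $G = C$ (one channel per group) reproduces Instance Normalization.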

Papers Using This Method

Dynamic Group Normalization: Spatio-Temporal Adaptation to Evolving Data Statistics (2025-01-01)
Rethinking Normalization Strategies and Convolutional Kernels for Multimodal Image Fusion (2024-11-15)
Unsupervised Adaptive Normalization (2024-09-07)
Exploring the Efficacy of Group-Normalization in Deep Learning Models for Alzheimer's Disease Classification (2024-04-01)
Training-Free Pretrained Model Merging (2024-03-04)
ELA: Efficient Local Attention for Deep Convolutional Neural Networks (2024-03-02)
On Sensitivity and Robustness of Normalization Schemes to Input Distribution Shifts in Automatic MR Image Diagnosis (2023-06-23)
Adaptive Sparse Convolutional Networks with Global Context Enhancement for Faster Object Detection on Drone Images (2023-03-25)
Making Batch Normalization Great in Federated Deep Learning (2023-03-12)
On the Ideal Number of Groups for Isometric Gradient Propagation (2023-02-07)
MGTUNet: An new UNet for colon nuclei instance segmentation and quantification (2022-10-20)
Kernel Normalized Convolutional Networks for Privacy-Preserving Machine Learning (2022-09-30)
Training a universal instance segmentation network for live cell images of various cell types and imaging modalities (2022-07-28)
Understanding and Improving Group Normalization (2022-07-05)
Domain Adaptation and Active Learning for Fine-Grained Recognition in the Field of Biodiversity (2021-10-22)
Exploring Heterogeneous Characteristics of Layers in ASR Models for More Efficient Training (2021-10-08)
Scalable deeper graph neural networks for high-performance materials property prediction (2021-09-25)
NanoBatch Privacy: Enabling fast Differentially Private learning on the IPU (2021-09-24)
Benchmarking the Robustness of Instance Segmentation Models (2021-09-02)
Effect of Pre-Training Scale on Intra- and Inter-Domain Full and Few-Shot Transfer Learning for Natural and Medical X-Ray Chest Images (2021-05-31)