Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Manifold Mixup

General · Introduced 2018 · 26 papers
Source Paper

Description

Manifold Mixup is a regularization method that encourages neural networks to predict less confidently on interpolations of hidden representations. It leverages semantic interpolations as an additional training signal, obtaining neural networks with smoother decision boundaries at multiple levels of representation. As a result, neural networks trained with Manifold Mixup learn class-representations with fewer directions of variance.

Consider training a deep neural network $f(x) = f_k(g_k(x))$, where $g_k$ denotes the part of the neural network mapping the input data to the hidden representation at layer $k$, and $f_k$ denotes the part mapping that hidden representation to the output $f(x)$. Training $f$ using Manifold Mixup is performed in five steps:

(1) Select a random layer $k$ from a set of eligible layers $S$ in the neural network. This set may include the input layer $g_0(x)$.

(2) Process two random data minibatches $(x, y)$ and $(x', y')$ as usual, until reaching layer $k$. This provides us with two intermediate minibatches $(g_k(x), y)$ and $(g_k(x'), y')$.

(3) Perform Input Mixup on these intermediate minibatches. This produces the mixed minibatch:

$$(\tilde{g}_k, \tilde{y}) = \left(\text{Mix}_\lambda(g_k(x), g_k(x')),\; \text{Mix}_\lambda(y, y')\right),$$

where $\text{Mix}_\lambda(a, b) = \lambda \cdot a + (1 - \lambda) \cdot b$. Here, $(y, y')$ are one-hot labels, and the mixing coefficient $\lambda \sim \text{Beta}(\alpha, \alpha)$, as in mixup. For instance, $\alpha = 1.0$ is equivalent to sampling $\lambda \sim U(0, 1)$.
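Concretely, $\text{Mix}_\lambda$ is a plain convex combination applied to both features and labels. A minimal NumPy sketch (the 3-class labels and the seed are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def mix(a, b, lam):
    """Mix_lambda(a, b) = lam * a + (1 - lam) * b."""
    return lam * a + (1.0 - lam) * b

# One-hot labels for a hypothetical 3-class problem.
y  = np.array([1.0, 0.0, 0.0])   # class 0
y2 = np.array([0.0, 0.0, 1.0])   # class 2

# lambda ~ Beta(alpha, alpha); alpha = 1.0 reduces to Uniform(0, 1).
lam = rng.beta(1.0, 1.0)

y_mixed = mix(y, y2, lam)
# A convex combination of one-hot labels is a valid probability vector.
assert np.isclose(y_mixed.sum(), 1.0)
```

The same `mix` function is applied unchanged to the hidden activations $g_k(x)$ and $g_k(x')$, with a single $\lambda$ shared between features and labels.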

(4) Continue the forward pass in the network from layer $k$ until the output, using the mixed minibatch $(\tilde{g}_k, \tilde{y})$.

(5) This output is used to compute the loss value and gradients that update all the parameters of the neural network.
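The five steps can be sketched end to end with a toy two-layer network, where the eligible set is $S = \{0, 1\}$: $k = 0$ mixes the raw inputs (ordinary Input Mixup), and $k = 1$ mixes the hidden representation. All sizes, weights, and the choice $\alpha = 2.0$ are illustrative assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions for a two-layer MLP (hypothetical, for illustration only).
D_IN, D_HID, N_CLASSES = 4, 8, 3
W1 = rng.normal(size=(D_IN, D_HID))
W2 = rng.normal(size=(D_HID, N_CLASSES))

def g_k(x, k):
    """Forward pass from the input up to layer k."""
    if k == 0:
        return x                      # g_0(x) = x (input layer)
    return np.maximum(x @ W1, 0.0)    # hidden representation (ReLU)

def f_k(h, k):
    """Forward pass from layer k to the output logits."""
    if k == 0:
        h = np.maximum(h @ W1, 0.0)
    return h @ W2

def manifold_mixup_forward(x, y, x2, y2, alpha=2.0):
    """Steps (1)-(4): return mixed logits and mixed soft labels."""
    k = rng.integers(0, 2)                 # (1) pick k from S = {0, 1}
    h, h2 = g_k(x, k), g_k(x2, k)          # (2) run both minibatches to layer k
    lam = rng.beta(alpha, alpha)           # (3) Input Mixup at layer k
    h_mix = lam * h + (1.0 - lam) * h2
    y_mix = lam * y + (1.0 - lam) * y2
    logits = f_k(h_mix, k)                 # (4) continue forward to the output
    return logits, y_mix

def soft_cross_entropy(logits, targets):
    """Step (5): loss against the mixed (soft) labels."""
    z = logits - logits.max(axis=-1, keepdims=True)
    log_p = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    return -(targets * log_p).sum(axis=-1).mean()

# Two hypothetical minibatches of 5 examples with one-hot labels.
x, x2 = rng.normal(size=(5, D_IN)), rng.normal(size=(5, D_IN))
y  = np.eye(N_CLASSES)[rng.integers(0, N_CLASSES, 5)]
y2 = np.eye(N_CLASSES)[rng.integers(0, N_CLASSES, 5)]

logits, y_mix = manifold_mixup_forward(x, y, x2, y2)
loss = soft_cross_entropy(logits, y_mix)   # gradients of this loss would update W1, W2
```

In practice the gradient step in (5) is handled by an autodiff framework; this sketch only traces the forward computation to show where the mixing happens relative to layer $k$.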

Papers Using This Method

MEDUSA: A Multimodal Deep Fusion Multi-Stage Training Framework for Speech Emotion Recognition in Naturalistic Conditions (2025-06-11)
A few-shot Label Unlearning in Vertical Federated Learning (2024-10-14)
PreMix: Addressing Label Scarcity in Whole Slide Image Classification with Pre-trained Multiple Instance Learning Aggregators (2024-08-02)
SynerMix: Synergistic Mixup Solution for Enhanced Intra-Class Cohesion and Inter-Class Separability in Image Classification (2024-03-21)
Mixture of Mixups for Multi-label Classification of Rare Anuran Sounds (2024-03-14)
Improved Automatic Diabetic Retinopathy Severity Classification Using Deep Multimodal Fusion of UWF-CFP and OCTA Images (2023-10-03)
ShuffleMix: Improving Representations via Channel-Wise Shuffle of Interpolated Hidden States (2023-05-30)
On the Effectiveness of Hybrid Pooling in Mixup-Based Graph Learning for Language Processing (2022-10-06)
Set-based Meta-Interpolation for Few-Task Meta-Learning (2022-05-20)
Enhancing Cross-lingual Transfer by Manifold Mixup (2022-05-09)
Learning to Classify Open Intent via Soft Labeling and Manifold Mixup (2022-04-16)
STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation (2022-03-20)
Noisy Feature Mixup (2021-10-05)
Coded-InvNet for Resilient Prediction Serving Systems (2021-06-11)
Distance Metric-Based Learning with Interpolated Latent Features for Location Classification in Endoscopy Image and Video (2021-03-15)
Robust Pollen Imagery Classification with Generative Modeling and Mixup Training (2021-02-25)
Logit As Auxiliary Weak-supervision for More Reliable and Accurate Prediction (2021-01-01)
Regularizing Recurrent Neural Networks via Sequence Mixup (2020-11-27)
PointMixup: Augmentation for Point Clouds (2020-08-14)
Remix: Rebalanced Mixup (2020-07-08)