Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


ShakeDrop

General · Introduced 2018 · 3 papers
Source Paper

Description

ShakeDrop regularization extends Shake-Shake regularization and can be applied not only to ResNeXt but also to ResNet, WideResNet, and PyramidNet. ShakeDrop is given as

$$G(x) = x + (b_l + \alpha - b_l\alpha)\,F(x), \quad \text{in train-fwd}$$
$$G(x) = x + (b_l + \beta - b_l\beta)\,F(x), \quad \text{in train-bwd}$$
$$G(x) = x + E\left[b_l + \alpha - b_l\alpha\right]F(x), \quad \text{in test}$$

where $b_l$ is a Bernoulli random variable with probability $P(b_l = 1) = E[b_l] = p_l$ given by the linear decay rule in each layer, and $\alpha$ and $\beta$ are independent uniform random variables drawn per element.

The most effective ranges of $\alpha$ and $\beta$ were found experimentally to differ from those of Shake-Shake; they are $\alpha = 0$, $\beta \in [0, 1]$ and $\alpha \in [-1, 1]$, $\beta \in [0, 1]$.
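The forward-pass scaling above can be sketched in plain NumPy. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the backward pass (which substitutes an independent $\beta$ for $\alpha$) is omitted since NumPy has no autograd, and the linear decay rule with end probability 0.5 is assumed from the stochastic-depth convention.

```python
import numpy as np

def linear_decay(l, L, p_L=0.5):
    """Assumed linear decay rule (stochastic-depth style): keep-probability
    falls linearly from 1.0 at the input to p_L at the deepest layer L."""
    return 1.0 - (l / L) * (1.0 - p_L)

def shakedrop_forward(x, F_x, p_l, alpha_range=(-1.0, 1.0),
                      training=True, rng=None):
    """Forward pass of one ShakeDrop residual branch (sketch).

    training: scale F(x) by b_l + alpha - b_l*alpha, which equals 1 when
              b_l = 1 (branch kept intact) and alpha when b_l = 0
              (branch perturbed).
    test:     scale by E[b_l + alpha - b_l*alpha]
              = p_l + E[alpha] * (1 - p_l).
    """
    rng = rng or np.random.default_rng()
    if training:
        b = rng.binomial(1, p_l)                       # Bernoulli gate b_l
        alpha = rng.uniform(*alpha_range, size=F_x.shape)  # per-element alpha
        scale = b + alpha - b * alpha
    else:
        e_alpha = 0.5 * (alpha_range[0] + alpha_range[1])  # mean of uniform
        scale = p_l + e_alpha * (1.0 - p_l)
    return x + scale * F_x
```

With the recommended $\alpha \in [-1, 1]$, $E[\alpha] = 0$, so the test-time scale reduces to $p_l$, matching the expectation in the third equation.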

Papers Using This Method

Circumventing Outliers of AutoAugment with Knowledge Distillation (2020-03-25)
RandAugment: Practical automated data augmentation with a reduced search space (2019-09-30)
ShakeDrop Regularization for Deep Residual Learning (2018-02-07)