LayerDrop

General · Introduced 2019

Description

LayerDrop is a form of structured dropout for Transformer models that has a regularization effect during training and allows for efficient pruning at inference time. It randomly drops entire layers from the Transformer. Under the "every other" strategy, pruning with a rate $p$ means dropping the layers at depth $d$ such that $d \equiv 0 \pmod{\left\lfloor \frac{1}{p} \right\rfloor}$.
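As a concrete illustration, here is a minimal sketch in plain Python (framework-agnostic; the function names and the 0-indexed layer convention are assumptions for illustration, not the reference implementation):

```python
import math
import random

def layers_to_prune(num_layers: int, p: float) -> list[int]:
    # "Every other" pruning strategy: with rate p, drop the layers at
    # depth d satisfying d ≡ 0 (mod floor(1/p)); depths are 0-indexed here.
    step = math.floor(1 / p)
    return [d for d in range(num_layers) if d % step == 0]

def forward_with_layerdrop(x, layers, p: float, training: bool):
    # Training: each layer is skipped independently with probability p,
    # which is the structured-dropout regularization effect.
    # Inference: every layer in `layers` runs deterministically, so a
    # pruned sub-network is obtained simply by omitting dropped layers.
    for layer in layers:
        if training and random.random() < p:
            continue  # drop this entire layer for this forward pass
        x = layer(x)
    return x
```

For example, with 12 layers and $p = 0.5$, `layers_to_prune(12, 0.5)` selects depths 0, 2, 4, 6, 8, 10, i.e. every other layer.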

Papers Using This Method