Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


RevNet

Computer Vision · Introduced 2017 · 9 papers
Source Paper

Description

A Reversible Residual Network, or RevNet, is a variant of a ResNet in which each layer's activations can be reconstructed exactly from the next layer's. The activations of most layers therefore need not be stored in memory during backpropagation. The result is a network architecture whose activation storage requirement is independent of depth, and typically at least an order of magnitude smaller than that of an equally sized ResNet.

RevNets are composed of a series of reversible blocks. The units in each layer are partitioned into two groups, denoted $x_1$ and $x_2$; the authors find that partitioning along the channel dimension works best. Each reversible block takes inputs $(x_1, x_2)$ and produces outputs $(y_1, y_2)$ according to the following additive coupling rules, inspired by the transformation in NICE (non-linear independent components estimation), with residual functions $F$ and $G$ analogous to those in standard ResNets:

$$y_1 = x_1 + F(x_2)$$
$$y_2 = x_2 + G(y_1)$$

Each layer’s activations can be reconstructed from the next layer’s activations as follows:

$$x_2 = y_2 - G(y_1)$$
$$x_1 = y_1 - F(x_2)$$
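The coupling and its inversion can be sketched in a few lines of NumPy. Here `F` and `G` are hypothetical element-wise stand-ins for the convolutional residual subnetworks of a real RevNet; the round trip shows that the inputs are recovered exactly (up to floating-point rounding), which is why activations need not be stored:

```python
import numpy as np

# Hypothetical residual functions; a real RevNet would use
# convolutional subnetworks here.
def F(x):
    return np.tanh(x)

def G(x):
    return np.tanh(x)

def rev_block_forward(x1, x2):
    # Additive coupling: y1 = x1 + F(x2), y2 = x2 + G(y1)
    y1 = x1 + F(x2)
    y2 = x2 + G(y1)
    return y1, y2

def rev_block_inverse(y1, y2):
    # Invert the coupling in reverse order: first recover x2, then x1.
    x2 = y2 - G(y1)
    x1 = y1 - F(x2)
    return x1, x2

# Round trip over one reversible block.
x1, x2 = np.random.randn(4), np.random.randn(4)
y1, y2 = rev_block_forward(x1, x2)
r1, r2 = rev_block_inverse(y1, y2)
assert np.allclose(r1, x1) and np.allclose(r2, x2)
```

Note that the inverse must undo the two half-steps in the opposite order to the forward pass, since recovering $x_1$ requires $x_2$ first.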

Note that unlike residual blocks, reversible blocks must have a stride of 1; a larger stride discards information, so the layer cannot be inverted. Standard ResNet architectures typically have a handful of layers with a larger stride. If a RevNet architecture is defined analogously, the activations must be stored explicitly for all non-reversible layers.

Papers Using This Method

- Diffusion Models Beat GANs on Image Classification (2023-07-17)
- Conditional Injective Flows for Bayesian Imaging (2022-04-15)
- Level set learning with pseudo-reversible neural networks for nonlinear dimension reduction in function approximation (2021-12-02)
- Multi-split Reversible Transformers Can Enhance Neural Machine Translation (2021-04-01)
- Object Segmentation Without Labels with Large-Scale Generative Models (2020-06-08)
- Reconstructing Natural Scenes from fMRI Patterns using BigBiGAN (2020-01-31)
- Large Scale Adversarial Representation Learning (2019-07-04)
- Beyond Finite Layer Neural Networks: Bridging Deep Architectures and Numerical Differential Equations (2017-10-27)
- The Reversible Residual Network: Backpropagation Without Storing Activations (2017-07-14)