Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Stochastic Depth

General | Introduced 2000 | 463 papers
Description

Stochastic Depth aims to shrink the depth of a network during training, while keeping it unchanged during testing. This is achieved by randomly dropping entire ResBlocks during training and bypassing their transformations through skip connections.

Let $b_l \in \{0, 1\}$ denote a Bernoulli random variable that indicates whether the $l$th ResBlock is active ($b_l = 1$) or inactive ($b_l = 0$). Further, let us denote the “survival” probability of ResBlock $l$ as $p_l = \text{Pr}\left(b_l = 1\right)$. With this definition we can bypass the $l$th ResBlock by multiplying its function $f_l$ with $b_l$, and we extend the update rule to:

$$H_l = \text{ReLU}\left(b_l f_l\left(H_{l-1}\right) + \text{id}\left(H_{l-1}\right)\right)$$

If $b_l = 1$, this reduces to the original ResNet update and this ResBlock remains unchanged. If $b_l = 0$, the ResBlock reduces to the identity function, $H_l = \text{id}\left(H_{l-1}\right)$. The outer ReLU has no effect in that case, because $H_{l-1}$ is itself the output of a ReLU and is therefore already non-negative.
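The update rule above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function name `stochastic_depth_block`, the list-of-floats representation of activations, and the callable `f` standing in for $f_l$ are all hypothetical. The test-time branch, which scales $f_l$ by $p_l$, is an assumption drawn from the commonly described form of the method and is not stated in the description above.

```python
import random


def relu(xs):
    """Element-wise ReLU on a list of floats."""
    return [max(x, 0.0) for x in xs]


def stochastic_depth_block(h_prev, f, p_survive, rng, training=True):
    """One ResBlock update H_l = ReLU(b_l * f_l(H_{l-1}) + H_{l-1}).

    h_prev:    activations H_{l-1} as a list of floats
    f:         residual function f_l (hypothetical callable, list -> list)
    p_survive: survival probability p_l = Pr(b_l = 1)
    rng:       a random.Random instance used to sample b_l
    """
    if training:
        # Sample b_l ~ Bernoulli(p_l): 1 keeps the block, 0 bypasses it.
        b = 1 if rng.random() < p_survive else 0
        if b == 0:
            # Inactive block: f_l is never evaluated, only the skip path remains.
            return relu(h_prev)
        # Active block: the ordinary ResNet update.
        return relu([fx + x for fx, x in zip(f(h_prev), h_prev)])
    # Assumption (not in the description above): at test time every block is
    # kept and f_l is weighted by its survival probability p_l.
    return relu([p_survive * fx + x for fx, x in zip(f(h_prev), h_prev)])
```

With `p_survive = 0.0` the block always reduces to the identity on non-negative inputs, and with `p_survive = 1.0` it always behaves like a standard ResBlock, which mirrors the two cases $b_l = 0$ and $b_l = 1$ described above. Skipping the call to `f` in the inactive case is what shortens the effective depth during training.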

Papers Using This Method

MD-ViSCo: A Unified Model for Multi-Directional Vital Sign Waveform Conversion (2025-06-10)
MedMoE: Modality-Specialized Mixture of Experts for Medical Vision-Language Understanding (2025-06-10)
Multi-modal brain MRI synthesis based on SwinUNETR (2025-06-03)
SST: Self-training with Self-adaptive Thresholding for Semi-supervised Learning (2025-05-31)
ZIPA: A family of efficient models for multilingual phone recognition (2025-05-29)
Deep Modeling and Optimization of Medical Image Classification (2025-05-29)
AgriFM: A Multi-source Temporal Remote Sensing Foundation Model for Crop Mapping (2025-05-27)
Structured Initialization for Vision Transformers (2025-05-26)
Leveraging Stochastic Depth Training for Adaptive Inference (2025-05-23)
Explainable Anatomy-Guided AI for Prostate MRI: Foundation Models and In Silico Clinical Trials for Virtual Biopsy-based Risk Assessment (2025-05-23)
Swin Transformer for Robust CGI Images Detection: Intra- and Inter-Dataset Analysis across Multiple Color Spaces (2025-05-22)
Fusion of Foundation and Vision Transformer Model Features for Dermatoscopic Image Classification (2025-05-22)
Multi-Channel Swin Transformer Framework for Bearing Remaining Useful Life Prediction (2025-05-20)
CheX-DS: Improving Chest X-ray Image Classification with Ensemble Learning Based on DenseNet and Swin Transformer (2025-05-16)
A Deep Learning-Driven Inhalation Injury Grading Assistant Using Bronchoscopy Images (2025-05-13)
Technical Report for ICRA 2025 GOOSE 2D Semantic Segmentation Challenge: Leveraging Color Shift Correction, RoPE-Swin Backbone, and Quantile-based Label Denoising Strategy for Robust Outdoor Scene Understanding (2025-05-11)
DFEN: Dual Feature Equalization Network for Medical Image Segmentation (2025-05-09)
Balancing Accuracy, Calibration, and Efficiency in Active Learning with Vision Transformers Under Label Noise (2025-05-07)
SwinLip: An Efficient Visual Speech Encoder for Lip Reading Using Swin Transformer (2025-05-07)
Enhancing DR Classification with Swin Transformer and Shifted Window Attention (2025-04-20)