Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Self-Adaptive Training: beyond Empirical Risk Minimization

Lang Huang, Chao Zhang, Hongyang Zhang

2020-02-24 | NeurIPS 2020 | General Classification
Paper | PDF | Code (official)

Abstract

We propose self-adaptive training, a new training algorithm that dynamically corrects problematic training labels using the model's own predictions without incurring extra computational cost, to improve the generalization of deep learning on potentially corrupted training data. This problem is crucial for learning robustly from data corrupted by, e.g., label noise and out-of-distribution samples. Standard empirical risk minimization (ERM) on such data, however, can easily overfit the noise and thus suffers from sub-optimal performance. In this paper, we observe that model predictions can substantially benefit the training process: self-adaptive training significantly improves generalization over ERM under various levels of noise, and mitigates overfitting in both natural and adversarial training. We evaluate the error-capacity curve of self-adaptive training and find that the test error decreases monotonically with model capacity. This is in sharp contrast to the recently discovered double-descent phenomenon in ERM, which might be a result of overfitting the noise. Experiments on the CIFAR and ImageNet datasets verify the effectiveness of our approach in two applications: classification with label noise and selective classification. We release our code at https://github.com/LayneH/self-adaptive-training.
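
The core idea described in the abstract is to replace fixed, possibly noisy labels with per-sample soft targets that are progressively blended with the model's own predictions. Below is a minimal PyTorch sketch of that target-update step, written from the description above; the hyperparameter names (alpha, warmup_epochs) and the confidence-based sample weighting are illustrative choices, so consult the official repository for the authors' exact implementation.

    import torch
    import torch.nn.functional as F

    class SelfAdaptiveTargets:
        """Per-sample soft targets refined by an exponential moving average
        of the model's own predictions (sketch of self-adaptive training)."""

        def __init__(self, labels, num_classes, alpha=0.9, warmup_epochs=60):
            # Soft targets start as the (possibly noisy) one-hot labels.
            self.targets = F.one_hot(labels.long(), num_classes).float()
            self.alpha = alpha
            self.warmup_epochs = warmup_epochs

        def loss(self, logits, indices, epoch):
            if epoch >= self.warmup_epochs:
                # After a warm-up period, blend the stored targets with the
                # current predictions, gradually overriding labels the model
                # consistently disagrees with.
                probs = F.softmax(logits.detach(), dim=1).cpu()
                self.targets[indices] = (
                    self.alpha * self.targets[indices]
                    + (1.0 - self.alpha) * probs
                )
            targets = self.targets[indices].to(logits.device)
            # Down-weight samples whose targets are uncertain (likely noisy).
            weights, _ = targets.max(dim=1)
            ce = -(targets * F.log_softmax(logits, dim=1)).sum(dim=1)
            return (weights * ce).sum() / weights.sum()

The only change to a standard training loop is that the dataset must also return each example's index so its soft target can be looked up and updated, e.g. criterion = SelfAdaptiveTargets(train_labels, num_classes=10) and loss = criterion.loss(model(x), idx, epoch).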

Related Papers

Specialized text classification: an approach to classifying Open Banking transactions (2025-04-10)
Universal Training of Neural Networks to Achieve Bayes Optimal Classification Accuracy (2025-01-13)
Revisiting MLLMs: An In-Depth Analysis of Image Classification Abilities (2024-12-21)
Using Instruction-Tuned Large Language Models to Identify Indicators of Vulnerability in Police Incident Narratives (2024-12-16)
Ramsey Theorems for Trees and a General 'Private Learning Implies Online Learning' Theorem (2024-07-10)
Cross-Block Fine-Grained Semantic Cascade for Skeleton-Based Sports Action Recognition (2024-04-30)
DiffuseMix: Label-Preserving Data Augmentation with Diffusion Models (2024-04-05)
Large Stepsize Gradient Descent for Logistic Loss: Non-Monotonicity of the Loss Improves Optimization Efficiency (2024-02-24)