Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


ProMix: Combating Label Noise via Maximizing Clean Sample Utility

Ruixuan Xiao, Yiwen Dong, Haobo Wang, Lei Feng, Runze Wu, Gang Chen, Junbo Zhao

Published: 2022-07-21 | Task: Learning with Noisy Labels | Code: official

Abstract

Learning with Noisy Labels (LNL) has become an appealing topic, as imperfectly annotated data are relatively cheap to obtain. Recent state-of-the-art approaches employ specific selection mechanisms to separate clean and noisy samples and then apply Semi-Supervised Learning (SSL) techniques for improved performance. However, the selection step mostly yields a medium-sized, decent-enough clean subset, overlooking a rich set of additional clean samples. To address this, we propose ProMix, a novel LNL framework that attempts to maximize the utility of clean samples for boosted performance. At the core of our method is a matched high-confidence selection technique that selects examples whose confidence scores are high and whose predictions match their given labels, dynamically expanding a base clean sample set. To overcome the potential side effects of this aggressive clean-set expansion, we further devise a novel SSL framework that trains balanced and unbiased classifiers on the separated clean and noisy samples. Extensive experiments demonstrate that ProMix significantly advances the current state-of-the-art results on multiple benchmarks with different types and levels of noise, achieving an average improvement of 2.48% on the CIFAR-N datasets. The code is available at https://github.com/Justherozen/ProMix
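The matched high-confidence selection described above can be sketched in a few lines: a sample joins the clean set when the model's prediction agrees with its given (possibly noisy) label and the confidence on that label clears a threshold. The function name, argument layout, and threshold value below are illustrative assumptions, not the official ProMix implementation.

```python
def matched_high_confidence_selection(probs, given_labels, base_clean_mask, tau=0.99):
    """Sketch of matched high-confidence selection (illustrative, not official code).

    probs:           per-sample lists of class probabilities from the model
    given_labels:    the provided (possibly noisy) integer labels
    base_clean_mask: booleans marking the current base clean set
    tau:             confidence threshold (an assumed value, not from the paper)
    """
    expanded = []
    for p, y, in_base in zip(probs, given_labels, base_clean_mask):
        pred = max(range(len(p)), key=p.__getitem__)   # predicted class (argmax)
        matched = (pred == y) and (p[y] >= tau)        # matches label AND high confidence
        expanded.append(in_base or matched)            # dynamically expand the clean set
    return expanded
```

Samples that fail the check simply remain in the noisy pool, where the paper's SSL framework treats them as unlabeled data.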

Results

Task                  | Dataset             | Metric          | Value | Model
Image Classification  | CIFAR-10N-Aggregate | Accuracy (mean) | 97.39 | ProMix
Image Classification  | CIFAR-10N-Random1   | Accuracy (mean) | 96.97 | ProMix
Image Classification  | CIFAR-100N          | Accuracy (mean) | 73.39 | ProMix
Image Classification  | CIFAR-10N           | Accuracy        | 97.39 | ProMix
Image Classification  | CIFAR-10N-Worst     | Accuracy (mean) | 96.16 | ProMix

Related Papers

CLID-MU: Cross-Layer Information Divergence Based Meta Update Strategy for Learning with Noisy Labels (2025-07-16)
Recalling The Forgotten Class Memberships: Unlearned Models Can Be Noisy Labelers to Leak Privacy (2025-06-24)
On the Role of Label Noise in the Feature Learning Process (2025-05-25)
Detect and Correct: A Selective Noise Correction Method for Learning with Noisy Labels (2025-05-19)
Exploring Video-Based Driver Activity Recognition under Noisy Labels (2025-04-16)
Noise-Aware Generalization: Robustness to In-Domain Noise and Out-of-Domain Generalization (2025-04-03)
Learning from Noisy Labels with Contrastive Co-Transformer (2025-03-04)
Enhancing Sample Selection Against Label Noise by Cutting Mislabeled Easy Examples (2025-02-12)