Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Learning with Instance-Dependent Label Noise: A Sample Sieve Approach

Hao Cheng, Zhaowei Zhu, Xingyu Li, Yifei Gong, Xing Sun, Yang Liu

2020-10-05 · ICLR 2021

Tasks: Image Classification · Image Classification with Label Noise · Learning with noisy labels

Paper · PDF · Code (official)

Abstract

Human-annotated labels are often prone to noise, and the presence of such noise will degrade the performance of the resulting deep neural network (DNN) models. Much of the literature (with several recent exceptions) on learning with noisy labels focuses on the case where the label noise is independent of features. In practice, annotation errors tend to be instance-dependent and often depend on the difficulty of the recognition task. Applying existing results from instance-independent settings would require estimating a significant number of noise rates. Therefore, providing theoretically rigorous solutions for learning with instance-dependent label noise remains a challenge. In this paper, we propose CORES$^{2}$ (COnfidence REgularized Sample Sieve), which progressively sieves out corrupted examples. The implementation of CORES$^{2}$ does not require specifying noise rates, and yet we are able to provide theoretical guarantees that CORES$^{2}$ filters out the corrupted examples. This high-quality sample sieve allows us to treat clean examples and corrupted ones separately when training a DNN solution, and such a separation is shown to be advantageous in the instance-dependent noise setting. We demonstrate the performance of CORES$^{2}$ on the CIFAR10 and CIFAR100 datasets with synthetic instance-dependent label noise and on Clothing1M with real-world human noise. Of independent interest, our sample sieve provides generic machinery for anatomizing noisy datasets and a flexible interface for various robust training techniques to further improve performance. Code is available at https://github.com/UCSC-REAL/cores.
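The sieve described above can be illustrated with a minimal sketch: each sample's cross-entropy loss is offset by a confidence regularizer (the mean loss over all classes), and samples whose regularized loss is high are flagged as likely corrupted. This is a simplified illustration, not the authors' implementation: the function names (`confidence_regularized_loss`, `sieve_clean`), the `beta` value, and the batch-mean threshold are assumptions; the paper derives a principled per-sample threshold.

```python
import numpy as np

def confidence_regularized_loss(logits, labels, beta=2.0):
    """Per-sample cross-entropy minus beta times the mean CE over classes.

    The subtracted term penalizes over-confident fitting of (possibly
    corrupted) labels; beta=2.0 is an illustrative choice, not the
    paper's tuned value.
    """
    # Numerically stable log-softmax.
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    n = logits.shape[0]
    ce = -log_probs[np.arange(n), labels]   # CE w.r.t. the given label
    mean_ce = -log_probs.mean(axis=1)       # mean CE over all classes
    return ce - beta * mean_ce

def sieve_clean(logits, labels, beta=2.0):
    """Flag samples whose regularized loss falls below the batch mean.

    A simple stand-in threshold for demonstration; CORES^2 uses a
    theoretically derived per-sample threshold instead.
    """
    losses = confidence_regularized_loss(logits, labels, beta)
    return losses < losses.mean()
```

With confident logits, a sample whose label disagrees with the model's prediction gets a much larger regularized loss than agreeing samples, so the sieve separates it out without any knowledge of the noise rates.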

Results

Task                  | Dataset             | Metric          | Value | Model
----------------------|---------------------|-----------------|-------|-------
Image Classification  | CIFAR-10N-Random2   | Accuracy (mean) | 94.88 | CORES*
Image Classification  | CIFAR-10N-Random2   | Accuracy (mean) | 89.91 | CORES
Image Classification  | CIFAR-10N-Random3   | Accuracy (mean) | 94.74 | CORES*
Image Classification  | CIFAR-10N-Random3   | Accuracy (mean) | 89.79 | CORES
Image Classification  | CIFAR-10N-Aggregate | Accuracy (mean) | 95.25 | CORES*
Image Classification  | CIFAR-10N-Aggregate | Accuracy (mean) | 91.23 | CORES
Image Classification  | CIFAR-10N-Random1   | Accuracy (mean) | 94.45 | CORES*
Image Classification  | CIFAR-10N-Random1   | Accuracy (mean) | 89.66 | CORES
Image Classification  | CIFAR-100N          | Accuracy (mean) | 61.15 | CORES
Image Classification  | CIFAR-100N          | Accuracy (mean) | 55.72 | CORES*
Image Classification  | CIFAR-10N-Worst     | Accuracy (mean) | 91.66 | CORES*
Image Classification  | CIFAR-10N-Worst     | Accuracy (mean) | 83.6  | CORES

Related Papers

Automatic Classification and Segmentation of Tunnel Cracks Based on Deep Learning and Visual Explanations (2025-07-18)
Adversarial attacks to image classification systems using evolutionary algorithms (2025-07-17)
Efficient Adaptation of Pre-trained Vision Transformer underpinned by Approximately Orthogonal Fine-Tuning Strategy (2025-07-17)
Federated Learning for Commercial Image Sources (2025-07-17)
MUPAX: Multidimensional Problem Agnostic eXplainable AI (2025-07-17)
CLID-MU: Cross-Layer Information Divergence Based Meta Update Strategy for Learning with Noisy Labels (2025-07-16)
Hashed Watermark as a Filter: Defeating Forging and Overwriting Attacks in Weight-based Neural Network Watermarking (2025-07-15)
Transferring Styles for Reduced Texture Bias and Improved Robustness in Semantic Segmentation Networks (2025-07-14)