How does Disagreement Help Generalization against Label Corruption?

Xingrui Yu, Bo Han, Jiangchao Yao, Gang Niu, Ivor W. Tsang, Masashi Sugiyama

2019-01-14

Learning with noisy labels, Memorization

Abstract

Learning with noisy labels is one of the hottest problems in weakly-supervised learning. Based on memorization effects of deep neural networks, training on small-loss instances becomes very promising for handling noisy labels. This fosters the state-of-the-art approach "Co-teaching" that cross-trains two deep neural networks using the small-loss trick. However, with the increase of epochs, two networks converge to a consensus and Co-teaching reduces to the self-training MentorNet. To tackle this issue, we propose a robust learning paradigm called Co-teaching+, which bridges the "Update by Disagreement" strategy with the original Co-teaching. First, two networks feed forward and predict all data, but keep prediction disagreement data only. Then, among such disagreement data, each network selects its small-loss data, but back propagates the small-loss data from its peer network and updates its own parameters. Empirical results on benchmark datasets demonstrate that Co-teaching+ is much superior to many state-of-the-art methods in the robustness of trained models.
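The two steps described in the abstract (keep only the examples on which the two networks disagree, then let each network train on its peer's small-loss picks) can be sketched for a single mini-batch as follows. This is a minimal illustration in NumPy, not the authors' implementation: the function names, the `keep_rate` parameter, and the handling of a batch with no disagreement are assumptions made for the example, and the actual gradient update is left out.

```python
import numpy as np


def per_example_cross_entropy(logits, labels):
    """Softmax cross-entropy for each example (numerically stabilized)."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels]


def select_cross_update_indices(logits_a, logits_b, labels, keep_rate):
    """Return (indices network A trains on, indices network B trains on).

    keep_rate is a hypothetical parameter: the fraction of disagreement
    examples treated as "small-loss". The abstract does not specify it.
    """
    # Step 1: keep only the examples where the two predictions differ.
    pred_a = logits_a.argmax(axis=1)
    pred_b = logits_b.argmax(axis=1)
    disagree = np.flatnonzero(pred_a != pred_b)
    if disagree.size == 0:
        # Assumption for this sketch: nothing to train on in this batch.
        return disagree, disagree

    # Step 2: each network ranks the disagreement examples by its own loss.
    n_keep = max(1, int(keep_rate * disagree.size))
    loss_a = per_example_cross_entropy(logits_a[disagree], labels[disagree])
    loss_b = per_example_cross_entropy(logits_b[disagree], labels[disagree])
    small_loss_a = disagree[np.argsort(loss_a)[:n_keep]]
    small_loss_b = disagree[np.argsort(loss_b)[:n_keep]]

    # Cross update: A learns from B's selection, and B from A's.
    return small_loss_b, small_loss_a


# Toy batch: 4 examples, 2 classes. The networks disagree on examples 1 and 2.
logits_a = np.array([[2.0, 0.0], [0.0, 2.0], [3.0, 0.0], [0.0, 1.0]])
logits_b = np.array([[2.0, 0.0], [2.0, 0.0], [0.0, 3.0], [0.0, 1.0]])
labels = np.array([0, 1, 0, 1])
train_a, train_b = select_cross_update_indices(logits_a, logits_b, labels, keep_rate=0.5)
```

In a training loop, each network would then compute its loss only on the returned indices and back-propagate through its own parameters.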

Results

Task	Dataset	Metric	Value	Model
Image Classification	CIFAR-10N-Random2	Accuracy (mean)	89.47	Co-Teaching+
Image Classification	CIFAR-10N-Random3	Accuracy (mean)	89.54	Co-Teaching+
Image Classification	CIFAR-10N-Aggregate	Accuracy (mean)	90.61	Co-Teaching+
Image Classification	CIFAR-10N-Random1	Accuracy (mean)	89.7	Co-Teaching+
Image Classification	CIFAR-100N	Accuracy (mean)	57.88	Co-Teaching+
Image Classification	CIFAR-10N-Worst	Accuracy (mean)	83.26	Co-Teaching+
