Junnan Li, Richard Socher, Steven C. H. Hoi
Deep neural networks are known to be annotation-hungry. Numerous efforts have been devoted to reducing the annotation cost when learning with deep networks. Two prominent directions include learning with noisy labels and semi-supervised learning by exploiting unlabeled data. In this work, we propose DivideMix, a novel framework for learning with noisy labels by leveraging semi-supervised learning techniques. In particular, DivideMix models the per-sample loss distribution with a mixture model to dynamically divide the training data into a labeled set with clean samples and an unlabeled set with noisy samples, and trains the model on both the labeled and unlabeled data in a semi-supervised manner. To avoid confirmation bias, we simultaneously train two diverged networks where each network uses the dataset division from the other network. During the semi-supervised training phase, we improve the MixMatch strategy by performing label co-refinement and label co-guessing on labeled and unlabeled samples, respectively. Experiments on multiple benchmark datasets demonstrate substantial improvements over state-of-the-art methods. Code is available at https://github.com/LiJunnan1992/DivideMix.
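The division step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation (that lives in the linked repository); it assumes the mixture model is a two-component 1-D Gaussian mixture fitted by EM, and that a sample counts as clean when its posterior probability under the lower-mean component exceeds a threshold. The function names `posteriors`, `fit_two_gaussians`, and `divide`, the `threshold` value of 0.5, and the synthetic losses are all illustrative assumptions.

```python
import math
import random


def posteriors(x, means, variances, weights):
    """Posterior probability of each of the two components for one loss value."""
    # Work in log space so that far-away points do not underflow to 0/0.
    logs = [
        math.log(w) - 0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        for m, v, w in zip(means, variances, weights)
    ]
    top = max(logs)
    exps = [math.exp(l - top) for l in logs]
    total = sum(exps)
    return [e / total for e in exps]


def fit_two_gaussians(losses, iters=100, eps=1e-6):
    """Fit a 1-D two-component Gaussian mixture to the losses with plain EM."""
    lo, hi = min(losses), max(losses)
    means = [lo, hi]  # initialise the components at the two extremes
    var0 = max((hi - lo) ** 2 / 4.0, eps)
    variances = [var0, var0]
    weights = [0.5, 0.5]
    n = len(losses)
    for _ in range(iters):
        # E-step: responsibility of each component for each sample
        resp = [posteriors(x, means, variances, weights) for x in losses]
        # M-step: re-estimate mean, variance, and weight of each component
        for k in range(2):
            nk = sum(r[k] for r in resp) + eps
            means[k] = sum(r[k] * x for r, x in zip(resp, losses)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, losses)) / nk
            variances[k] = max(var, eps)  # floor keeps the variance positive
            weights[k] = nk / n
    return means, variances, weights


def divide(losses, threshold=0.5):
    """Split sample indices into a (clean) labeled set and a (noisy) unlabeled set."""
    means, variances, weights = fit_two_gaussians(losses)
    clean = 0 if means[0] <= means[1] else 1  # lower-mean component = clean
    probs = [posteriors(x, means, variances, weights)[clean] for x in losses]
    labeled = [i for i, p in enumerate(probs) if p > threshold]
    unlabeled = [i for i, p in enumerate(probs) if p <= threshold]
    return probs, labeled, unlabeled


# Synthetic per-sample losses: 80 low-loss samples followed by 20 high-loss ones.
rng = random.Random(0)
losses = [rng.gauss(0.1, 0.05) for _ in range(80)]
losses += [rng.gauss(2.0, 0.3) for _ in range(20)]
probs, labeled, unlabeled = divide(losses)
print(len(labeled), len(unlabeled))
```

In the two-network setup the abstract describes, each network would compute such a division from its own per-sample losses and hand it to the other network, so that neither network trains on a split it produced itself.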
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Image Classification | mini WebVision 1.0 | ImageNet Top-1 Accuracy | 75.2 | DivideMix (Inception-ResNet-v2) |
| Image Classification | mini WebVision 1.0 | ImageNet Top-5 Accuracy | 90.84 | DivideMix (Inception-ResNet-v2) |
| Image Classification | mini WebVision 1.0 | Top-1 Accuracy | 77.32 | DivideMix (Inception-ResNet-v2) |
| Image Classification | mini WebVision 1.0 | Top-5 Accuracy | 91.64 | DivideMix (Inception-ResNet-v2) |
| Image Classification | mini WebVision 1.0 | Top-1 Accuracy | 76.08 | DivideMix (ResNet-18) |
| Image Classification | CIFAR-10N-Random2 | Accuracy (mean) | 90.9 | Divide-Mix |
| Image Classification | CIFAR-10N-Random3 | Accuracy (mean) | 89.97 | Divide-Mix |
| Image Classification | CIFAR-10N-Aggregate | Accuracy (mean) | 95.01 | Divide-Mix |
| Image Classification | CIFAR-10N-Random1 | Accuracy (mean) | 90.18 | Divide-Mix |
| Image Classification | CIFAR-100N | Accuracy (mean) | 71.13 | Divide-Mix |
| Image Classification | CIFAR-10N-Worst | Accuracy (mean) | 92.56 | Divide-Mix |