Kaidi Cao, Colin Wei, Adrien Gaidon, Nikos Arechiga, Tengyu Ma
Deep learning algorithms can fare poorly when the training dataset suffers from heavy class imbalance but the testing criterion requires good generalization on less frequent classes. We design two novel methods to improve performance in such scenarios. First, we propose a theoretically principled label-distribution-aware margin (LDAM) loss motivated by minimizing a margin-based generalization bound. This loss replaces the standard cross-entropy objective during training and can be combined with prior strategies for training with class imbalance, such as re-weighting or re-sampling. Second, we propose a simple yet effective training schedule that defers re-weighting until after the initial stage, allowing the model to learn an initial representation while avoiding some of the complications associated with re-weighting or re-sampling. We test our methods on several benchmark vision tasks, including the real-world imbalanced dataset iNaturalist 2018. Our experiments show that either method alone already improves over existing techniques, and that their combination achieves even better performance gains.
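The two ideas in the abstract can be illustrated with a minimal sketch in plain Python. The abstract does not state the loss formula, so the sketch assumes the form derived in the paper: the true-class logit is reduced by a per-class margin proportional to n_j^(-1/4), where n_j is the number of training examples in class j, and cross-entropy is then applied. The function names (`ldam_margins`, `ldam_loss`, `drw_weights`), the `max_margin` and `defer_until` parameters, and the use of inverse-frequency weights after the deferral point are illustrative assumptions, not the authors' implementation.

```python
import math


def ldam_margins(class_counts, max_margin=0.5):
    """Margins proportional to n_j^(-1/4), rescaled so the rarest class gets max_margin.

    max_margin is a hypothetical tuning constant chosen for this sketch.
    """
    raw = [n ** -0.25 for n in class_counts]
    scale = max_margin / max(raw)
    return [r * scale for r in raw]


def ldam_loss(logits, label, margins, weights=None):
    """Cross-entropy after subtracting the true class's margin from its logit."""
    shifted = list(logits)
    shifted[label] -= margins[label]
    # Numerically stable log-sum-exp over the shifted logits.
    top = max(shifted)
    log_sum_exp = top + math.log(sum(math.exp(z - top) for z in shifted))
    loss = log_sum_exp - shifted[label]
    weight = 1.0 if weights is None else weights[label]
    return weight * loss


def drw_weights(epoch, class_counts, defer_until):
    """Deferred re-weighting: uniform weights early, class-balanced weights later.

    Inverse-frequency weighting is an assumption made for simplicity; any
    class-balanced weighting scheme could be substituted here.
    """
    num_classes = len(class_counts)
    if epoch < defer_until:
        return [1.0] * num_classes
    inverse = [1.0 / n for n in class_counts]
    total = sum(inverse)
    # Normalize so the weights sum to the number of classes.
    return [num_classes * v / total for v in inverse]


# Toy usage: three classes with a 100:1 imbalance between the first and last.
counts = [5000, 500, 50]
margins = ldam_margins(counts)
logits = [2.0, 1.0, 0.5]
early = ldam_loss(logits, 2, margins, drw_weights(10, counts, defer_until=160))
late = ldam_loss(logits, 2, margins, drw_weights(170, counts, defer_until=160))
```

In this toy setup the rarest class receives the largest margin, and its loss is scaled up only once the epoch passes `defer_until`, which mirrors the "defer re-weighting until after the initial stage" schedule described above.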
Tasks: Image Classification, Few-Shot Image Classification, Generalized Few-Shot Classification, Long-tail Learning, Generalized Few-Shot Learning (the same results are listed under each task).

| Dataset | Metric | Value | Model |
|---|---|---|---|
| CIFAR-10-LT (ρ=10) | Error Rate | 11.84 | LDAM-DRW |
| CIFAR-10-LT (ρ=10) | Error Rate | 13.21 | Class-balanced Resampling |
| CIFAR-10-LT (ρ=10) | Error Rate | 13.61 | Empirical Risk Minimization (ERM, CE) |
| CIFAR-100-LT (ρ=10) | Error Rate | 41.29 | LDAM-DRW |
| COCO-MLT | Average mAP | 40.53 | LDAM (ResNet-50) |
| CIFAR-100-LT (ρ=100) | Error Rate | 57.96 | LDAM-DRW |
| VOC-MLT | Average mAP | 70.73 | LDAM (ResNet-50) |
| CIFAR-10-LT (ρ=100) | Error Rate | 22.97 | LDAM-DRW |
| CUB-LT | Long-Tailed Accuracy | 64.1 | LDAM |
| CUB-LT | Per-Class Accuracy | 50.1 | LDAM |
| AWA-LT | Long-Tailed Accuracy | 93.5 | LDAM |
| AWA-LT | Per-Class Accuracy | 69.1 | LDAM |
| SUN-LT | Long-Tailed Accuracy | 36.4 | LDAM |
| SUN-LT | Per-Class Accuracy | 29.8 | LDAM |