Yin Cui, Menglin Jia, Tsung-Yi Lin, Yang Song, Serge Belongie
With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula $(1-\beta^{n})/(1-\beta)$, where $n$ is the number of samples and $\beta \in [0,1)$ is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network is able to achieve significant performance gains on long-tailed datasets.
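As a rough illustration of the formula in the abstract, the sketch below computes the effective number $(1-\beta^{n})/(1-\beta)$ for each class and turns it into per-class loss weights by taking the inverse. The function names `effective_number` and `class_balanced_weights` are hypothetical, and the rescaling of the weights so that they sum to the number of classes is an assumption made here for readability; the abstract does not specify a normalization.

```python
from typing import List, Sequence


def effective_number(n: int, beta: float) -> float:
    """Effective number of samples, (1 - beta**n) / (1 - beta), for beta in [0, 1)."""
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must lie in [0, 1)")
    if n < 0:
        raise ValueError("n must be non-negative")
    return (1.0 - beta ** n) / (1.0 - beta)


def class_balanced_weights(counts: Sequence[int], beta: float) -> List[float]:
    """Per-class weights proportional to 1 / effective_number(n_i, beta).

    Assumption (not stated in the abstract): weights are rescaled so that
    they sum to the number of classes.
    """
    if any(n <= 0 for n in counts):
        raise ValueError("every class needs at least one sample")
    raw = [1.0 / effective_number(n, beta) for n in counts]
    scale = len(counts) / sum(raw)
    return [w * scale for w in raw]


if __name__ == "__main__":
    # Hypothetical long-tailed class counts, chosen only for illustration.
    counts = [5000, 500, 50]
    for beta in (0.0, 0.9, 0.999):
        weights = class_balanced_weights(counts, beta)
        print(beta, [round(w, 4) for w in weights])
```

With $\beta = 0$ every class has an effective number of 1, so all weights are equal and the loss is not re-weighted. As $\beta$ approaches 1 the effective number approaches $n$, so the weights approach inverse class frequency. Intermediate values of $\beta$ interpolate between these two regimes.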
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Image Classification | CIFAR-10-LT (ρ=10) | Error Rate | 12.9 | Class-balanced Focal Loss |
| Image Classification | CIFAR-10-LT (ρ=10) | Error Rate | 13.46 | Class-balanced Reweighting |
| Image Classification | COCO-MLT | Average mAP | 49.06 | CB Loss(ResNet-50) |
| Image Classification | CIFAR-100-LT (ρ=100) | Error Rate | 61.68 | Cross-Entropy (CE) |
| Image Classification | VOC-MLT | Average mAP | 75.24 | CB Focal(ResNet-50) |
| Image Classification | EGTEA | Average Precision | 63.39 | CB Loss |
| Image Classification | EGTEA | Average Recall | 63.26 | CB Loss |
| Few-Shot Image Classification | CIFAR-10-LT (ρ=10) | Error Rate | 12.9 | Class-balanced Focal Loss |
| Few-Shot Image Classification | CIFAR-10-LT (ρ=10) | Error Rate | 13.46 | Class-balanced Reweighting |
| Few-Shot Image Classification | COCO-MLT | Average mAP | 49.06 | CB Loss(ResNet-50) |
| Few-Shot Image Classification | CIFAR-100-LT (ρ=100) | Error Rate | 61.68 | Cross-Entropy (CE) |
| Few-Shot Image Classification | VOC-MLT | Average mAP | 75.24 | CB Focal(ResNet-50) |
| Few-Shot Image Classification | EGTEA | Average Precision | 63.39 | CB Loss |
| Few-Shot Image Classification | EGTEA | Average Recall | 63.26 | CB Loss |
| Generalized Few-Shot Classification | CIFAR-10-LT (ρ=10) | Error Rate | 12.9 | Class-balanced Focal Loss |
| Generalized Few-Shot Classification | CIFAR-10-LT (ρ=10) | Error Rate | 13.46 | Class-balanced Reweighting |
| Generalized Few-Shot Classification | COCO-MLT | Average mAP | 49.06 | CB Loss(ResNet-50) |
| Generalized Few-Shot Classification | CIFAR-100-LT (ρ=100) | Error Rate | 61.68 | Cross-Entropy (CE) |
| Generalized Few-Shot Classification | VOC-MLT | Average mAP | 75.24 | CB Focal(ResNet-50) |
| Generalized Few-Shot Classification | EGTEA | Average Precision | 63.39 | CB Loss |
| Generalized Few-Shot Classification | EGTEA | Average Recall | 63.26 | CB Loss |
| Long-tail Learning | CIFAR-10-LT (ρ=10) | Error Rate | 12.9 | Class-balanced Focal Loss |
| Long-tail Learning | CIFAR-10-LT (ρ=10) | Error Rate | 13.46 | Class-balanced Reweighting |
| Long-tail Learning | COCO-MLT | Average mAP | 49.06 | CB Loss(ResNet-50) |
| Long-tail Learning | CIFAR-100-LT (ρ=100) | Error Rate | 61.68 | Cross-Entropy (CE) |
| Long-tail Learning | VOC-MLT | Average mAP | 75.24 | CB Focal(ResNet-50) |
| Long-tail Learning | EGTEA | Average Precision | 63.39 | CB Loss |
| Long-tail Learning | EGTEA | Average Recall | 63.26 | CB Loss |
| Generalized Few-Shot Learning | CIFAR-10-LT (ρ=10) | Error Rate | 12.9 | Class-balanced Focal Loss |
| Generalized Few-Shot Learning | CIFAR-10-LT (ρ=10) | Error Rate | 13.46 | Class-balanced Reweighting |
| Generalized Few-Shot Learning | COCO-MLT | Average mAP | 49.06 | CB Loss(ResNet-50) |
| Generalized Few-Shot Learning | CIFAR-100-LT (ρ=100) | Error Rate | 61.68 | Cross-Entropy (CE) |
| Generalized Few-Shot Learning | VOC-MLT | Average mAP | 75.24 | CB Focal(ResNet-50) |
| Generalized Few-Shot Learning | EGTEA | Average Precision | 63.39 | CB Loss |
| Generalized Few-Shot Learning | EGTEA | Average Recall | 63.26 | CB Loss |