Shuang Li, Kaixiong Gong, Chi Harold Liu, Yulin Wang, Feng Qiao, Xinjing Cheng
Real-world training data usually exhibits a long-tailed distribution, where a few majority classes have significantly more samples than the remaining minority classes. This imbalance degrades the performance of typical supervised learning algorithms designed for balanced training sets. In this paper, we address this issue by augmenting minority classes with the recently proposed implicit semantic data augmentation (ISDA) algorithm, which produces diversified augmented samples by translating deep features along many semantically meaningful directions. Importantly, since ISDA estimates class-conditional statistics to obtain the semantic directions, we find it ineffective on minority classes due to insufficient training data. To this end, we propose a novel approach that automatically learns transformed semantic directions with meta-learning. Specifically, the augmentation strategy is dynamically optimized during training to minimize the loss on a small balanced validation set, which is approximated via a meta update step. Extensive empirical results on CIFAR-LT-10/100, ImageNet-LT, and iNaturalist 2017/2018 validate the effectiveness of our method.
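The ISDA step the abstract refers to can be sketched concretely: instead of sampling augmented features, ISDA minimizes an upper bound of the expected cross-entropy, which amounts to adding a quadratic correction term to each non-target logit using the class-conditional covariance. Below is a minimal single-sample NumPy sketch under that formulation; the function names are illustrative, and MetaSAug's contribution is to meta-learn the covariance `sigma_y` (against a balanced validation set) rather than estimate it directly from scarce minority-class data.

```python
import numpy as np

def isda_adjusted_logits(f, W, b, sigma_y, y, lam):
    """ISDA upper-bound logits for one sample with label y.

    f:       (d,)   deep feature of the sample
    W, b:    (C, d), (C,) classifier weights and biases
    sigma_y: (d, d) covariance of class y (estimated, or meta-learned as in MetaSAug)
    lam:     augmentation strength (0 recovers the plain logits)
    """
    logits = W @ f + b                 # ordinary linear classifier logits
    diff = W - W[y]                    # row j holds (w_j - w_y)
    # quadratic form (w_j - w_y)^T Sigma_y (w_j - w_y) for every class j
    quad = np.einsum('cd,de,ce->c', diff, sigma_y, diff)
    return logits + 0.5 * lam * quad   # term vanishes for j = y

def softmax_ce(logits, y):
    """Numerically stable cross-entropy on a single logit vector."""
    z = logits - logits.max()
    p = np.exp(z) / np.exp(z).sum()
    return -np.log(p[y])
```

Since the quadratic term is non-negative for a PSD covariance and zero for the target class, the adjusted loss upper-bounds the plain cross-entropy, pushing the decision boundary away from semantically plausible variations of class `y`.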
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Image Classification | CIFAR-10-LT (ρ=10) | Error Rate | 10.32 | MetaSAug-LDAM |
| Image Classification | CIFAR-10-LT (ρ=50) | Error Rate | 15.66 | MetaSAug-LDAM |
| Image Classification | CIFAR-10-LT (ρ=100) | Error Rate | 19.34 | MetaSAug-LDAM |
| Image Classification | CIFAR-10-LT (ρ=200) | Error Rate | 22.65 | MetaSAug-LDAM |
| Image Classification | CIFAR-100-LT (ρ=10) | Error Rate | 38.72 | MetaSAug-LDAM |
| Image Classification | CIFAR-100-LT (ρ=50) | Error Rate | 47.73 | MetaSAug-LDAM |
| Image Classification | CIFAR-100-LT (ρ=100) | Error Rate | 51.99 | MetaSAug-LDAM |
| Image Classification | CIFAR-100-LT (ρ=200) | Error Rate | 56.91 | MetaSAug-LDAM |
| Image Classification | ImageNet-LT | Top-1 Accuracy | 50.03 | MetaSAug (ResNet-152) |
| Image Classification | ImageNet-LT | Top-1 Accuracy | 47.39 | MetaSAug with CE loss |