Papers With Code

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


A Unified Generalization Analysis of Re-Weighting and Logit-Adjustment for Imbalanced Learning

Zitai Wang, Qianqian Xu, Zhiyong Yang, Yuan He, Xiaochun Cao, Qingming Huang

2023-10-07 · NeurIPS 2023
Paper · PDF · Code

Abstract

Real-world datasets are typically imbalanced in the sense that only a few classes have numerous samples, while many classes are associated with only a few samples. As a result, a naïve ERM learning process will be biased towards the majority classes, making it difficult to generalize to the minority classes. To address this issue, one simple but effective approach is to modify the loss function to emphasize the learning on minority classes, such as re-weighting the losses or adjusting the logits via class-dependent terms. However, existing generalization analysis of such losses is still coarse-grained and fragmented, failing to explain some empirical results. To bridge this gap, we propose a novel technique named data-dependent contraction to capture how these modified losses handle different classes. On top of this technique, a fine-grained generalization bound is established for imbalanced learning, which helps reveal the mystery of re-weighting and logit-adjustment in a unified manner. Furthermore, a principled learning algorithm is developed based on the theoretical insights. Finally, the empirical results on benchmark datasets not only validate the theoretical results but also demonstrate the effectiveness of the proposed method.
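The two loss-modification families the abstract mentions — re-weighting the per-sample losses and adjusting the logits with class-dependent terms — can be sketched as follows. This is an illustrative sketch of the generic techniques, not the paper's own VS + ADRW + TLA method; the function names and the `tau` parameter are choices made here for the example (the additive `tau * log(prior)` adjustment follows the standard logit-adjustment formulation).

```python
import numpy as np

def logit_adjusted_ce(logits, labels, class_counts, tau=1.0):
    """Cross-entropy with an additive class-dependent logit adjustment.

    Each logit z_j is shifted by tau * log(pi_j), where pi_j is the
    empirical class prior, so minority classes receive a larger margin
    (and a larger loss when misclassified).
    """
    priors = class_counts / class_counts.sum()           # pi_j
    adjusted = logits + tau * np.log(priors)             # z_j + tau * log(pi_j)
    # numerically stable log-softmax
    shifted = adjusted - adjusted.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def reweighted_ce(logits, labels, class_counts):
    """Cross-entropy re-weighted by inverse class frequency,
    so errors on minority-class samples count more."""
    weights = class_counts.sum() / (len(class_counts) * class_counts)
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    per_sample = -log_probs[np.arange(len(labels)), labels]
    return (weights[labels] * per_sample).mean()
```

With uninformative (all-zero) logits and a 90/10 class split, `logit_adjusted_ce` assigns a visibly larger loss to minority-class samples than to majority-class ones, which is the bias-correcting behavior the abstract refers to.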

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Image Classification | CIFAR-10-LT (ρ=10) | Error Rate | 8.18 | VS + ADRW + TLA |
| Image Classification | CIFAR-100-LT (ρ=10) | Error Rate | 34.41 | VS + ADRW + TLA |
| Image Classification | CIFAR-100-LT (ρ=100) | Error Rate | 46.95 | VS + ADRW + TLA |
| Image Classification | CIFAR-10-LT (ρ=100) | Error Rate | 13.58 | VS + ADRW + TLA |
| Few-Shot Image Classification | CIFAR-10-LT (ρ=10) | Error Rate | 8.18 | VS + ADRW + TLA |
| Few-Shot Image Classification | CIFAR-100-LT (ρ=10) | Error Rate | 34.41 | VS + ADRW + TLA |
| Few-Shot Image Classification | CIFAR-100-LT (ρ=100) | Error Rate | 46.95 | VS + ADRW + TLA |
| Few-Shot Image Classification | CIFAR-10-LT (ρ=100) | Error Rate | 13.58 | VS + ADRW + TLA |
| Generalized Few-Shot Classification | CIFAR-10-LT (ρ=10) | Error Rate | 8.18 | VS + ADRW + TLA |
| Generalized Few-Shot Classification | CIFAR-100-LT (ρ=10) | Error Rate | 34.41 | VS + ADRW + TLA |
| Generalized Few-Shot Classification | CIFAR-100-LT (ρ=100) | Error Rate | 46.95 | VS + ADRW + TLA |
| Generalized Few-Shot Classification | CIFAR-10-LT (ρ=100) | Error Rate | 13.58 | VS + ADRW + TLA |
| Long-tail Learning | CIFAR-10-LT (ρ=10) | Error Rate | 8.18 | VS + ADRW + TLA |
| Long-tail Learning | CIFAR-100-LT (ρ=10) | Error Rate | 34.41 | VS + ADRW + TLA |
| Long-tail Learning | CIFAR-100-LT (ρ=100) | Error Rate | 46.95 | VS + ADRW + TLA |
| Long-tail Learning | CIFAR-10-LT (ρ=100) | Error Rate | 13.58 | VS + ADRW + TLA |
| Generalized Few-Shot Learning | CIFAR-10-LT (ρ=10) | Error Rate | 8.18 | VS + ADRW + TLA |
| Generalized Few-Shot Learning | CIFAR-100-LT (ρ=10) | Error Rate | 34.41 | VS + ADRW + TLA |
| Generalized Few-Shot Learning | CIFAR-100-LT (ρ=100) | Error Rate | 46.95 | VS + ADRW + TLA |
| Generalized Few-Shot Learning | CIFAR-10-LT (ρ=100) | Error Rate | 13.58 | VS + ADRW + TLA |