Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Disentangling Label Distribution for Long-tailed Visual Recognition

Youngkyu Hong, Seungju Han, Kwanghee Choi, Seokjun Seo, Beomsu Kim, Buru Chang

2020-12-01 · CVPR 2021 · Image Classification · Long-tail Learning · Prediction
Paper · PDF · Code (official)

Abstract

The current evaluation protocol for long-tailed visual recognition trains the classification model on a long-tailed source label distribution and evaluates its performance on a uniform target label distribution. Such a protocol has questionable practicality, since the target may also be long-tailed. Therefore, we formulate long-tailed visual recognition as a label shift problem, where the target and source label distributions are different. One of the significant hurdles in dealing with the label shift problem is the entanglement between the source label distribution and the model prediction. In this paper, we focus on disentangling the source label distribution from the model prediction. We first introduce a simple but overlooked baseline that matches the target label distribution by post-processing the predictions of a model trained with the cross-entropy loss and the softmax function. Although this baseline surpasses state-of-the-art methods on benchmark datasets, it can be further improved by directly disentangling the source label distribution from the model prediction during training. Thus, we propose a novel method, the LAbel distribution DisEntangling (LADE) loss, based on the optimal bound of the Donsker-Varadhan representation. LADE achieves state-of-the-art performance on benchmark datasets such as CIFAR-100-LT, Places-LT, ImageNet-LT, and iNaturalist 2018. Moreover, LADE outperforms existing methods on various shifted target label distributions, demonstrating the general adaptability of our proposed method.
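As a concrete illustration of the post-processing baseline described in the abstract, here is a minimal PyTorch sketch. It assumes the class priors of both distributions are known; the function name and tensor shapes are our own choices, not taken from the paper or its official code.

```python
import torch
import torch.nn.functional as F

def adjust_to_target_prior(logits, source_prior, target_prior):
    """Re-weight softmax predictions from the long-tailed source label
    distribution to a (possibly also long-tailed) target distribution.

    logits:       (batch, num_classes) raw outputs of a model trained
                  with cross-entropy under the source distribution
    source_prior: (num_classes,) training label distribution p_s(y)
    target_prior: (num_classes,) assumed test label distribution p_t(y)
    """
    log_p = F.log_softmax(logits, dim=-1)  # log p(y|x) under p_s
    # Bayes-rule reweighting: p_t(y|x) is proportional to
    # p(y|x) * p_t(y) / p_s(y); the softmax renormalizes over classes.
    return F.softmax(log_p + target_prior.log() - source_prior.log(), dim=-1)
```

With a uniform target prior this reduces to dividing the prediction by the source prior, the standard long-tailed evaluation setting; the abstract's point is that the same correction applies when the target is itself long-tailed.

LADE instead removes the source label distribution during training, building on the Donsker-Varadhan (DV) representation of the KL divergence: KL(P || Q) = sup_f E_P[f] - log E_Q[e^f]. The sketch below shows only the generic DV lower bound that this representation yields, not the paper's full loss; see the paper for how the bound is turned into the LADE training objective.

```python
import math
import torch

def dv_kl_lower_bound(f_p, f_q):
    """Donsker-Varadhan lower bound on KL(P || Q) for a critic f:
    KL(P || Q) >= E_P[f] - log E_Q[exp(f)].

    f_p: critic outputs on samples drawn from P, shape (n,)
    f_q: critic outputs on samples drawn from Q, shape (m,)
    """
    # log E_Q[exp(f)] computed stably as logsumexp minus log(m)
    log_mean_exp_q = torch.logsumexp(f_q, dim=0) - math.log(f_q.numel())
    return f_p.mean() - log_mean_exp_q
```

Maximizing this bound over the critic f tightens the estimate; the "optimal bound" the abstract refers to is the value attained at the optimal critic.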

Results

| Task | Dataset | Metric | Value | Model |
| --- | --- | --- | --- | --- |
| Image Classification | Places-LT | Top-1 Accuracy | 38.8 | LADE |
| Image Classification | CIFAR-10-LT (ρ=10) | Error Rate | 11.22 | LADE |
| Image Classification | CIFAR-100-LT (ρ=10) | Error Rate | 38.3 | LADE |
| Image Classification | ImageNet-LT | Top-1 Accuracy | 53 | LADE |
| Image Classification | CIFAR-100-LT (ρ=100) | Error Rate | 54.6 | LADE |
| Few-Shot Image Classification | Places-LT | Top-1 Accuracy | 38.8 | LADE |
| Few-Shot Image Classification | CIFAR-10-LT (ρ=10) | Error Rate | 11.22 | LADE |
| Few-Shot Image Classification | CIFAR-100-LT (ρ=10) | Error Rate | 38.3 | LADE |
| Few-Shot Image Classification | ImageNet-LT | Top-1 Accuracy | 53 | LADE |
| Few-Shot Image Classification | CIFAR-100-LT (ρ=100) | Error Rate | 54.6 | LADE |
| Generalized Few-Shot Classification | Places-LT | Top-1 Accuracy | 38.8 | LADE |
| Generalized Few-Shot Classification | CIFAR-10-LT (ρ=10) | Error Rate | 11.22 | LADE |
| Generalized Few-Shot Classification | CIFAR-100-LT (ρ=10) | Error Rate | 38.3 | LADE |
| Generalized Few-Shot Classification | ImageNet-LT | Top-1 Accuracy | 53 | LADE |
| Generalized Few-Shot Classification | CIFAR-100-LT (ρ=100) | Error Rate | 54.6 | LADE |
| Long-tail Learning | Places-LT | Top-1 Accuracy | 38.8 | LADE |
| Long-tail Learning | CIFAR-10-LT (ρ=10) | Error Rate | 11.22 | LADE |
| Long-tail Learning | CIFAR-100-LT (ρ=10) | Error Rate | 38.3 | LADE |
| Long-tail Learning | ImageNet-LT | Top-1 Accuracy | 53 | LADE |
| Long-tail Learning | CIFAR-100-LT (ρ=100) | Error Rate | 54.6 | LADE |
| Generalized Few-Shot Learning | Places-LT | Top-1 Accuracy | 38.8 | LADE |
| Generalized Few-Shot Learning | CIFAR-10-LT (ρ=10) | Error Rate | 11.22 | LADE |
| Generalized Few-Shot Learning | CIFAR-100-LT (ρ=10) | Error Rate | 38.3 | LADE |
| Generalized Few-Shot Learning | ImageNet-LT | Top-1 Accuracy | 53 | LADE |
| Generalized Few-Shot Learning | CIFAR-100-LT (ρ=100) | Error Rate | 54.6 | LADE |

Related Papers

Multi-Strategy Improved Snake Optimizer Accelerated CNN-LSTM-Attention-Adaboost for Trajectory Prediction (2025-07-21)
Automatic Classification and Segmentation of Tunnel Cracks Based on Deep Learning and Visual Explanations (2025-07-18)
Adversarial attacks to image classification systems using evolutionary algorithms (2025-07-17)
Efficient Adaptation of Pre-trained Vision Transformer underpinned by Approximately Orthogonal Fine-Tuning Strategy (2025-07-17)
Federated Learning for Commercial Image Sources (2025-07-17)
MUPAX: Multidimensional Problem Agnostic eXplainable AI (2025-07-17)
Hashed Watermark as a Filter: Defeating Forging and Overwriting Attacks in Weight-based Neural Network Watermarking (2025-07-15)
Generative Click-through Rate Prediction with Applications to Search Advertising (2025-07-15)