MDCS: More Diverse Experts with Consistency Self-distillation for Long-tailed Recognition

QiHao Zhao, Chen Jiang, Wei Hu, Fan Zhang, Jun Liu

2023-08-19 | ICCV 2023 | Long-tail Learning

Abstract

Recently, multi-expert methods have led to significant improvements in long-tail recognition (LTR). We identify two aspects that need further enhancement to improve LTR: (1) more diverse experts; (2) lower model variance. Previous methods do not handle either of them well. To this end, we propose More Diverse experts with Consistency Self-distillation (MDCS) to bridge the gap left by earlier methods. Our MDCS approach consists of two core components: Diversity Loss (DL) and Consistency Self-distillation (CS). In detail, DL promotes diversity among experts by controlling their focus on different categories. To reduce the model variance, we employ KL divergence to distill the richer knowledge of weakly augmented instances for the experts' self-distillation. In particular, we design Confident Instance Sampling (CIS) to select the correctly classified instances for CS to avoid biased/noisy knowledge. In the analysis and ablation study, we demonstrate that, compared with previous work, our method effectively increases the diversity of experts, significantly reduces the variance of the model, and improves recognition accuracy. Moreover, the roles of DL and CS are coupled and mutually reinforcing: the diversity of experts benefits from CS, and CS cannot achieve remarkable results without DL. Experiments show our MDCS outperforms the state-of-the-art by 1% $\sim$ 2% on five popular long-tailed benchmarks, including CIFAR10-LT, CIFAR100-LT, ImageNet-LT, Places-LT, and iNaturalist 2018. The code is available at https://github.com/fistyee/MDCS.
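The abstract describes Consistency Self-distillation only at a high level, so the following is a minimal sketch of the idea rather than the authors' implementation. It assumes one expert produces logits for a weakly augmented view and for a second view of the same instances (called `strong_logits` here, on the assumption that the second view uses a stronger augmentation). The function names, the temperature value, the KL direction, and the averaging are all illustrative assumptions not stated in the abstract. Confident Instance Sampling is approximated by keeping only instances whose weak-view prediction matches the label.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over one list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def confident_mask(weak_logits, labels):
    """Assumed form of Confident Instance Sampling: keep an instance only
    if the weak-view prediction (argmax) equals its ground-truth label."""
    mask = []
    for row, label in zip(weak_logits, labels):
        predicted = max(range(len(row)), key=row.__getitem__)
        mask.append(predicted == label)
    return mask


def consistency_self_distillation_loss(weak_logits, strong_logits, labels,
                                       temperature=2.0):
    """Mean KL(weak || strong) over confident instances.

    The weak view acts as the teacher signal and the second view as the
    student. The temperature of 2.0 is an arbitrary illustrative choice.
    Returns 0.0 when no instance passes the confidence filter.
    """
    mask = confident_mask(weak_logits, labels)
    losses = [
        kl_divergence(softmax(w, temperature), softmax(s, temperature))
        for w, s, keep in zip(weak_logits, strong_logits, mask)
        if keep
    ]
    if not losses:
        return 0.0
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Two instances, three classes; values are made up for illustration.
    weak = [[4.0, 0.0, 0.0], [0.0, 3.0, 0.0]]
    strong = [[4.0, 0.0, 0.0], [0.0, 0.0, 3.0]]
    print(consistency_self_distillation_loss(weak, strong, labels=[0, 1]))
```

In this sketch, an instance that the weak view misclassifies contributes nothing to the loss, which mirrors the abstract's stated goal of avoiding biased or noisy knowledge during distillation. In a real training loop the weak-view distribution would typically be treated as a fixed target (no gradient), but that detail is also an assumption here.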

Results

Task	Dataset	Metric	Value	Model
Image Classification	CIFAR-100-LT (ρ=50)	Error Rate	39.9	MDCS
Image Classification	ImageNet-LT	Top-1 Accuracy	61.8	MDCS (ResNeXt-50)
Image Classification	CIFAR-10-LT (ρ=50)	Error Rate	11.7	MDCS
Image Classification	CIFAR-100-LT (ρ=100)	Error Rate	43.9	MDCS
Few-Shot Image Classification	CIFAR-100-LT (ρ=50)	Error Rate	39.9	MDCS
Few-Shot Image Classification	ImageNet-LT	Top-1 Accuracy	61.8	MDCS (ResNeXt-50)
Few-Shot Image Classification	CIFAR-10-LT (ρ=50)	Error Rate	11.7	MDCS
Few-Shot Image Classification	CIFAR-100-LT (ρ=100)	Error Rate	43.9	MDCS
Generalized Few-Shot Classification	CIFAR-100-LT (ρ=50)	Error Rate	39.9	MDCS
Generalized Few-Shot Classification	ImageNet-LT	Top-1 Accuracy	61.8	MDCS (ResNeXt-50)
Generalized Few-Shot Classification	CIFAR-10-LT (ρ=50)	Error Rate	11.7	MDCS
Generalized Few-Shot Classification	CIFAR-100-LT (ρ=100)	Error Rate	43.9	MDCS
Long-tail Learning	CIFAR-100-LT (ρ=50)	Error Rate	39.9	MDCS
Long-tail Learning	ImageNet-LT	Top-1 Accuracy	61.8	MDCS (ResNeXt-50)
Long-tail Learning	CIFAR-10-LT (ρ=50)	Error Rate	11.7	MDCS
Long-tail Learning	CIFAR-100-LT (ρ=100)	Error Rate	43.9	MDCS
Generalized Few-Shot Learning	CIFAR-100-LT (ρ=50)	Error Rate	39.9	MDCS
Generalized Few-Shot Learning	ImageNet-LT	Top-1 Accuracy	61.8	MDCS (ResNeXt-50)
Generalized Few-Shot Learning	CIFAR-10-LT (ρ=50)	Error Rate	11.7	MDCS
Generalized Few-Shot Learning	CIFAR-100-LT (ρ=100)	Error Rate	43.9	MDCS
