Nested Collaborative Learning for Long-Tailed Visual Recognition

Jun Li, Zichang Tan, Jun Wan, Zhen Lei, Guodong Guo

2022-03-29 | CVPR 2022 | Image Classification, Long-tail Learning

Abstract

Networks trained on a long-tailed dataset vary remarkably even under identical training settings, which reveals the great uncertainty in long-tailed learning. To alleviate this uncertainty, we propose Nested Collaborative Learning (NCL), which tackles the problem by collaboratively learning multiple experts together. NCL consists of two core components, Nested Individual Learning (NIL) and Nested Balanced Online Distillation (NBOD), which focus on the individual supervised learning of each expert and on knowledge transfer among the experts, respectively. To learn representations more thoroughly, both NIL and NBOD are formulated in a nested way: learning is conducted not only on all categories, from a full perspective, but also on some hard categories, from a partial perspective. For the partial perspective, we select the negative categories with high predicted scores as the hard categories using the proposed Hard Category Mining (HCM). In NCL, learning from the two perspectives is nested, highly related, and complementary, and it helps the network capture not only global and robust features but also a fine-grained ability to distinguish categories. Moreover, self-supervision is used for feature enhancement. Extensive experiments demonstrate the superiority of our method, which outperforms the state of the art whether using a single model or an ensemble.
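
The partial perspective described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `hard_category_mining` and `partial_cross_entropy`, the parameter `k` (number of hard negatives kept), and the use of a plain softmax cross-entropy over the selected categories are all assumptions made for illustration. The only part taken from the abstract is the selection rule, namely that the hard categories are the negative categories with the highest predicted scores.

```python
import numpy as np

def hard_category_mining(logits, labels, k):
    """Hypothetical helper: per sample, return the ground-truth index
    followed by the k highest-scoring negative categories."""
    logits = np.asarray(logits, dtype=float)
    labels = np.asarray(labels)
    rows = np.arange(logits.shape[0])
    masked = logits.copy()
    masked[rows, labels] = -np.inf  # exclude the ground truth from the negatives
    # Stable sort on negated scores gives descending order of predicted score.
    hard_neg = np.argsort(-masked, axis=1, kind="stable")[:, :k]
    return np.concatenate([labels[:, None], hard_neg], axis=1)

def partial_cross_entropy(logits, selected):
    """Assumed loss: softmax cross-entropy restricted to the selected
    categories; column 0 of `selected` holds the target category."""
    sub = np.take_along_axis(np.asarray(logits, dtype=float), selected, axis=1)
    sub = sub - sub.max(axis=1, keepdims=True)  # numerical stability
    log_prob = sub - np.log(np.exp(sub).sum(axis=1, keepdims=True))
    return -log_prob[:, 0].mean()

# Two samples, five categories; the values are made up for the example.
logits = np.array([[2.0, 0.5, 1.5, -1.0, 0.0],
                   [0.1, 3.0, 0.2, 2.5, -0.5]])
labels = np.array([0, 3])

selected = hard_category_mining(logits, labels, k=2)
loss = partial_cross_entropy(logits, selected)
print(selected.tolist(), float(loss))
```

In the second sample the ground-truth category (index 3) is outscored by category 1, so category 1 is kept as a hard negative and the restricted loss concentrates on that confusion. Under the nested formulation the abstract describes, a loss of this kind would be used alongside the loss over all categories; how the two are weighted is not stated here.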

Results

Task	Dataset	Metric	Value	Model
Image Classification	Places-LT	Top-1 Accuracy	41.5	NCL(ResNet-152)
Image Classification	CIFAR-100-LT (ρ=50)	Error Rate	43.2	NCL(ResNet32)
Image Classification	ImageNet-LT	Top-1 Accuracy	58.4	NCL(ResNeXt-50)
Image Classification	ImageNet-LT	Top-1 Accuracy	57.4	NCL(ResNet-50)
Image Classification	CIFAR-10-LT (ρ=50)	Error Rate	13.2	NCL(ResNet32)
Image Classification	CIFAR-100-LT (ρ=100)	Error Rate	46.7	NCL(ResNet32)
Image Classification	CIFAR-10-LT (ρ=100)	Error Rate	15.3	NCL(ResNet32)
Few-Shot Image Classification	Places-LT	Top-1 Accuracy	41.5	NCL(ResNet-152)
Few-Shot Image Classification	CIFAR-100-LT (ρ=50)	Error Rate	43.2	NCL(ResNet32)
Few-Shot Image Classification	ImageNet-LT	Top-1 Accuracy	58.4	NCL(ResNeXt-50)
Few-Shot Image Classification	ImageNet-LT	Top-1 Accuracy	57.4	NCL(ResNet-50)
Few-Shot Image Classification	CIFAR-10-LT (ρ=50)	Error Rate	13.2	NCL(ResNet32)
Few-Shot Image Classification	CIFAR-100-LT (ρ=100)	Error Rate	46.7	NCL(ResNet32)
Few-Shot Image Classification	CIFAR-10-LT (ρ=100)	Error Rate	15.3	NCL(ResNet32)
Generalized Few-Shot Classification	Places-LT	Top-1 Accuracy	41.5	NCL(ResNet-152)
Generalized Few-Shot Classification	CIFAR-100-LT (ρ=50)	Error Rate	43.2	NCL(ResNet32)
Generalized Few-Shot Classification	ImageNet-LT	Top-1 Accuracy	58.4	NCL(ResNeXt-50)
Generalized Few-Shot Classification	ImageNet-LT	Top-1 Accuracy	57.4	NCL(ResNet-50)
Generalized Few-Shot Classification	CIFAR-10-LT (ρ=50)	Error Rate	13.2	NCL(ResNet32)
Generalized Few-Shot Classification	CIFAR-100-LT (ρ=100)	Error Rate	46.7	NCL(ResNet32)
Generalized Few-Shot Classification	CIFAR-10-LT (ρ=100)	Error Rate	15.3	NCL(ResNet32)
Long-tail Learning	Places-LT	Top-1 Accuracy	41.5	NCL(ResNet-152)
Long-tail Learning	CIFAR-100-LT (ρ=50)	Error Rate	43.2	NCL(ResNet32)
Long-tail Learning	ImageNet-LT	Top-1 Accuracy	58.4	NCL(ResNeXt-50)
Long-tail Learning	ImageNet-LT	Top-1 Accuracy	57.4	NCL(ResNet-50)
Long-tail Learning	CIFAR-10-LT (ρ=50)	Error Rate	13.2	NCL(ResNet32)
Long-tail Learning	CIFAR-100-LT (ρ=100)	Error Rate	46.7	NCL(ResNet32)
Long-tail Learning	CIFAR-10-LT (ρ=100)	Error Rate	15.3	NCL(ResNet32)
Generalized Few-Shot Learning	Places-LT	Top-1 Accuracy	41.5	NCL(ResNet-152)
Generalized Few-Shot Learning	CIFAR-100-LT (ρ=50)	Error Rate	43.2	NCL(ResNet32)
Generalized Few-Shot Learning	ImageNet-LT	Top-1 Accuracy	58.4	NCL(ResNeXt-50)
Generalized Few-Shot Learning	ImageNet-LT	Top-1 Accuracy	57.4	NCL(ResNet-50)
Generalized Few-Shot Learning	CIFAR-10-LT (ρ=50)	Error Rate	13.2	NCL(ResNet32)
Generalized Few-Shot Learning	CIFAR-100-LT (ρ=100)	Error Rate	46.7	NCL(ResNet32)
Generalized Few-Shot Learning	CIFAR-10-LT (ρ=100)	Error Rate	15.3	NCL(ResNet32)
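
The ρ values in the CIFAR rows are not defined on this page. Assuming ρ denotes the usual imbalance ratio (size of the largest class divided by size of the smallest), the sketch below shows one common way such a split can be constructed, with per-class counts decaying exponentially from the head class to the tail class. The function name `long_tailed_counts`, the exponential profile, and the starting count of 5000 images per class are illustrative assumptions, not details confirmed by this page.

```python
def long_tailed_counts(n_max, num_classes, rho):
    """Hypothetical helper: per-class sample counts that decay
    exponentially from n_max down to roughly n_max / rho."""
    return [int(n_max * (1.0 / rho) ** (i / (num_classes - 1)))
            for i in range(num_classes)]

# Ten classes, at most 5000 images per class, assumed imbalance ratio 100.
counts = long_tailed_counts(n_max=5000, num_classes=10, rho=100)
print(counts)
print("largest / smallest:", counts[0] / counts[-1])
```

A larger ρ yields a more skewed split, which is consistent with the ρ=100 rows reporting higher error rates than the ρ=50 rows for the same model.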
