Knowledge Distillation on CIFAR-100

Metric: Top-1 Accuracy (%) (higher is better)
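For readers unfamiliar with the metric, the following is a minimal sketch of how top-1 accuracy is usually computed: a prediction counts as correct when the highest-scoring class matches the ground-truth label. The function name `top1_accuracy` and the toy scores are illustrative assumptions, not part of the leaderboard.

```python
from typing import Sequence


def top1_accuracy(logits: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Return top-1 accuracy in percent: the share of samples whose
    highest-scoring class equals the ground-truth label."""
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    if not labels:
        raise ValueError("need at least one sample")
    correct = 0
    for scores, label in zip(logits, labels):
        # Index of the largest score is the predicted class.
        predicted = max(range(len(scores)), key=scores.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


# Toy example with 3 classes instead of CIFAR-100's 100 (hypothetical scores).
example_logits = [
    [2.0, 0.5, 0.1],  # predicts class 0
    [0.2, 0.3, 1.5],  # predicts class 2
    [0.9, 1.1, 0.4],  # predicts class 1
    [1.2, 0.8, 0.3],  # predicts class 0
]
example_labels = [0, 2, 0, 0]
print(top1_accuracy(example_logits, example_labels))  # 3 of 4 correct -> 75.0
```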

Results

#	Model	Top-1 Accuracy (%)	Extra Data	Paper	Date	Code	SOTA
1	SRD (T:resnet-32x4, S:shufflenet-v2)	79.86	No	Understanding the Role of the Projector in Knowledge Distillation	2023-03-20	Code	SOTA
2	shufflenet-v2(T:resnet-32x4, S:shufflenet-v2)	78.76	No	Logit Standardization in Knowledge Distillation	2024-03-03	Code	-
3	MV-MR (T: CLIP/ViT-B-16 S: resnet50)	78.6	No	MV-MR: multi-views and multi-representations for self-supervised learning and knowledge distillation	2023-03-21	Code	-
4	resnet8x4 (T: resnet32x4 S: resnet8x4)	78.28	No	Logit Standardization in Knowledge Distillation	2024-03-03	Code	-
5	resnet8x4 (T: resnet32x4 S: resnet8x4 [modified])	78.08	No	Knowledge Distillation with the Reused Teacher Classifier	2022-03-26	Code	SOTA
6	ReviewKD++(T:resnet-32x4, S:shufflenet-v2)	77.93	No	Improving Knowledge Distillation via Regularizing Feature Norm and Direction	2023-05-26	Code	-
7	ReviewKD++(T:resnet-32x4, S:shufflenet-v1)	77.68	No	Improving Knowledge Distillation via Regularizing Feature Norm and Direction	2023-05-26	Code	-
8	resnet8x4 (T: resnet32x4 S: resnet8x4)	77.5	No	LumiNet: The Bright Side of Perceptual Knowledge Distillation	2023-10-05	Code	-
9	resnet8x4 (T: resnet32x4 S: resnet8x4)	76.68	No	Information Theoretic Representation Distillation	2021-12-01	Code	SOTA
10	resnet8x4 (T: resnet32x4 S: resnet8x4)	76.31	No	Knowledge Distillation from A Stronger Teacher	2022-05-21	Code	-
11	DKD++(T:resnet-32x4, S:resnet-8x4)	76.28	No	Improving Knowledge Distillation via Regularizing Feature Norm and Direction	2023-05-26	Code	-
12	resnet8x4 (T: resnet32x4 S: resnet8x4)	76.15	No	Wasserstein Contrastive Representation Distillation	2020-12-15	-	SOTA
13	ReviewKD++(T:WRN-40-2, S:WRN-40-1)	75.66	No	Improving Knowledge Distillation via Regularizing Feature Norm and Direction	2023-05-26	Code	-
14	resnet8x4 (T: resnet32x4 S: resnet8x4)	75.63	No	Distilling Knowledge via Knowledge Review	2021-04-19	Code	-
15	resnet8x4 (T: resnet32x4 S: resnet8x4)	75.51	No	Contrastive Representation Distillation	2019-10-23	Code	SOTA
16	vgg8 (T:vgg13 S:vgg8)	74.93	No	Information Theoretic Representation Distillation	2021-12-01	Code	-
17	vgg8 (T:vgg13 S:vgg8)	74.84	No	Distilling Knowledge via Knowledge Review	2021-04-19	Code	-
18	vgg8 (T:vgg13 S:vgg8)	74.72	No	Wasserstein Contrastive Representation Distillation	2020-12-15	-	-
19	vgg8 (T:vgg13 S:vgg8)	74.29	No	Contrastive Representation Distillation	2019-10-23	Code	-
20	resnet8x4 (T: resnet32x4 S: resnet8x4)	73.33	No	Distilling the Knowledge in a Neural Network	2015-03-09	Code	SOTA
21	vgg8 (T:vgg13 S:vgg8)	72.98	No	Distilling the Knowledge in a Neural Network	2015-03-09	Code	-
22	KD++(T:resnet56, S:resnet20)	72.53	No	Improving Knowledge Distillation via Regularizing Feature Norm and Direction	2023-05-26	Code	-
23	resnet110 (T:resnet110 S:resnet20)	71.99	No	Information Theoretic Representation Distillation	2021-12-01	Code	-
24	resnet110 (T:resnet110 S:resnet20)	71.88	No	Wasserstein Contrastive Representation Distillation	2020-12-15	-	-
25	resnet110 (T:resnet110 S:resnet20)	71.56	No	Contrastive Representation Distillation	2019-10-23	Code	-
26	DKD++(T:resnet50, S:mobilenetv2)	70.82	No	Improving Knowledge Distillation via Regularizing Feature Norm and Direction	2023-05-26	Code	-
27	resnet110 (T:resnet110 S:resnet20)	70.67	No	Distilling the Knowledge in a Neural Network	2015-03-09	Code	-
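
The T: and S: labels in the model names presumably denote the teacher and student networks. As a rough illustration of what the baseline entries for "Distilling the Knowledge in a Neural Network" optimize, here is a minimal sketch of one common formulation of the distillation loss: a KL term between temperature-softened teacher and student distributions, combined with ordinary cross-entropy on the hard label. The names `softmax` and `kd_loss` and the defaults `temperature=4.0` and `alpha=0.9` are assumptions chosen for illustration, not settings reported by any entry above.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(
    student_logits: Sequence[float],
    teacher_logits: Sequence[float],
    label: int,
    temperature: float = 4.0,  # illustrative default (assumption)
    alpha: float = 0.9,        # illustrative default (assumption)
) -> float:
    """Weighted sum of a softened teacher-matching term and hard-label cross-entropy."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student) on the temperature-softened distributions.
    kl = sum(
        pt * (math.log(pt) - math.log(ps))
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0.0
    )
    # Cross-entropy against the ground-truth label at temperature 1.
    ce = -math.log(softmax(student_logits)[label])
    # The temperature**2 factor keeps the soft-term gradient scale comparable
    # across different temperatures.
    return alpha * temperature ** 2 * kl + (1.0 - alpha) * ce


# A student that agrees with the teacher incurs a lower loss than one that disagrees.
teacher = [1.0, 2.0, 3.0]
print(kd_loss([1.0, 2.0, 3.0], teacher, label=2))
print(kd_loss([3.0, 2.0, 1.0], teacher, label=2))
```

Most of the other methods in the table add or substitute further terms (feature, contrastive, or logit-normalization losses); this sketch covers only the basic logit-matching idea.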