Metric: Top-1 Accuracy (%) (higher is better)
| # | Model | Top-1 Accuracy (%) | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | SRD (T:resnet-32x4, S:shufflenet-v2) | 79.86 | No | Understanding the Role of the Projector in Knowl... | 2023-03-20 | Code |
| 2 | shufflenet-v2(T:resnet-32x4, S:shufflenet-v2) | 78.76 | No | Logit Standardization in Knowledge Distillation | 2024-03-03 | Code |
| 3 | MV-MR (T: CLIP/ViT-B-16 S: resnet50) | 78.6 | No | MV-MR: multi-views and multi-representations for... | 2023-03-21 | Code |
| 4 | resnet8x4 (T: resnet32x4 S: resnet8x4) | 78.28 | No | Logit Standardization in Knowledge Distillation | 2024-03-03 | Code |
| 5 | resnet8x4 (T: resnet32x4 S: resnet8x4 [modified]) | 78.08 | No | Knowledge Distillation with the Reused Teacher C... | 2022-03-26 | Code |
| 6 | ReviewKD++(T:resnet-32x4, S:shufflenet-v2) | 77.93 | No | Improving Knowledge Distillation via Regularizin... | 2023-05-26 | Code |
| 7 | ReviewKD++(T:resnet-32x4, S:shufflenet-v1) | 77.68 | No | Improving Knowledge Distillation via Regularizin... | 2023-05-26 | Code |
| 8 | resnet8x4 (T: resnet32x4 S: resnet8x4) | 77.5 | No | LumiNet: The Bright Side of Perceptual Knowledge... | 2023-10-05 | Code |
| 9 | resnet8x4 (T: resnet32x4 S: resnet8x4) | 76.68 | No | Information Theoretic Representation Distillation | 2021-12-01 | Code |
| 10 | resnet8x4 (T: resnet32x4 S: resnet8x4) | 76.31 | No | Knowledge Distillation from A Stronger Teacher | 2022-05-21 | Code |
| 11 | DKD++(T:resnet-32x4, S:resnet-8x4) | 76.28 | No | Improving Knowledge Distillation via Regularizin... | 2023-05-26 | Code |
| 12 | resnet8x4 (T: resnet32x4 S: resnet8x4) | 76.15 | No | Wasserstein Contrastive Representation Distillat... | 2020-12-15 | - |
| 13 | ReviewKD++(T:WRN-40-2, S:WRN-40-1) | 75.66 | No | Improving Knowledge Distillation via Regularizin... | 2023-05-26 | Code |
| 14 | resnet8x4 (T: resnet32x4 S: resnet8x4) | 75.63 | No | Distilling Knowledge via Knowledge Review | 2021-04-19 | Code |
| 15 | resnet8x4 (T: resnet32x4 S: resnet8x4) | 75.51 | No | Contrastive Representation Distillation | 2019-10-23 | Code |
| 16 | vgg8 (T:vgg13 S:vgg8) | 74.93 | No | Information Theoretic Representation Distillation | 2021-12-01 | Code |
| 17 | vgg8 (T:vgg13 S:vgg8) | 74.84 | No | Distilling Knowledge via Knowledge Review | 2021-04-19 | Code |
| 18 | vgg8 (T:vgg13 S:vgg8) | 74.72 | No | Wasserstein Contrastive Representation Distillat... | 2020-12-15 | - |
| 19 | vgg8 (T:vgg13 S:vgg8) | 74.29 | No | Contrastive Representation Distillation | 2019-10-23 | Code |
| 20 | resnet8x4 (T: resnet32x4 S: resnet8x4) | 73.33 | No | Distilling the Knowledge in a Neural Network | 2015-03-09 | Code |
| 21 | vgg8 (T:vgg13 S:vgg8) | 72.98 | No | Distilling the Knowledge in a Neural Network | 2015-03-09 | Code |
| 22 | KD++(T:resnet56, S:resnet20) | 72.53 | No | Improving Knowledge Distillation via Regularizin... | 2023-05-26 | Code |
| 23 | resnet110 (T:resnet110 S:resnet20) | 71.99 | No | Information Theoretic Representation Distillation | 2021-12-01 | Code |
| 24 | resnet110 (T:resnet110 S:resnet20) | 71.88 | No | Wasserstein Contrastive Representation Distillat... | 2020-12-15 | - |
| 25 | resnet110 (T:resnet110 S:resnet20) | 71.56 | No | Contrastive Representation Distillation | 2019-10-23 | Code |
| 26 | DKD++(T:resnet50, S:mobilenetv2) | 70.82 | No | Improving Knowledge Distillation via Regularizin... | 2023-05-26 | Code |
| 27 | resnet110 (T:resnet110 S:resnet20) | 70.67 | No | Distilling the Knowledge in a Neural Network | 2015-03-09 | Code |
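
For reference, the metric in the table is the percentage of test samples whose highest-scoring predicted class matches the ground-truth label. The sketch below shows that computation in plain Python. The function name `top1_accuracy` and the toy scores and labels are illustrative assumptions; they are not taken from any of the listed papers or their code.

```python
def top1_accuracy(scores, labels):
    """Return top-1 accuracy as a percentage.

    scores: one list of per-class scores (logits or probabilities) per sample.
    labels: the ground-truth class index for each sample.
    Both names are hypothetical, chosen for this illustration.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")

    correct = 0
    for sample_scores, label in zip(scores, labels):
        # The predicted class is the index of the largest score.
        predicted = max(range(len(sample_scores)), key=sample_scores.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


# Toy data (assumed): 4 samples, 3 classes.
toy_scores = [
    [0.1, 0.7, 0.2],    # predicts class 1
    [0.5, 0.3, 0.2],    # predicts class 0
    [0.2, 0.2, 0.6],    # predicts class 2
    [0.9, 0.05, 0.05],  # predicts class 0
]
toy_labels = [1, 1, 2, 0]

print(f"{top1_accuracy(toy_scores, toy_labels):.2f}")  # prints 75.00
```

Three of the four toy predictions match their labels, so the result is 75.00. The table reports the same quantity, measured on each benchmark's test set.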