
Self-Knowledge Distillation with Progressive Refinement of Targets

Kyungyul Kim, ByeongMoon Ji, Doyoung Yoon, Sangheum Hwang

2020-06-22 · ICCV 2021
Tasks: Machine Translation · Image Classification · Multimodal Machine Translation · Knowledge Distillation · Object Detection
Paper · PDF · Code (official)

Abstract

The generalization capability of deep neural networks has been substantially improved by applying a wide spectrum of regularization methods, e.g., restricting function space, injecting randomness during training, augmenting data, etc. In this work, we propose a simple yet effective regularization method named progressive self-knowledge distillation (PS-KD), which progressively distills a model's own knowledge to soften hard targets (i.e., one-hot vectors) during training. Hence, it can be interpreted within a framework of knowledge distillation in which a student becomes a teacher itself. Specifically, targets are adjusted adaptively by combining the ground truth with past predictions from the model itself. We show that PS-KD provides an effect of hard example mining by rescaling gradients according to the difficulty of classifying examples. The proposed method is applicable to any supervised learning task with hard targets and can easily be combined with existing regularization methods to further enhance generalization performance. Furthermore, PS-KD achieves not only better accuracy but also high-quality confidence estimates in terms of both calibration and ordinal ranking. Extensive experimental results on three different tasks, image classification, object detection, and machine translation, demonstrate that our method consistently improves the performance of state-of-the-art baselines. The code is available at https://github.com/lgcnsai/PS-KD-Pytorch.
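The target-softening step described in the abstract can be sketched in a few lines of PyTorch. The snippet below is a minimal illustration, not the official implementation (see the linked repository for that): the linear growth schedule for the mixing weight, the value alpha_T=0.8, and the names `ps_kd_targets` and `past_probs` are assumptions made here for clarity.

```python
import torch
import torch.nn.functional as F

def ps_kd_targets(one_hot, past_probs, epoch, total_epochs, alpha_T=0.8):
    """Soften hard targets with the model's own past predictions (PS-KD sketch).

    The mixing weight alpha_t grows with training progress, so later epochs
    rely more on the predictions cached from the previous epoch.
    """
    alpha_t = alpha_T * (epoch + 1) / total_epochs
    return (1.0 - alpha_t) * one_hot + alpha_t * past_probs

# Usage sketch: logits come from the current model; past_probs stands in for
# the softmax outputs cached for the same examples in the previous epoch.
logits = torch.randn(4, 100)
one_hot = F.one_hot(torch.tensor([3, 7, 1, 42]), num_classes=100).float()
past_probs = torch.softmax(torch.randn(4, 100), dim=1)

targets = ps_kd_targets(one_hot, past_probs, epoch=10, total_epochs=300)
loss = torch.sum(-targets * F.log_softmax(logits, dim=1), dim=1).mean()
```

Because the model acts as its own teacher, no extra network is needed; the only overhead is caching the previous epoch's predictions (or a previous model snapshot) to form the softened targets.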

Results

Task                           | Dataset                  | Metric             | Value | Model
Machine Translation            | IWSLT2015 German-English | BLEU score         | 36.2  | PS-KD
Machine Translation            | IWSLT2015 English-German | BLEU score         | 30    | PS-KD
Machine Translation            | Multi30K                 | BLEU (DE-EN)       | 32.3  | PS-KD
Image Classification           | CIFAR-100                | Percentage correct | 86.41 | PyramidNet-200 + ShakeDrop + CutMix + PS-KD
Multimodal Machine Translation | Multi30K                 | BLEU (DE-EN)       | 32.3  | PS-KD
