
Big Self-Supervised Models are Strong Semi-Supervised Learners

Ting Chen, Simon Kornblith, Kevin Swersky, Mohammad Norouzi, Geoffrey Hinton

Published: 2020-06-17 · NeurIPS 2020
Tasks: Self-Supervised Image Classification · Semi-Supervised Image Classification
Links: Paper · PDF · Code (official), plus community implementations

Abstract

One paradigm for learning from few labeled examples while making best use of a large amount of unlabeled data is unsupervised pretraining followed by supervised fine-tuning. Although this paradigm uses unlabeled data in a task-agnostic way, in contrast to common approaches to semi-supervised learning for computer vision, we show that it is surprisingly effective for semi-supervised learning on ImageNet. A key ingredient of our approach is the use of big (deep and wide) networks during pretraining and fine-tuning. We find that the fewer the labels, the more this approach (task-agnostic use of unlabeled data) benefits from a bigger network. After fine-tuning, the big network can be further improved and distilled into a much smaller one with little loss in classification accuracy by using the unlabeled examples for a second time, but in a task-specific way. The proposed semi-supervised learning algorithm can be summarized in three steps: unsupervised pretraining of a big ResNet model using SimCLRv2, supervised fine-tuning on a few labeled examples, and distillation with unlabeled examples for refining and transferring the task-specific knowledge. This procedure achieves 73.9% ImageNet top-1 accuracy with just 1% of the labels ($\le$13 labeled images per class) using ResNet-50, a $10\times$ improvement in label efficiency over the previous state-of-the-art. With 10% of labels, ResNet-50 trained with our method achieves 77.5% top-1 accuracy, outperforming standard supervised training with all of the labels.
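The three-step recipe above maps naturally onto two loss functions: a contrastive loss for the task-agnostic pretraining stage (step 1) and a distillation loss that reuses the unlabeled data in a task-specific way (step 3). The sketch below is a minimal PyTorch-style illustration of those two losses, assuming the standard NT-Xent formulation used by SimCLR-family methods and soft-label distillation; the function names, temperature values, and tensor shapes are illustrative assumptions, not the paper's released implementation.

```python
# Minimal sketch of the two unlabeled-data losses in the SimCLRv2 recipe.
# Temperatures and shapes are illustrative assumptions; see the official
# repository for the actual implementation.
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.1):
    """Step 1 (pretraining): contrastive NT-Xent loss over two augmented
    views of the same batch. z1, z2 are (N, d) projection-head outputs."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2N, d), unit norm
    sim = z @ z.t() / temperature                        # scaled cosine sims
    n = z1.size(0)
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim.masked_fill_(mask, float('-inf'))                # exclude self-pairs
    # The positive for view i is view i+n (and vice versa).
    targets = torch.cat([torch.arange(n, 2 * n),
                         torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)

def distillation_loss(student_logits, teacher_logits, temperature=1.0):
    """Step 3 (distillation): match the student's predictions on unlabeled
    images to the fine-tuned teacher's softened predictions."""
    t = temperature
    teacher_probs = F.softmax(teacher_logits / t, dim=1)
    student_logp = F.log_softmax(student_logits / t, dim=1)
    # Cross-entropy against soft teacher targets, rescaled by t^2 so
    # gradient magnitudes stay comparable across temperatures.
    return -(teacher_probs * student_logp).sum(dim=1).mean() * t * t
```

In this framing, step 2 (fine-tuning) is ordinary cross-entropy on the small labeled subset, so the unlabeled data does all of its work in steps 1 and 3.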

Related Papers

ViTSGMM: A Robust Semi-Supervised Image Recognition Network Using Sparse Labels (2025-06-04)
Applications and Effect Evaluation of Generative Adversarial Networks in Semi-Supervised Learning (2025-05-26)
Simple Semi-supervised Knowledge Distillation from Vision-Language Models via Dual-Head Optimization (2025-05-12)
Weakly Semi-supervised Whole Slide Image Classification by Two-level Cross Consistency Supervision (2025-04-16)
Diff-SySC: An Approach Using Diffusion Models for Semi-Supervised Image Classification (2025-02-25)
Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective (2024-10-16)
SynCo: Synthetic Hard Negatives in Contrastive Learning for Better Unsupervised Visual Representations (2024-10-03)
Unsupervised Representation Learning by Balanced Self Attention Matching (2024-08-04)