Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Pushing the limits of self-supervised ResNets: Can we outperform supervised learning without labels on ImageNet?

Nenad Tomasev, Ioana Bica, Brian McWilliams, Lars Buesing, Razvan Pascanu, Charles Blundell, Jovana Mitrovic

Published: 2022-01-13
Tasks: Self-Supervised Image Classification · Image Classification · Representation Learning · Self-Supervised Learning · Semantic Segmentation · Linear evaluation · Semi-Supervised Image Classification

Paper · PDF · Code (official)

Abstract

Despite recent progress made by self-supervised methods in representation learning with residual networks, they still underperform supervised learning on the ImageNet classification benchmark, limiting their applicability in performance-critical settings. Building on prior theoretical insights from ReLIC [Mitrovic et al., 2021], we include additional inductive biases into self-supervised learning. We propose a new self-supervised representation learning method, ReLICv2, which combines an explicit invariance loss with a contrastive objective over a varied set of appropriately constructed data views to avoid learning spurious correlations and obtain more informative representations. ReLICv2 achieves $77.1\%$ top-$1$ accuracy on ImageNet under linear evaluation on a ResNet50, thus improving the previous state-of-the-art by absolute $+1.5\%$; on larger ResNet models, ReLICv2 achieves up to $80.6\%$ outperforming previous self-supervised approaches with margins up to $+2.3\%$. Most notably, ReLICv2 is the first unsupervised representation learning method to consistently outperform the supervised baseline in a like-for-like comparison over a range of ResNet architectures. Using ReLICv2, we also learn more robust and transferable representations that generalize better out-of-distribution than previous work, both on image classification and semantic segmentation. Finally, we show that despite using ResNet encoders, ReLICv2 is comparable to state-of-the-art self-supervised vision transformers.
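The abstract describes ReLICv2 as combining an explicit invariance loss with a contrastive objective over multiple data views. A minimal NumPy sketch of such a ReLIC-style objective is given below; this is an illustration under stated assumptions, not the paper's implementation — the symmetrized-KL invariance term and the `alpha` weighting are assumptions in the spirit of ReLIC [Mitrovic et al., 2021]:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def relic_style_loss(z1, z2, temperature=0.1, alpha=1.0):
    """Sketch of a ReLIC-style objective (assumption, not the paper's code):
    an InfoNCE contrastive term plus an explicit invariance penalty, taken
    here as the symmetrized KL divergence between the similarity
    distributions of the two augmented views."""
    # L2-normalize the embeddings of the two views
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    n = z1.shape[0]
    # temperature-scaled pairwise similarities between views
    p12 = softmax(z1 @ z2.T / temperature)
    p21 = softmax(z2 @ z1.T / temperature)
    # contrastive term: each sample should match its own positive pair
    idx = np.arange(n)
    contrastive = -0.5 * (np.log(p12[idx, idx]).mean()
                          + np.log(p21[idx, idx]).mean())
    # invariance term: similarity distributions should agree across views
    kl = lambda p, q: (p * (np.log(p + 1e-12) - np.log(q + 1e-12))).sum(axis=1).mean()
    invariance = 0.5 * (kl(p12, p21) + kl(p21, p12))
    return contrastive + alpha * invariance
```

When the two views carry identical embeddings, the invariance penalty vanishes and only the contrastive term remains; in practice the views would come from the paper's carefully constructed augmentations.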

Results

Task                                 | Dataset                    | Metric         | Value | Model
Semantic Segmentation                | Cityscapes val             | mIoU           | 75.2  | ReLICv2
Semantic Segmentation                | Cityscapes val             | mIoU           | 74.6  | BYOL
Image Classification                 | ObjectNet                  | Top-1 Accuracy | 25.9  | ReLICv2
Image Classification                 | ObjectNet                  | Top-1 Accuracy | 23.8  | ReLIC
Image Classification                 | ObjectNet                  | Top-1 Accuracy | 23.0  | BYOL
Image Classification                 | ObjectNet                  | Top-1 Accuracy | 14.6  | SimCLR
Semi-Supervised Image Classification | ImageNet (1% labeled data) | Top-5 Accuracy | 81.3  | ReLICv2
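Several of the classification numbers above are reported under linear evaluation: the pretrained encoder is frozen and only a linear classifier is trained on its output features. A self-contained sketch of that protocol on synthetic stand-in features (assuming plain softmax regression trained by gradient descent; the paper's exact optimizer and schedule may differ):

```python
import numpy as np

def linear_probe(features, labels, num_classes, lr=0.5, steps=200):
    """Sketch of linear evaluation: train only a linear (softmax)
    classifier on frozen features. `features` here stand in for the
    frozen encoder's outputs (in the paper, a ResNet trunk)."""
    n, d = features.shape
    W = np.zeros((d, num_classes))
    b = np.zeros(num_classes)
    onehot = np.eye(num_classes)[labels]
    for _ in range(steps):
        logits = features @ W + b
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        p = np.exp(logits)
        p /= p.sum(axis=1, keepdims=True)
        grad = (p - onehot) / n                       # cross-entropy gradient
        W -= lr * features.T @ grad
        b -= lr * grad.sum(axis=0)
    return W, b

def top1_accuracy(features, labels, W, b):
    preds = (features @ W + b).argmax(axis=1)
    return (preds == labels).mean()
```

The encoder's weights never receive gradients here, which is what makes the protocol a probe of representation quality rather than of fine-tuning capacity.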

Related Papers

- SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
- Touch in the Wild: Learning Fine-Grained Manipulation with a Portable Visuo-Tactile Gripper (2025-07-20)
- Automatic Classification and Segmentation of Tunnel Cracks Based on Deep Learning and Visual Explanations (2025-07-18)
- Adversarial attacks to image classification systems using evolutionary algorithms (2025-07-17)
- Efficient Adaptation of Pre-trained Vision Transformer underpinned by Approximately Orthogonal Fine-Tuning Strategy (2025-07-17)
- Federated Learning for Commercial Image Sources (2025-07-17)
- MUPAX: Multidimensional Problem Agnostic eXplainable AI (2025-07-17)
- Spectral Bellman Method: Unifying Representation and Exploration in RL (2025-07-17)