Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

A Large-scale Study of Representation Learning with the Visual Task Adaptation Benchmark

Xiaohua Zhai, Joan Puigcerver, Alexander Kolesnikov, Pierre Ruyssen, Carlos Riquelme, Mario Lucic, Josip Djolonga, Andre Susano Pinto, Maxim Neumann, Alexey Dosovitskiy, Lucas Beyer, Olivier Bachem, Michael Tschannen, Marcin Michalski, Olivier Bousquet, Sylvain Gelly, Neil Houlsby

2019-10-01 | arXiv 2020 | Tasks: Image Classification, Representation Learning

Abstract

Representation learning promises to unlock deep learning for the long tail of vision tasks without expensive labelled datasets. Yet, the absence of a unified evaluation for general visual representations hinders progress. Popular protocols are often too constrained (linear classification), limited in diversity (ImageNet, CIFAR, Pascal-VOC), or only weakly related to representation quality (ELBO, reconstruction error). We present the Visual Task Adaptation Benchmark (VTAB), which defines good representations as those that adapt to diverse, unseen tasks with few examples. With VTAB, we conduct a large-scale study of many popular publicly-available representation learning algorithms. We carefully control confounders such as architecture and tuning budget. We address questions like: How effective are ImageNet representations beyond standard natural datasets? How do representations trained via generative and discriminative models compare? To what extent can self-supervision replace labels? And, how close are we to general visual representations?

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Image Classification | VTAB-1k | Top-1 Accuracy | 72.7 | S4L-Exemplar-ResNet50-LargeHyperSweep |
| Image Classification | VTAB-1k | Top-1 Accuracy | 71.5 | S4L-Rotation-ResNet50-LargeHyperSweep |
| Image Classification | VTAB-1k | Top-1 Accuracy | 71.2 | ImageNet-ResNet50-LargeHyperSweep |
| Image Classification | VTAB-1k | Top-1 Accuracy | 67.5 | S4L-Rotation-ResNet50 |
| Image Classification | VTAB-1k | Top-1 Accuracy | 67 | S4L-Exemplar-ResNet50 |
| Image Classification | VTAB-1k | Top-1 Accuracy | 65.6 | ImageNet-ResNet50 |
| Image Classification | VTAB-1k | Top-1 Accuracy | 64.8 | S4L-10%-Rotation-ResNet50 |
| Image Classification | VTAB-1k | Top-1 Accuracy | 63.9 | S4L-10%-Exemplar-ResNet50 |
| Image Classification | VTAB-1k | Top-1 Accuracy | 61.6 | ImageNet-10%-ResNet50 |
| Image Classification | VTAB-1k | Top-1 Accuracy | 59.5 | SelfSup-Rotation-ResNet50 |
| Image Classification | VTAB-1k | Top-1 Accuracy | 59.2 | ResNet50-LargeHyperSweep |
| Image Classification | VTAB-1k | Top-1 Accuracy | 59.1 | BigBiGAN-ResNet50 |
| Image Classification | VTAB-1k | Top-1 Accuracy | 57.5 | SelfSup-Exemplar-ResNet50 |
| Image Classification | VTAB-1k | Top-1 Accuracy | 51.1 | SelfSup-Jigsaw-ResNet50 |
| Image Classification | VTAB-1k | Top-1 Accuracy | 50.8 | SelfSup-RelativePatchLoc-ResNet50 |
| Image Classification | VTAB-1k | Top-1 Accuracy | 44 | Unconditional-BigGAN-ResNet50 |
| Image Classification | VTAB-1k | Top-1 Accuracy | 42.1 | ResNet50 |
| Image Classification | VTAB-1k | Top-1 Accuracy | 37.5 | VAE |
| Image Classification | VTAB-1k | Top-1 Accuracy | 37.3 | WAE-MMD |
| Image Classification | VTAB-1k | Top-1 Accuracy | 35.3 | Conditional-BigGAN |
| Image Classification | VTAB-1k | Top-1 Accuracy | 32 | WAE-GAN |
| Image Classification | VTAB-1k | Top-1 Accuracy | 31 | WAE-UKL |
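
Several models in the results above appear twice, once with and once without a "-LargeHyperSweep" suffix. The short Python sketch below pairs those entries and computes the difference in top-1 accuracy. It assumes the suffix marks the same model evaluated with a larger hyperparameter sweep, which the listing itself does not state; the `scores` dictionary and the `sweep_gains` helper are illustrative names, and the numbers are copied from the results above.

```python
# Top-1 accuracy (%) on VTAB-1k, copied from the results listing.
# Only the models that appear both with and without the suffix are included.
scores = {
    "S4L-Exemplar-ResNet50-LargeHyperSweep": 72.7,
    "S4L-Rotation-ResNet50-LargeHyperSweep": 71.5,
    "ImageNet-ResNet50-LargeHyperSweep": 71.2,
    "ResNet50-LargeHyperSweep": 59.2,
    "S4L-Rotation-ResNet50": 67.5,
    "S4L-Exemplar-ResNet50": 67.0,
    "ImageNet-ResNet50": 65.6,
    "ResNet50": 42.1,
}

# Assumption: this suffix denotes a larger hyperparameter sweep for the same model.
SUFFIX = "-LargeHyperSweep"


def sweep_gains(table):
    """Return {base_model: accuracy gain in points} for every suffixed entry
    whose unsuffixed counterpart is also present in the table."""
    gains = {}
    for name, value in table.items():
        if name.endswith(SUFFIX):
            base = name[: -len(SUFFIX)]
            if base in table:
                # Round to one decimal to match the precision of the listing.
                gains[base] = round(value - table[base], 1)
    return gains


gains = sweep_gains(scores)
for base, gain in sorted(gains.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{base}: +{gain:.1f} points")
```

Under that assumption, the gap is largest for the plain ResNet50 entry and smallest for S4L-Rotation-ResNet50. This is plain arithmetic on the listed values, not an analysis taken from the paper.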

Related Papers

Touch in the Wild: Learning Fine-Grained Manipulation with a Portable Visuo-Tactile Gripper (2025-07-20)
Automatic Classification and Segmentation of Tunnel Cracks Based on Deep Learning and Visual Explanations (2025-07-18)
Adversarial attacks to image classification systems using evolutionary algorithms (2025-07-17)
Efficient Adaptation of Pre-trained Vision Transformer underpinned by Approximately Orthogonal Fine-Tuning Strategy (2025-07-17)
Federated Learning for Commercial Image Sources (2025-07-17)
MUPAX: Multidimensional Problem Agnostic eXplainable AI (2025-07-17)
Spectral Bellman Method: Unifying Representation and Exploration in RL (2025-07-17)
Boosting Team Modeling through Tempo-Relational Representation Learning (2025-07-17)