VTAB

Visual Task Adaptation Benchmark

Introduced 2019-09-25

The Visual Task Adaptation Benchmark (VTAB) is designed to evaluate general visual representations on a diverse and challenging suite of tasks². It defines a good general visual representation as one that yields good performance on unseen tasks when the model is trained on limited task-specific data².

The VTAB benchmark contains the following 19 tasks, derived from public datasets¹:

  • Caltech101
  • CIFAR-100
  • CLEVR distance prediction
  • CLEVR counting
  • Diabetic Retinopathy
  • Dmlab Frames
  • dSprites orientation prediction
  • dSprites location prediction
  • Describable Textures Dataset (DTD)
  • EuroSAT
  • KITTI distance prediction
  • 102 Category Flower Dataset
  • Oxford IIIT Pet dataset
  • PatchCamelyon
  • Resisc45
  • Small NORB azimuth prediction
  • Small NORB elevation prediction
  • SUN397
  • SVHN

A given model is fine-tuned independently on each of the above tasks¹. Performance is measured as the average accuracy across all tasks¹. Detailed descriptions of all tasks, the evaluation protocol, and other details can be found in the VTAB paper¹.
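Concretely, the protocol can be summarized in a few lines. The sketch below is illustrative rather than the official evaluation code: the task identifiers are informal labels for the 19 tasks above (not official dataset keys), and the `fine_tune` and `evaluate` callables stand in for a real per-task training loop and test-set evaluation.

```python
from typing import Callable, Iterable

# The 19 VTAB tasks listed above; these identifiers are informal labels
# used for this sketch, not official dataset keys.
VTAB_TASKS = [
    "caltech101", "cifar100", "clevr_distance", "clevr_count",
    "diabetic_retinopathy", "dmlab", "dsprites_orientation",
    "dsprites_location", "dtd", "eurosat", "kitti_distance",
    "oxford_flowers102", "oxford_iiit_pet", "patch_camelyon",
    "resisc45", "smallnorb_azimuth", "smallnorb_elevation",
    "sun397", "svhn",
]

def vtab_score(
    fine_tune: Callable[[str], object],
    evaluate: Callable[[object, str], float],
    tasks: Iterable[str] = VTAB_TASKS,
) -> float:
    """Fine-tune independently on each task, then average test accuracy.

    `fine_tune(task)` should return a model fine-tuned from the same
    pretrained checkpoint, and `evaluate(model, task)` should return its
    test accuracy on that task; both are supplied by the caller.
    """
    accuracies = [evaluate(fine_tune(task), task) for task in tasks]
    return sum(accuracies) / len(accuracies)
```

The key point the sketch captures is that every task starts from the same pretrained checkpoint and is adapted independently, so the final score reflects how well the shared representation transfers rather than any cross-task training.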

(1) Visual Task Adaptation Benchmark. https://google-research.github.io/task_adaptation/.
(2) GitHub - google-research/task_adaptation. https://github.com/google-research/task_adaptation.
(3) GitHub - KMnP/vpt: Visual Prompt Tuning [ECCV 2022]. https://github.com/KMnP/vpt.