Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Visual Reasoning on NLVR2 Dev

Metric: Accuracy (higher is better)

Results

| #  | Model                        | Accuracy | Extra Data | Paper                                                  | Date       | Code |
|----|------------------------------|----------|------------|--------------------------------------------------------|------------|------|
| 1  | BEiT-3                       | 91.51    | No         | Image as a Foreign Language: BEiT Pretraining fo...    | 2022-08-22 | Code |
| 2  | X2-VLM (large)               | 88.7     | No         | X$^2$-VLM: All-In-One Pre-trained Model For Visi...    | 2022-11-22 | Code |
| 3  | XFM (base)                   | 87.6     | No         | Toward Building General Foundation Models for La...    | 2023-01-12 | Code |
| 4  | X2-VLM (base)                | 86.2     | No         | X$^2$-VLM: All-In-One Pre-trained Model For Visi...    | 2022-11-22 | Code |
| 5  | CoCa                         | 86.1     | No         | CoCa: Contrastive Captioners are Image-Text Foun...    | 2022-05-04 | Code |
| 6  | VLMo                         | 85.64    | No         | VLMo: Unified Vision-Language Pre-Training with ...    | 2021-11-03 | Code |
| 7  | VK-OOD                       | 84.6     | No         | -                                                      | -          | Code |
| 8  | SimVLM                       | 84.53    | No         | SimVLM: Simple Visual Language Model Pretraining...    | 2021-08-24 | Code |
| 9  | X-VLM (base)                 | 84.41    | No         | Multi-Grained Vision Language Pre-Training: Alig...    | 2021-11-16 | Code |
| 10 | VK-OOD                       | 83.9     | No         | Differentiable Outlier Detection Enable Robust D...    | 2023-02-11 | Code |
| 11 | ALBEF (14M)                  | 83.14    | No         | Align before Fuse: Vision and Language Represent...    | 2021-07-16 | Code |
| 12 | SOHO                         | 76.37    | No         | Seeing Out of tHe bOx: End-to-End Pre-training f...    | 2021-04-07 | Code |
| 13 | ViLT-B/32                    | 75.7     | No         | ViLT: Vision-and-Language Transformer Without Co...    | 2021-02-05 | Code |
| 14 | LXMERT (Pre-train + scratch) | 74.9     | No         | LXMERT: Learning Cross-Modality Encoder Represen...    | 2019-08-20 | Code |
| 15 | VisualBERT                   | 66.7     | No         | VisualBERT: A Simple and Performant Baseline for...    | 2019-08-09 | Code |