Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Question Answering on SIQA

Metric: Accuracy (higher is better)
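
As a minimal sketch of how this metric is typically computed, the snippet below scores a list of predicted answer indices against gold labels and reports the result as a percentage, matching the scale used on this leaderboard. The function names `accuracy` and `chance_accuracy` are illustrative assumptions, not part of any official SIQA evaluation script. The chance calculation assumes three answer options per question, which is consistent with the 33.3 random-chance baseline listed in the results.

```python
def accuracy(predictions, labels):
    """Return the percentage of predictions that match the gold labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    # Count exact matches between predicted and gold answer indices.
    correct = sum(p == g for p, g in zip(predictions, labels))
    return 100.0 * correct / len(labels)


def chance_accuracy(num_choices):
    """Expected accuracy (in percent) of uniform random guessing."""
    return 100.0 / num_choices


if __name__ == "__main__":
    # Hypothetical predictions and gold labels (answer indices 0, 1, or 2).
    preds = [0, 1, 2, 1]
    gold = [0, 1, 1, 1]
    print(f"accuracy: {accuracy(preds, gold):.1f}")
    print(f"chance (3 options): {chance_accuracy(3):.1f}")
```

Scores close to the chance value indicate a model doing little better than guessing, which gives context for the zero-shot entries near 50.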


Results

| # | Model | Accuracy | Extra Data | Paper | Date | Code |
|---|-------|----------|------------|-------|------|------|
| 1 | Unicorn 11B (fine-tuned) | 83.2 | No | UNICORN on RAINBOW: A Universal Commonsense Reas... | 2021-03-24 | Code |
| 2 | LLaMA-2 13B + MixLoRA | 82.5 | No | MixLoRA: Enhancing Large Language Models Fine-Tu... | 2024-04-22 | Code |
| 3 | CompassMTL 567M with Tailor | 82.2 | No | Task Compass: Scaling Multi-task Pre-training wi... | 2022-10-12 | Code |
| 4 | CompassMTL 567M | 81.7 | No | Task Compass: Scaling Multi-task Pre-training wi... | 2022-10-12 | Code |
| 5 | LLaMA-3 8B + MoSLoRA (fine-tuned) | 81 | No | Mixture-of-Subspaces in Low-Rank Adaptation | 2024-06-16 | Code |
| 6 | DeBERTa-Large 304M | 80.2 | No | Two is Better than Many? Binary Classification a... | 2022-10-29 | Code |
| 7 | DeBERTa-Large 304M (classification-based) | 79.9 | No | Two is Better than Many? Binary Classification a... | 2022-10-29 | Code |
| 8 | UnifiedQA 3B | 79.8 | No | UnifiedQA: Crossing Format Boundaries With a Sin... | 2020-05-02 | Code |
| 9 | ExDeBERTa 567M | 79.6 | No | Task Compass: Scaling Multi-task Pre-training wi... | 2022-10-12 | Code |
| 10 | LLaMA-3 8B + MixLoRA | 78.8 | No | MixLoRA: Enhancing Large Language Models Fine-Tu... | 2024-04-22 | Code |
| 11 | LLaMA-2 7B + MixLoRA | 78 | No | MixLoRA: Enhancing Large Language Models Fine-Tu... | 2024-04-22 | Code |
| 12 | RoBERTa-Large 355M (fine-tuned) | 76.7 | No | RoBERTa: A Robustly Optimized BERT Pretraining A... | 2019-07-26 | Code |
| 13 | BERT-large 340M (fine-tuned) | 64.5 | No | SocialIQA: Commonsense Reasoning about Social In... | 2019-04-22 | Code |
| 14 | BERT-base 110M (fine-tuned) | 63.1 | No | SocialIQA: Commonsense Reasoning about Social In... | 2019-04-22 | Code |
| 15 | GPT-1 117M (fine-tuned) | 63 | No | SocialIQA: Commonsense Reasoning about Social In... | 2019-04-22 | Code |
| 16 | phi-1.5-web 1.3B (zero-shot) | 53 | No | Textbooks Are All You Need II: phi-1.5 technical... | 2023-09-11 | Code |
| 17 | phi-1.5 1.3B (zero-shot) | 52.6 | No | Textbooks Are All You Need II: phi-1.5 technical... | 2023-09-11 | Code |
| 18 | LLaMA 65B (zero-shot) | 52.3 | No | LLaMA: Open and Efficient Foundation Language Mo... | 2023-02-27 | Code |
| 19 | Chinchilla (zero-shot) | 51.3 | No | Training Compute-Optimal Large Language Models | 2022-03-29 | Code |
| 20 | Gopher (zero-shot) | 50.6 | No | Scaling Language Models: Methods, Analysis & Ins... | 2021-12-08 | Code |
| 21 | LLaMA 13B (zero-shot) | 50.4 | No | LLaMA: Open and Efficient Foundation Language Mo... | 2023-02-27 | Code |
| 22 | LLaMA 33B (zero-shot) | 50.4 | No | LLaMA: Open and Efficient Foundation Language Mo... | 2023-02-27 | Code |
| 23 | LLaMA 7B (zero-shot) | 48.9 | No | LLaMA: Open and Efficient Foundation Language Mo... | 2023-02-27 | Code |
| 24 | Random chance baseline | 33.3 | No | SocialIQA: Commonsense Reasoning about Social In... | 2019-04-22 | Code |