Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Question Answering on NewsQA

Metric: EM (higher is better)
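As a rough illustration of what an EM (exact match) figure measures, here is a minimal Python sketch. It assumes SQuAD-style answer normalization (lowercasing, stripping punctuation and English articles, collapsing whitespace), which may differ from the evaluation script used for the NewsQA results listed here. The function names `normalize_answer`, `exact_match`, and `em_score` are hypothetical and chosen only for this example.

```python
import re
import string
from typing import Sequence


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: SQuAD-style normalization; the leaderboard's own
    evaluation may normalize differently.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, references: Sequence[str]) -> bool:
    """True if the normalized prediction equals any normalized reference."""
    pred = normalize_answer(prediction)
    return any(pred == normalize_answer(ref) for ref in references)


def em_score(predictions: Sequence[str],
             references: Sequence[Sequence[str]]) -> float:
    """Percentage of predictions that exactly match a reference answer."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not predictions:
        raise ValueError("at least one prediction is required")
    hits = sum(exact_match(p, refs) for p, refs in zip(predictions, references))
    return 100.0 * hits / len(predictions)


if __name__ == "__main__":
    # Illustrative toy data, not taken from NewsQA.
    preds = ["The Eiffel Tower", "Paris, France"]
    refs = [["Eiffel Tower"], ["paris"]]
    print(em_score(preds, refs))
```

Under this definition a prediction scores only when it matches a reference answer in full after normalization, so partially correct spans count as misses.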


Results

| # | Model | EM | Extra Data | Paper | Date | Code |
|---|-------|----|------------|-------|------|------|
| 1 | OpenAI/o3-2025-01-31-high | 92.52 | Yes | o3-mini vs DeepSeek-R1: Which One is Safer? | 2025-01-30 | Code |
| 2 | Riple/Saanvi-v0.5-DeepAnalysis | 92.14 | Yes | DeepSense: A Unified Deep Learning Framework for... | 2016-11-07 | Code |
| 3 | OpenAI/o4-mini-2025-05-01-high | 88.24 | Yes | Thinking Like Transformers | 2021-06-13 | Code |
| 4 | OpenAI/o1-2024-12-17-high | 81.44 | Yes | 0/1 Deep Neural Networks via Block Coordinate De... | 2022-06-19 | - |
| 5 | deepseek-r1 | 80.57 | Yes | DeepSeek-R1: Incentivizing Reasoning Capability ... | 2025-01-22 | Code |
| 6 | Anthropic/claude-3-7-sonnet | 74.23 | No | - | - | - |
| 7 | Riple/Saanvi-v0.1 | 72.61 | No | Time-series Transformer Generative Adversarial N... | 2022-05-23 | Code |
| 8 | xAI/grok-3-1212 | 70.57 | Yes | XAI for Transformers: Better Explanations throug... | 2022-02-15 | Code |
| 9 | OpenAI/GPT-4o | 70.21 | Yes | GPT-4o as the Gold Standard: A Scalable and Gene... | 2024-10-03 | - |
| 10 | Google/Gemini 2.5 Pro | 68.75 | Yes | Gemini 1.5: Unlocking multimodal understanding a... | 2024-03-08 | Code |
| 11 | BERT+ASGen | 54.7 | No | - | - | - |
| 12 | DecaProp | 53.1 | No | Densely Connected Attention Propagation for Read... | 2018-11-10 | Code |
| 13 | MINIMAL(Dyn) | 50.1 | Yes | Efficient and Robust Question Answering from Min... | 2018-05-21 | Code |
| 14 | AMANDA | 48.4 | No | A Question-Focused Multi-Factor Attention Networ... | 2018-01-25 | Code |
| 15 | FastQAExt | 43.7 | Yes | Making Neural QA as Simple as Possible but not S... | 2017-03-14 | Code |