Question Answering on QuALITY

Metric: Accuracy (higher is better)

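| # | Model | Accuracy | Extra Data | Paper | Date | Code |
|---|-------|----------|------------|-------|------|------|
| 1 | Claude 1.3 (5-shot) | 84.1 | No | - | - | - |
| 2 | Claude 2 (5-shot) | 83.2 | No | - | - | - |
| 3 | RAPTOR + GPT-4 (June 2023) | 82.6 | No | RAPTOR: Recursive Abstractive Processing for Tre... | 2024-01-31 | Code |
| 4 | Claude Instant 1.1 (5-shot) | 80.5 | No | - | - | - |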