Question Answering on DROP

Metric: Accuracy (higher is better)

| # | Model | Accuracy | Extra Data | Paper | Date | Code |
|---|-------|----------|------------|-------|------|------|
| 1 | PaLM 540B (Self Improvement, Self Consistency) | 83 | No | Large Language Models Can Self-Improve | 2022-10-20 | - |
| 2 | PaLM 540B (Self Consistency) | 78.2 | No | Large Language Models Can Self-Improve | 2022-10-20 | - |
| 3 | PaLM 540B (Self Improvement, CoT Prompting) | 76.2 | No | Large Language Models Can Self-Improve | 2022-10-20 | - |
| 4 | PaLM 540B (Self Improvement, Standard-Prompting) | 71.7 | No | Large Language Models Can Self-Improve | 2022-10-20 | - |
| 5 | PaLM 540B (CoT Prompting) | 70.6 | No | Large Language Models Can Self-Improve | 2022-10-20 | - |
| 6 | PaLM 540B (Standard-Prompting) | 60 | No | Large Language Models Can Self-Improve | 2022-10-20 | - |