Question Answering on DROP
Metric: Accuracy (higher is better)
Results
| # | Model | Accuracy | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | PaLM 540B (Self Improvement, Self Consistency) | 83 | No | Large Language Models Can Self-Improve | 2022-10-20 | - |
| 2 | PaLM 540B (Self Consistency) | 78.2 | No | Large Language Models Can Self-Improve | 2022-10-20 | - |
| 3 | PaLM 540B (Self Improvement, CoT Prompting) | 76.2 | No | Large Language Models Can Self-Improve | 2022-10-20 | - |
| 4 | PaLM 540B (Self Improvement, Standard-Prompting) | 71.7 | No | Large Language Models Can Self-Improve | 2022-10-20 | - |
| 5 | PaLM 540B (CoT Prompting) | 70.6 | No | Large Language Models Can Self-Improve | 2022-10-20 | - |
| 6 | PaLM 540B (Standard-Prompting) | 60 | No | Large Language Models Can Self-Improve | 2022-10-20 | - |