Semantic Parsing on Spider 2.0
Metric: Success Rate (higher is better)
Results
| # | Model | Success Rate | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | Spider-Agent + o1-preview | 17.03 | No | Spider 2.0: Evaluating Language Models on Real-W... | 2024-11-12 | - |
| 2 | Spider-Agent + GPT-4o | 10.13 | No | Spider 2.0: Evaluating Language Models on Real-W... | 2024-11-12 | - |
| 3 | Spider-Agent + Claude-3.5-Sonnet | 9.02 | No | Spider 2.0: Evaluating Language Models on Real-W... | 2024-11-12 | - |
| 4 | Spider-Agent + GPT-4 | 8.86 | No | Spider 2.0: Evaluating Language Models on Real-W... | 2024-11-12 | - |
| 5 | Spider-Agent + Qwen2.5-72B | 6.17 | No | Spider 2.0: Evaluating Language Models on Real-W... | 2024-11-12 | - |
| 6 | Spider-Agent + DeepSeek-V2.5 | 5.22 | No | Spider 2.0: Evaluating Language Models on Real-W... | 2024-11-12 | - |
| 7 | Spider-Agent + Gemini-Pro-1.5 | 2.53 | No | Spider 2.0: Evaluating Language Models on Real-W... | 2024-11-12 | - |
| 8 | Spider-Agent + Llama-3.1-405B | 2.21 | No | Spider 2.0: Evaluating Language Models on Real-W... | 2024-11-12 | - |