Metric: pass@1 (higher is better)
| # | Model | pass@1 | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | claude-3-5-sonnet | 0.679 | No | A Case Study of Web App Coding with OpenAI Reaso... | 2024-09-19 | Code |
| 2 | o1-mini | 0.667 | No | A Case Study of Web App Coding with OpenAI Reaso... | 2024-09-19 | Code |
| 3 | o1-preview | 0.652 | No | A Case Study of Web App Coding with OpenAI Reaso... | 2024-09-19 | Code |
| 4 | gpt-4o-2024-08-06 | 0.531 | No | A Case Study of Web App Coding with OpenAI Reaso... | 2024-09-19 | Code |
| 5 | deepseek-v2.5 | 0.490 | No | A Case Study of Web App Coding with OpenAI Reaso... | 2024-09-19 | Code |
| 6 | mistral-large-2 | 0.449 | No | A Case Study of Web App Coding with OpenAI Reaso... | 2024-09-19 | Code |
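
The leaderboard does not say how its pass@1 values were computed. As a rough illustration only, the sketch below assumes the commonly used unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), averaged over problems; for k = 1 this reduces to the fraction of sampled solutions that pass. The function names `pass_at_k` and `mean_pass_at_k` and the sample counts in the usage note are hypothetical and are not taken from the table or the cited paper.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem from n sampled solutions, c of which pass.

    Assumption: the standard unbiased estimator 1 - C(n-c, k) / C(n, k).
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples: every size-k subset contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int = 1) -> float:
    """Average the per-problem estimate over (n, c) pairs, one pair per problem."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

For example, with one sample per problem, `mean_pass_at_k([(1, 1), (1, 0), (1, 1)])` is simply the share of problems solved on the first attempt, here two out of three.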