Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

LPW (GPT-4o)

Reported on 9 benchmarks across 1 task · 1 paper · 8 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.
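A minimal sketch in Python of what exact-name matching implies. The record layout, the `group_by_exact_name` helper, and the "hypothetical-other-paper" rows are assumptions made up for illustration; they are not the site's actual schema or code.

```python
from collections import defaultdict

# Hypothetical result rows. Only the first row reflects this page;
# the other two are invented to show how exact string matching behaves.
rows = [
    {"model": "LPW (GPT-4o)", "paper": "arXiv:2411.14503", "benchmark": "MBPP"},
    {"model": "LPW (GPT-4o)", "paper": "hypothetical-other-paper", "benchmark": "MBPP"},
    {"model": "LPW (GPT-4o-mini)", "paper": "hypothetical-other-paper", "benchmark": "MBPP"},
]

def group_by_exact_name(rows):
    """Group rows by model name, compared character for character."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["model"]].append(row)
    return dict(groups)

groups = group_by_exact_name(rows)
for name, members in groups.items():
    papers = sorted({r["paper"] for r in members})
    print(f"{name}: {len(members)} result(s) from {len(papers)} paper(s)")
```

In this sketch, two papers that reuse the name "LPW (GPT-4o)" land in the same group even if they describe different variants, while "LPW (GPT-4o-mini)" gets a group of its own.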

Natural Language Processing · 9 results

Code Generation on HumanEval-ET
Pass@1 · 2024-11-21
65.8
best: 87.19 (EG-CFG (DeepSeek-V3-0324))
SOTA
Planning-Driven Programming: A Large Language Model Programming Workflow arXiv:2411.14503
Code Generation on APPS
Competition Pass@1 · 2024-11-21
34.8
SOTA
Planning-Driven Programming: A Large Language Model Programming Workflow arXiv:2411.14503
Code Generation on APPS
Interview Pass@1 · 2024-11-21
65.2
SOTA
Planning-Driven Programming: A Large Language Model Programming Workflow arXiv:2411.14503
Code Generation on APPS
Introductory Pass@1 · 2024-11-21
87.2
SOTA
Planning-Driven Programming: A Large Language Model Programming Workflow arXiv:2411.14503
Code Generation on APPS
Pass@1 · 2024-11-21
62.6
SOTA
Planning-Driven Programming: A Large Language Model Programming Workflow arXiv:2411.14503
Code Generation on CodeContests
Test Set Pass@1 · 2024-11-21
34.7
best: 58.18 (EG-CFG (DeepSeek-V3-0324))
SOTA
Planning-Driven Programming: A Large Language Model Programming Workflow arXiv:2411.14503
Code Generation on LiveCodeBench
Accuracy · 2024-11-21
59.3
best: 91.6 (Xolver)
SOTA
Planning-Driven Programming: A Large Language Model Programming Workflow arXiv:2411.14503
Code Generation on MBPP-ET
Pass@1 · 2024-11-21
65.8
best: 73 (EG-CFG (DeepSeek-V3-0324))
SOTA
Planning-Driven Programming: A Large Language Model Programming Workflow arXiv:2411.14503
Code Generation on MBPP
Accuracy · 2024-11-21
84.8
best: 96.6 (EG-CFG (DeepSeek-V3-0324))
Planning-Driven Programming: A Large Language Model Programming Workflow arXiv:2411.14503
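
For the entries above that list a "best" value, the gap between LPW (GPT-4o) and the best listed result can be worked out with a short Python sketch. The scores are copied from this page; the `gap_to_best` helper is a hypothetical name, and the sketch assumes every score is a percentage on the same 0-100 scale.

```python
# (benchmark, LPW (GPT-4o) score, best listed score), copied from the entries above.
# APPS entries are omitted because they list no "best" value.
results = [
    ("HumanEval-ET", 65.8, 87.19),
    ("CodeContests", 34.7, 58.18),
    ("LiveCodeBench", 59.3, 91.6),
    ("MBPP-ET", 65.8, 73.0),
    ("MBPP", 84.8, 96.6),
]

def gap_to_best(score, best):
    """Return best minus score, rounded to two decimals (hypothetical helper)."""
    return round(best - score, 2)

gaps = {name: gap_to_best(score, best) for name, score, best in results}

# Print benchmarks from the smallest gap to the largest.
for name, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print(f"{name}: {gap} points behind the best listed result")
```

Under these assumptions the smallest gap is on MBPP-ET and the largest is on LiveCodeBench.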