Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Mathematical Reasoning on Lila (OOD)

Metric: Accuracy (higher is better)

Results

| # | Model | Accuracy | Extra Data | Paper | Date | Code |
|---|-------|----------|------------|-------|------|------|
| 1 | Codex (Few-Shot, 175B) | 0.586 | No | Lila: A Unified Benchmark for Mathematical Reaso... | 2022-10-31 | Code |
| 2 | Bhāskara-P (Fine-tuned, 2.7B) | 0.448 | No | Lila: A Unified Benchmark for Mathematical Reaso... | 2022-10-31 | Code |
| 3 | GPT-3 (Few-Shot, 175B) | 0.384 | No | Lila: A Unified Benchmark for Mathematical Reaso... | 2022-10-31 | Code |
| 4 | Bhāskara-A (Fine-tuned, 2.7B) | 0.268 | No | Lila: A Unified Benchmark for Mathematical Reaso... | 2022-10-31 | Code |
| 5 | Neo-P (Fine-tuned, 2.7B) | 0.238 | No | Lila: A Unified Benchmark for Mathematical Reaso... | 2022-10-31 | Code |
| 6 | Neo-A (Fine-tuned, 2.7B) | 0.177 | No | Lila: A Unified Benchmark for Mathematical Reaso... | 2022-10-31 | Code |
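
The ranking rule stated for this leaderboard (accuracy, higher is better) can be sketched in a few lines of Python. This is only an illustration: the `Result` type and the `rank` helper are hypothetical names, not part of any Papers With Code tooling, and the rows are copied by hand from the results listed above in a deliberately shuffled order.

```python
from typing import List, NamedTuple


class Result(NamedTuple):
    """One leaderboard entry (hypothetical type, for illustration only)."""
    model: str
    accuracy: float


# Rows copied from the results above, deliberately out of order.
RESULTS = [
    Result("Neo-A (Fine-tuned, 2.7B)", 0.177),
    Result("GPT-3 (Few-Shot, 175B)", 0.384),
    Result("Codex (Few-Shot, 175B)", 0.586),
    Result("Neo-P (Fine-tuned, 2.7B)", 0.238),
    Result("Bhāskara-P (Fine-tuned, 2.7B)", 0.448),
    Result("Bhāskara-A (Fine-tuned, 2.7B)", 0.268),
]


def rank(results: List[Result]) -> List[Result]:
    """Order entries by accuracy, highest first (higher is better)."""
    return sorted(results, key=lambda r: r.accuracy, reverse=True)


ranked = rank(RESULTS)
for position, entry in enumerate(ranked, start=1):
    print(f"{position}. {entry.model}: {entry.accuracy:.3f}")
```

Sorting with `reverse=True` reproduces the order shown in the results, with Codex (Few-Shot, 175B) first and Neo-A (Fine-tuned, 2.7B) last.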