TasksSotADatasetsPapersMethodsSubmitAbout
Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Explore

Notable BenchmarksAll SotADatasetsPapersMethods

Community

Submit ResultsAbout

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Papers/PHYRE: A New Benchmark for Physical Reasoning

PHYRE: A New Benchmark for Physical Reasoning

Anton Bakhtin, Laurens van der Maaten, Justin Johnson, Laura Gustafson, Ross Girshick

2019-08-15NeurIPS 2019 12Visual Reasoning
PaperPDFCode(official)Code

Abstract

Understanding and reasoning about physics is an important ability of intelligent agents. We develop the PHYRE benchmark for physical reasoning that contains a set of simple classical mechanics puzzles in a 2D physical environment. The benchmark is designed to encourage the development of learning algorithms that are sample-efficient and generalize well across puzzles. We test several modern learning algorithms on PHYRE and find that these algorithms fall short in solving the puzzles efficiently. We expect that PHYRE will encourage the development of novel sample-efficient agents that learn efficient but useful models of physics. For code and to play PHYRE for yourself, please visit https://player.phyre.ai.

Results

TaskDatasetMetricValueModel
Visual ReasoningPHYRE-1B-WithinAUCCESS77.6DQN
Visual ReasoningPHYRE-1B-CrossAUCCESS36.8DQN

Related Papers

LaViPlan : Language-Guided Visual Path Planning with RLVR2025-07-17Beyond Task-Specific Reasoning: A Unified Conditional Generative Framework for Abstract Visual Reasoning2025-07-15PyVision: Agentic Vision with Dynamic Tooling2025-07-10Orchestrator-Agent Trust: A Modular Agentic AI Visual Classification System with Trust-Aware Orchestration and RAG-Based Reasoning2025-07-09MagiC: Evaluating Multimodal Cognition Toward Grounded Visual Reasoning2025-07-09Skywork-R1V3 Technical Report2025-07-08High-Resolution Visual Reasoning via Multi-Turn Grounding-Based Reinforcement Learning2025-07-08Open Vision Reasoner: Transferring Linguistic Cognitive Behavior for Visual Reasoning2025-07-07