Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


3D Diffusion Policy: Generalizable Visuomotor Policy Learning via Simple 3D Representations

Yanjie Ze, Gu Zhang, Kangning Zhang, Chenyuan Hu, Muhan Wang, Huazhe Xu

2024-03-06 · Imitation Learning · Robot Manipulation

Paper · PDF · Code (official)

Abstract

Imitation learning provides an efficient way to teach robots dexterous skills; however, learning complex skills robustly and generalizably usually requires large amounts of human demonstrations. To tackle this challenging problem, we present 3D Diffusion Policy (DP3), a novel visual imitation learning approach that incorporates the power of 3D visual representations into diffusion policies, a class of conditional action generative models. The core design of DP3 is the utilization of a compact 3D visual representation, extracted from sparse point clouds with an efficient point encoder. In our experiments involving 72 simulation tasks, DP3 successfully handles most tasks with just 10 demonstrations and surpasses baselines with a 24.2% relative improvement. In 4 real robot tasks, DP3 demonstrates precise control with a high success rate of 85%, given only 40 demonstrations per task, and shows excellent generalization abilities across diverse aspects, including space, viewpoint, appearance, and instance. Interestingly, in real robot experiments, DP3 rarely violates safety requirements, in contrast to baseline methods, which frequently do and thus necessitate human intervention. Our extensive evaluation highlights the critical importance of 3D representations in real-world robot learning. Videos, code, and data are available at https://3d-diffusion-policy.github.io .
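The abstract's core design point is a compact 3D representation extracted from a sparse point cloud by an efficient point encoder, whose output conditions the diffusion policy. A minimal sketch of that idea is below, assuming a PointNet-style encoder (per-point MLP followed by max-pooling); the weights, layer sizes, and function names are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def point_encoder(points, w1, w2):
    """Encode a sparse point cloud of shape (N, 3) into one compact
    global feature vector, PointNet-style: a per-point MLP followed by
    an order-invariant max-pool over the points."""
    h = np.maximum(points @ w1, 0.0)   # per-point hidden layer (ReLU)
    feats = np.maximum(h @ w2, 0.0)    # per-point output features (ReLU)
    return feats.max(axis=0)           # max-pool over points -> (D,)

# Illustrative weights and a random sparse cloud of 512 points.
rng = np.random.default_rng(0)
w1 = rng.normal(size=(3, 32))
w2 = rng.normal(size=(32, 64))
cloud = rng.normal(size=(512, 3))

feature = point_encoder(cloud, w1, w2)

# The max-pool makes the feature invariant to point ordering, which is
# why such encoders suit unordered point-cloud input.
shuffled = cloud[rng.permutation(len(cloud))]
assert np.allclose(feature, point_encoder(shuffled, w1, w2))
```

In DP3 this compact feature (rather than raw images or dense voxels) would serve as the observation conditioning for the diffusion-based action generator; the permutation-invariance check above shows why pooling over points is the natural aggregation step.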

Results

Task               | Dataset  | Metric                                | Value | Model
Robot Manipulation | MimicGen | Succ. Rate (12 tasks, 100 demo/task)  | 23.9  | DP3 (Evaluated in EquiDiff)
Robot Manipulation | MimicGen | Succ. Rate (12 tasks, 200 demo/task)  | 35.1  | DP3 (Evaluated in EquiDiff)
Robot Manipulation | MimicGen | Succ. Rate (12 tasks, 1000 demo/task) | 56.8  | DP3 (Evaluated in EquiDiff)

Related Papers

- The Imitation Game: Turing Machine Imitator is Length Generalizable Reasoner (2025-07-17)
- Supervised Fine Tuning on Curated Data is Reinforcement Learning (and can be improved) (2025-07-17)
- EC-Flow: Enabling Versatile Robotic Manipulation from Action-Unlabeled Videos via Embodiment-Centric Flow (2025-07-08)
- Fast Bilateral Teleoperation and Imitation Learning Using Sensorless Force Control via Accurate Dynamics Model (2025-07-08)
- LeAD: The LLM Enhanced Planning System Converged with End-to-end Autonomous Driving (2025-07-08)
- DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge (2025-07-06)
- Geometry-aware 4D Video Generation for Robot Manipulation (2025-07-01)
- Advancing Learnable Multi-Agent Pathfinding Solvers with Active Fine-Tuning (2025-06-30)