Robot Manipulation on CALVIN
Metric: avg. sequence length (D to D) (higher is better)
Results
| # | Model | Avg. sequence length (D to D) | SOTA | Extra data | Paper | Date | Code |
|---|---|---|---|---|---|---|---|
| 1 | DreamVLA | 4.44 | Yes | No | DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge | 2025-07-06 | Yes |
| 2 | VPP | 4.29 | Yes | No | Video Prediction Policy: A Generalist Robot Policy with Predictive Visual Representations | 2024-12-19 | Yes |
| 3 | RoboVLMs | 4.25 | Yes | No | Towards Generalist Robot Policies: What Matters in Building Vision-Language-Action Models | 2024-12-18 | Yes |
| 4 | OpenHelix | 4.08 | - | No | OpenHelix: A Short Survey, Empirical Analysis, and Open-Source Dual-System VLA Model for Robotic Manipulation | 2025-05-06 | Yes |
| 5 | UP-VLA | 4.08 | - | No | UP-VLA: A Unified Understanding and Prediction Model for Embodied Agent | 2025-01-31 | - |
| 6 | GR-MG | 4.04 | Yes | No | GR-MG: Leveraging Partially Annotated Data via Multi-Modal Goal-Conditioned Policy | 2024-08-26 | Yes |
| 7 | MoDE | 4.01 | - | No | Efficient Diffusion Transformer Policies with Mixture of Expert Denoisers for Multitask Learning | 2024-12-17 | Yes |
| 8 | RoboUniView | 3.855 | Yes | No | RoboUniView: Visual-Language Model with Unified View Representation for Robotic Manipulation | 2024-06-27 | Yes |
| 9 | UniVLA | 3.8 | - | No | UniVLA: Learning to Act Anywhere with Task-centric Latent Actions | 2025-05-09 | Yes |
| 10 | RoboDual | 3.66 | - | No | Towards Synergistic, Generalized, and Efficient Dual-System for Robotic Manipulation | 2024-10-10 | - |
| 11 | VidMan | 3.42 | - | No | VidMan: Exploiting Implicit Dynamics from Video Diffusion Model for Effective Robot Manipulation | 2024-11-14 | - |
| 12 | 3DDA | 3.35 | Yes | No | 3D Diffuser Actor: Policy Diffusion with 3D Scene Representations | 2024-02-16 | Yes |
| 13 | OpenVLA | 3.27 | - | No | OpenVLA: An Open-Source Vision-Language-Action Model | 2024-06-13 | Yes |
| 14 | 3D Diffuser Actor | 3.27 | - | No | 3D Diffuser Actor: Policy Diffusion with 3D Scene Representations | 2024-02-16 | Yes |
| 15 | GR-1 | 3.06 | Yes | No | Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation | 2023-12-20 | Yes |
| 16 | RoboFlamingo | 2.47 | Yes | No | Vision-Language Foundation Models as Effective Robot Imitators | 2023-11-02 | - |
| 17 | LCB | 1.78 | - | No | From LLMs to Actions: Latent Codes as Bridges in Hierarchical Robot Control | 2024-05-08 | - |
| 18 | Uni-Pi | 0.92 | Yes | No | Learning Universal Policies via Text-Guided Video Generation | 2023-01-31 | - |
| 19 | RT-1 | 0.9 | Yes | No | RT-1: Robotics Transformer for Real-World Control at Scale | 2022-12-13 | Yes |