Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Visual Question Answering (VQA) on 6-DoF SpatialBench

Metric: Position-rel (higher is better)


Results

| # | Model | Position-rel | Extra Data | Paper | Date | Code |
|---|-------|--------------|------------|-------|------|------|
| 1 | SoFar | 59.6 | No | SoFar: Language-Grounded Orientation Bridges Spa... | 2025-02-18 | Code |
| 2 | SpatialBot | 50.9 | No | SpatialBot: Precise Spatial Understanding with V... | 2024-06-19 | Code |
| 3 | GPT-4o | 49.4 | No | GPT-4o System Card | 2024-10-25 | - |
| 4 | RoboPoint | 43.8 | No | RoboPoint: A Vision-Language Model for Spatial A... | 2024-06-15 | - |
| 5 | SpaceMantis | 33.6 | No | SpatialVLM: Endowing Vision-Language Models with... | 2024-01-22 | - |
| 6 | SpaceLLaVA | 32.4 | No | SpatialVLM: Endowing Vision-Language Models with... | 2024-01-22 | - |
| 7 | LLaVA-1.5 | 30.9 | No | Improved Baselines with Visual Instruction Tuning | 2023-10-05 | Code |