Visual Question Answering (VQA) on ScanQA Test w/ objects

Metric: BLEU-1 (higher is better)
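
For readers unfamiliar with the metric, the following is a minimal sketch of sentence-level BLEU-1 (clipped unigram precision multiplied by a brevity penalty). It assumes lowercased whitespace tokenization, and the function name `bleu1` is hypothetical. The benchmark's actual scorer and tokenization are not given on this page, and the scores in the table appear to be on a 0-100 scale, so this sketch is not expected to reproduce them exactly.

```python
import math
from collections import Counter


def bleu1(candidate, references):
    """Sentence-level BLEU-1 in [0, 1] (illustrative sketch, not the official scorer)."""
    # Assumption: lowercased whitespace tokenization.
    cand = candidate.lower().split()
    refs = [r.lower().split() for r in references]
    if not cand or not refs:
        return 0.0

    # For each token, the maximum count observed in any single reference.
    max_ref = Counter()
    for ref in refs:
        for tok, n in Counter(ref).items():
            max_ref[tok] = max(max_ref[tok], n)

    # Clipped unigram precision: a candidate token is credited at most
    # as many times as it occurs in the most generous reference.
    clipped = sum(min(n, max_ref[tok]) for tok, n in Counter(cand).items())
    precision = clipped / len(cand)

    # Brevity penalty against the reference length closest to the
    # candidate length (ties resolved toward the shorter reference).
    ref_len = min((len(r) for r in refs), key=lambda n: (abs(n - len(cand)), n))
    bp = 1.0 if len(cand) >= ref_len else math.exp(1 - ref_len / len(cand))
    return bp * precision
```

Multiplying the result by 100 puts it on the same scale as the table; benchmark numbers are typically aggregated over the whole test set rather than averaged per sentence.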

Results

#	Model	BLEU-1	SOTA	Extra Data	Paper	Date	Code
1	NaviLLM	39.73	SOTA	No	Towards Learning a Generalist Model for Embodied Navigation	2023-12-04	Code
2	3D-LLM (BLIP2-flant5)	38.3	SOTA	No	3D-LLM: Injecting the 3D World into Large Language Models	2023-07-24	Code
3	3D-LLM (BLIP2-opt)	37.3	-	No	3D-LLM: Injecting the 3D World into Large Language Models	2023-07-24	Code
4	BridgeQA	34.49	-	No	Bridging the Gap between 2D and 3D Visual Question Answering: A Fusion Approach for 3D VQA	2024-02-24	Code
5	3D-LLM (flamingo)	32.6	-	No	3D-LLM: Injecting the 3D World into Large Language Models	2023-07-24	Code
6	ScanQA	31.56	SOTA	No	ScanQA: 3D Question Answering for Spatial Scene Understanding	2021-12-20	Code
7	VoteNet+MCAN	29.46	-	No	ScanQA: 3D Question Answering for Spatial Scene Understanding	2021-12-20	Code
8	ScanRefer+MCAN	27.85	-	No	ScanQA: 3D Question Answering for Spatial Scene Understanding	2021-12-20	Code