Referring expression generation on ColonINST-v1 (Unseen)
Metric: Accuracy (higher is better)
Results
| # | Model | Accuracy | SOTA | Extra Data | Paper | Date | Code |
|---|-------|----------|------|------------|-------|------|------|
| 1 | ColonGPT (w/ LoRA, w/o extra data) | 80.18 | SOTA | No | Frontiers in Intelligent Colonoscopy | 2024-10-22 | Yes |
| 2 | MobileVLM-1.7B (w/ LoRA, w/ extra data) | 78.03 | SOTA | No | MobileVLM : A Fast, Strong and Open Vision Language Assistant for Mobile Devices | 2023-12-28 | Yes |
| 3 | LLaVA-Med-v1.0 (w/o LoRA, w/ extra data) | 75.25 | SOTA | No | LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day | 2023-06-01 | Yes |
| 4 | Bunny-v1.0-3B (w/ LoRA, w/ extra data) | 75.08 | | No | Efficient Multimodal Learning from Data-centric Perspective | 2024-02-18 | Yes |
| 5 | LLaVA-Med-v1.0 (w/o LoRA, w/o extra data) | 75.07 | | No | LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day | 2023-06-01 | Yes |
| 6 | MGM-2B (w/o LoRA, w/ extra data) | 74.3 | | No | Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models | 2024-03-27 | Yes |
| 7 | MobileVLM-1.7B (w/o LoRA, w/ extra data) | 73.14 | | No | MobileVLM : A Fast, Strong and Open Vision Language Assistant for Mobile Devices | 2023-12-28 | Yes |
| 8 | LLaVA-Med-v1.5 (w/ LoRA, w/o extra data) | 73.05 | | No | LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day | 2023-06-01 | Yes |
| 9 | LLaVA-v1.5 (w/ LoRA, w/ extra data) | 72.88 | | No | Improved Baselines with Visual Instruction Tuning | 2023-10-05 | Yes |
| 10 | MiniGPT-v2 (w/ LoRA, w/o extra data) | 72.05 | | No | MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning | 2023-10-14 | Yes |
| 11 | LLaVA-v1.5 (w/ LoRA, w/o extra data) | 70.38 | | No | Improved Baselines with Visual Instruction Tuning | 2023-10-05 | Yes |
| 12 | MiniGPT-v2 (w/ LoRA, w/ extra data) | 70.23 | | No | MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning | 2023-10-14 | Yes |
| 13 | LLaVA-Med-v1.5 (w/ LoRA, w/ extra data) | 70 | | No | LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day | 2023-06-01 | Yes |
| 14 | MGM-2B (w/o LoRA, w/o extra data) | 69.81 | | No | Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models | 2024-03-27 | Yes |
| 15 | Bunny-v1.0-3B (w/ LoRA, w/o extra data) | 69.45 | | No | Efficient Multimodal Learning from Data-centric Perspective | 2024-02-18 | Yes |
| 16 | LLaVA-v1 (w/ LoRA, w/o extra data) | 68.11 | SOTA | No | Visual Instruction Tuning | 2023-04-17 | Yes |
| 17 | LLaVA-v1 (w/ LoRA, w/ extra data) | 46.85 | | No | Visual Instruction Tuning | 2023-04-17 | Yes |