10-shot image generation on DeLiVER
Metric: mIoU (higher is better)
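For readers unfamiliar with the metric, the following is a minimal sketch of how mIoU (mean intersection-over-union) is commonly computed for label maps. The page does not state the exact evaluation protocol behind the numbers below, so the choices here (accumulating counts over all pixels, skipping classes absent from both prediction and ground truth, reporting a percentage) are assumptions, and the function name `mean_iou` is hypothetical.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Return mIoU as a percentage over flattened per-pixel class labels.

    Assumption: classes that appear in neither pred nor target are skipped
    rather than counted as IoU = 0.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes  # pixels where pred == target == c
    pred_count = [0] * num_classes    # pixels predicted as class c
    target_count = [0] * num_classes  # pixels labelled as class c

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:
            ious.append(intersection[c] / union)

    return 100.0 * sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Toy example: four pixels, two classes.
    print(round(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2), 2))
```

In the toy example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is about 58.33. Real evaluation code may additionally mask out an ignore label before counting, which this sketch does not do.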
Results
| # | Model | mIoU | Extra Data | Paper | Date | Code |
|---|-------|------|------------|-------|------|------|
| 1 | CAFuser-CAA | 68.6 | No | CAFuser: Condition-Aware Multimodal Fusion for R... | 2024-10-14 | Code |
| 2 | StitchFusion(RGB-D-E-LiDAR) | 68.18 | No | StitchFusion: Weaving Any Visual Modalities to E... | 2024-08-02 | Code |
| 3 | GeminiFusion | 66.9 | No | GeminiFusion: Efficient Pixel-wise Multimodal Fu... | 2024-06-03 | Code |
| 4 | StitchFusion (RGB-D-LiDAR) | 66.65 | No | StitchFusion: Weaving Any Visual Modalities to E... | 2024-08-02 | Code |
| 5 | CMNeXt (RGB-D-E-LiDAR) | 66.3 | No | Delivering Arbitrary-Modal Semantic Segmentation | 2023-03-02 | Code |
| 6 | StitchFusion (RGB-D-Event) | 66.03 | No | StitchFusion: Weaving Any Visual Modalities to E... | 2024-08-02 | Code |
| 7 | StitchFusion (RGB-Depth) | 65.75 | No | StitchFusion: Weaving Any Visual Modalities to E... | 2024-08-02 | Code |
| 8 | MemorySAM-B+(R-D-E-L) | 65.38 | No | MemorySAM: Memorize Modalities and Semantics wit... | 2025-03-09 | Code |
| 9 | MemorySAM-B+(R-D) | 63.48 | No | MemorySAM: Memorize Modalities and Semantics wit... | 2025-03-09 | Code |
| 10 | CMX (RGB-Depth) | 62.67 | No | CMX: Cross-Modal Fusion for RGB-X Semantic Segme... | 2022-03-09 | Code |
| 11 | MemorySAM-B+(R-D-E) | 62.42 | No | MemorySAM: Memorize Modalities and Semantics wit... | 2025-03-09 | Code |
| 12 | TokenFusion (RGB-Depth) | 60.25 | No | Multimodal Token Fusion for Vision Transformers | 2022-04-19 | Code |
| 13 | StitchFusion (RGB-LiDAR) | 58.03 | No | StitchFusion: Weaving Any Visual Modalities to E... | 2024-08-02 | Code |
| 14 | StitchFusion (RGB-Event) | 57.44 | No | StitchFusion: Weaving Any Visual Modalities to E... | 2024-08-02 | Code |
| 15 | CMX (RGB-Event) | 56.52 | No | CMX: Cross-Modal Fusion for RGB-X Semantic Segme... | 2022-03-09 | Code |
| 16 | CMX (RGB-LiDAR) | 56.37 | No | CMX: Cross-Modal Fusion for RGB-X Semantic Segme... | 2022-03-09 | Code |
| 17 | MemorySAM-B+(RGB) | 53.22 | No | MemorySAM: Memorize Modalities and Semantics wit... | 2025-03-09 | Code |
| 18 | TokenFusion (RGB-LiDAR) | 53.01 | No | Multimodal Token Fusion for Vision Transformers | 2022-04-19 | Code |
| 19 | HRFuser (RGB-D-E-Li) | 52.97 | No | HRFuser: A Multi-resolution Sensor Fusion Archit... | 2022-06-30 | Code |
| 20 | HRFuser (RGB-D-LiDAR) | 52.72 | No | HRFuser: A Multi-resolution Sensor Fusion Archit... | 2022-06-30 | Code |
| 21 | HRFuser (RGB-Depth) | 51.88 | No | HRFuser: A Multi-resolution Sensor Fusion Archit... | 2022-06-30 | Code |
| 22 | HRFuser (RGB-D-Event) | 51.83 | No | HRFuser: A Multi-resolution Sensor Fusion Archit... | 2022-06-30 | Code |
| 23 | HRFuser (RGB) | 47.95 | No | HRFuser: A Multi-resolution Sensor Fusion Archit... | 2022-06-30 | Code |
| 24 | TokenFusion (RGB-Event) | 45.63 | No | Multimodal Token Fusion for Vision Transformers | 2022-04-19 | Code |
| 25 | HRFuser (RGB-LiDAR) | 43.13 | No | HRFuser: A Multi-resolution Sensor Fusion Archit... | 2022-06-30 | Code |
| 26 | HRFuser (RGB-Event) | 42.22 | No | HRFuser: A Multi-resolution Sensor Fusion Archit... | 2022-06-30 | Code |
#1 · CAFuser-CAA · SOTA · 68.6 mIoU · 2024-10-14 · CAFuser: Condition-Aware Multimodal Fusion for Robust Semantic Perception of Driving Scenes
#2 · StitchFusion(RGB-D-E-LiDAR) · SOTA · 68.18 mIoU · 2024-08-02 · StitchFusion: Weaving Any Visual Modalities to Enhance Multimodal Semantic Segmentation
#3 · GeminiFusion · SOTA · 66.9 mIoU · 2024-06-03 · GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer
#4 · StitchFusion (RGB-D-LiDAR) · 66.65 mIoU · 2024-08-02 · StitchFusion: Weaving Any Visual Modalities to Enhance Multimodal Semantic Segmentation
#5 · CMNeXt (RGB-D-E-LiDAR) · SOTA · 66.3 mIoU · 2023-03-02 · Delivering Arbitrary-Modal Semantic Segmentation
#6 · StitchFusion (RGB-D-Event) · 66.03 mIoU · 2024-08-02 · StitchFusion: Weaving Any Visual Modalities to Enhance Multimodal Semantic Segmentation
#7 · StitchFusion (RGB-Depth) · 65.75 mIoU · 2024-08-02 · StitchFusion: Weaving Any Visual Modalities to Enhance Multimodal Semantic Segmentation
#8 · MemorySAM-B+(R-D-E-L) · 65.38 mIoU · 2025-03-09 · MemorySAM: Memorize Modalities and Semantics with Segment Anything Model 2 for Multi-modal Semantic Segmentation
#9 · MemorySAM-B+(R-D) · 63.48 mIoU · 2025-03-09 · MemorySAM: Memorize Modalities and Semantics with Segment Anything Model 2 for Multi-modal Semantic Segmentation
#10 · CMX (RGB-Depth) · SOTA · 62.67 mIoU · 2022-03-09 · CMX: Cross-Modal Fusion for RGB-X Semantic Segmentation with Transformers
#11 · MemorySAM-B+(R-D-E) · 62.42 mIoU · 2025-03-09 · MemorySAM: Memorize Modalities and Semantics with Segment Anything Model 2 for Multi-modal Semantic Segmentation
#12 · TokenFusion (RGB-Depth) · 60.25 mIoU · 2022-04-19 · Multimodal Token Fusion for Vision Transformers
#13 · StitchFusion (RGB-LiDAR) · 58.03 mIoU · 2024-08-02 · StitchFusion: Weaving Any Visual Modalities to Enhance Multimodal Semantic Segmentation
#14 · StitchFusion (RGB-Event) · 57.44 mIoU · 2024-08-02 · StitchFusion: Weaving Any Visual Modalities to Enhance Multimodal Semantic Segmentation
#15 · CMX (RGB-Event) · 56.52 mIoU · 2022-03-09 · CMX: Cross-Modal Fusion for RGB-X Semantic Segmentation with Transformers
#16 · CMX (RGB-LiDAR) · 56.37 mIoU · 2022-03-09 · CMX: Cross-Modal Fusion for RGB-X Semantic Segmentation with Transformers
#17 · MemorySAM-B+(RGB) · 53.22 mIoU · 2025-03-09 · MemorySAM: Memorize Modalities and Semantics with Segment Anything Model 2 for Multi-modal Semantic Segmentation
#18 · TokenFusion (RGB-LiDAR) · 53.01 mIoU · 2022-04-19 · Multimodal Token Fusion for Vision Transformers
#19 · HRFuser (RGB-D-E-Li) · 52.97 mIoU · 2022-06-30 · HRFuser: A Multi-resolution Sensor Fusion Architecture for 2D Object Detection
#20 · HRFuser (RGB-D-LiDAR) · 52.72 mIoU · 2022-06-30 · HRFuser: A Multi-resolution Sensor Fusion Architecture for 2D Object Detection
#21 · HRFuser (RGB-Depth) · 51.88 mIoU · 2022-06-30 · HRFuser: A Multi-resolution Sensor Fusion Architecture for 2D Object Detection
#22 · HRFuser (RGB-D-Event) · 51.83 mIoU · 2022-06-30 · HRFuser: A Multi-resolution Sensor Fusion Architecture for 2D Object Detection
#23 · HRFuser (RGB) · 47.95 mIoU · 2022-06-30 · HRFuser: A Multi-resolution Sensor Fusion Architecture for 2D Object Detection
#24 · TokenFusion (RGB-Event) · 45.63 mIoU · 2022-04-19 · Multimodal Token Fusion for Vision Transformers
#25 · HRFuser (RGB-LiDAR) · 43.13 mIoU · 2022-06-30 · HRFuser: A Multi-resolution Sensor Fusion Architecture for 2D Object Detection
#26 · HRFuser (RGB-Event) · 42.22 mIoU · 2022-06-30 · HRFuser: A Multi-resolution Sensor Fusion Architecture for 2D Object Detection