10-shot image generation on COCO minival
Metric: AP (higher is better)
Results
| # | Model | AP | Extra Data | Paper | Date | Code |
|---|-------|----|------------|-------|------|------|
| 1 | OpenSeeD (SwinL, single-scale) | 53.2 | Yes | A Simple Framework for Open-Vocabulary Segmentat... | 2023-03-14 | Code |
| 2 | OneFormer (InternImage-H, single-scale) | 52 | No | OneFormer: One Transformer to Rule Universal Ima... | 2022-11-10 | Code |
| 3 | Mask DINO (SwinL, single-scale) | 50.9 | Yes | Mask DINO: Towards A Unified Transformer-based F... | 2022-06-06 | Code |
| 4 | UMG-CLIP-E/14 | 50.7 | Yes | UMG-CLIP: A Unified Multi-Granularity Vision Gen... | 2024-01-12 | Code |
| 5 | UMG-CLIP-L/14 | 49.7 | Yes | UMG-CLIP: A Unified Multi-Granularity Vision Gen... | 2024-01-12 | Code |
| 6 | DiNAT-L (single-scale, Mask2Former) | 49.2 | No | Dilated Neighborhood Attention Transformer | 2022-09-29 | Code |
| 7 | OneFormer (DiNAT-L, single-scale) | 49.2 | No | OneFormer: One Transformer to Rule Universal Ima... | 2022-11-10 | Code |
| 8 | OneFormer (Swin-L, single-scale) | 49 | No | OneFormer: One Transformer to Rule Universal Ima... | 2022-11-10 | Code |
| 9 | ViT-Adapter-L (single-scale, BEiTv2 pretrain, Mask2Former) | 48.9 | No | Vision Transformer Adapter for Dense Predictions | 2022-05-17 | Code |
| 10 | Mask2Former (single-scale) | 48.6 | No | Masked-attention Mask Transformer for Universal ... | 2021-12-02 | Code |
| 11 | FocalNet-L (Mask2Former (200 queries)) | 48.4 | No | Focal Modulation Networks | 2022-03-22 | Code |
| 12 | PanopticFPN++ | 39.7 | No | End-to-End Object Detection with Transformers | 2020-05-26 | Code |
| 13 | DETR-R101 (ResNet-101) | 33 | No | End-to-End Object Detection with Transformers | 2020-05-26 | Code |
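The ordering of the results listed above follows from the metric line (AP, higher is better). As a minimal sketch of that ranking, the snippet below sorts a handful of rows copied from the results by descending AP. The names `results` and `rank_by_ap` are hypothetical and introduced only for illustration. How the leaderboard itself breaks ties (for example, the two entries at 49.2) is not stated in the source, so keeping the input order for tied rows is an assumption.

```python
# A subset of (model, AP) rows copied from the results above, deliberately
# given out of order so the sort has something to do.
results = [
    ("Mask2Former (single-scale)", 48.6),
    ("OpenSeeD (SwinL, single-scale)", 53.2),
    ("DiNAT-L (single-scale, Mask2Former)", 49.2),
    ("OneFormer (DiNAT-L, single-scale)", 49.2),
    ("DETR-R101 (ResNet-101)", 33.0),
]


def rank_by_ap(rows):
    """Return (rank, model, ap) tuples ordered by AP, highest first.

    sorted() is stable, so rows with equal AP keep their input order.
    That tie-breaking rule is an assumption, not the leaderboard's
    documented behaviour.
    """
    ordered = sorted(rows, key=lambda row: row[1], reverse=True)
    return [(rank, model, ap) for rank, (model, ap) in enumerate(ordered, start=1)]


ranking = rank_by_ap(results)
for rank, model, ap in ranking:
    print(f"{rank:>2}  {ap:5.1f}  {model}")
```

Tied rows receive consecutive ranks here, which matches how the two 49.2 entries appear at positions 6 and 7 in the results.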