Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

10-shot image generation on COCO minival

Metric: AP (higher is better)

Results

| # | Model | AP | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | OpenSeeD (SwinL, single-scale) | 53.2 | Yes | A Simple Framework for Open-Vocabulary Segmentat... | 2023-03-14 | Code |
| 2 | OneFormer (InternImage-H, single-scale) | 52 | No | OneFormer: One Transformer to Rule Universal Ima... | 2022-11-10 | Code |
| 3 | Mask DINO (SwinL, single-scale) | 50.9 | Yes | Mask DINO: Towards A Unified Transformer-based F... | 2022-06-06 | Code |
| 4 | UMG-CLIP-E/14 | 50.7 | Yes | UMG-CLIP: A Unified Multi-Granularity Vision Gen... | 2024-01-12 | Code |
| 5 | UMG-CLIP-L/14 | 49.7 | Yes | UMG-CLIP: A Unified Multi-Granularity Vision Gen... | 2024-01-12 | Code |
| 6 | DiNAT-L (single-scale, Mask2Former) | 49.2 | No | Dilated Neighborhood Attention Transformer | 2022-09-29 | Code |
| 7 | OneFormer (DiNAT-L, single-scale) | 49.2 | No | OneFormer: One Transformer to Rule Universal Ima... | 2022-11-10 | Code |
| 8 | OneFormer (Swin-L, single-scale) | 49 | No | OneFormer: One Transformer to Rule Universal Ima... | 2022-11-10 | Code |
| 9 | ViT-Adapter-L (single-scale, BEiTv2 pretrain, Mask2Former) | 48.9 | No | Vision Transformer Adapter for Dense Predictions | 2022-05-17 | Code |
| 10 | Mask2Former (single-scale) | 48.6 | No | Masked-attention Mask Transformer for Universal ... | 2021-12-02 | Code |
| 11 | FocalNet-L (Mask2Former (200 queries)) | 48.4 | No | Focal Modulation Networks | 2022-03-22 | Code |
| 12 | PanopticFPN++ | 39.7 | No | End-to-End Object Detection with Transformers | 2020-05-26 | Code |
| 13 | DETR-R101 (ResNet-101) | 33 | No | End-to-End Object Detection with Transformers | 2020-05-26 | Code |