ViT-P (OneFormer, DiNAT-L, single-scale, 1280x1280, COCO_pretrain)
Reported on 4 benchmarks across 4 tasks · 1 paper
Note: results are matched by exact model name. Different papers may use the same name for different model variants.
Computer Vision · 2 results
- AP · uses extra data · 2025-05-26 · 40.7 · best: 44.2 (OneFormer (InternImage-H, emb_dim=1024, single-scale, 896x896, COCO-Pretrained))
- PQ · uses extra data · 2025-05-26 · 54 · best: 54.5 (OneFormer (InternImage-H, emb_dim=256, single-scale, 896x896))
Medical · 1 result
- PQ · uses extra data · 2025-05-26 · 54 · best: 54.5 (OneFormer (InternImage-H, emb_dim=256, single-scale, 896x896))
Audio · 1 result
- PQ · uses extra data · 2025-05-26 · 54 · best: 54.5 (OneFormer (InternImage-H, emb_dim=256, single-scale, 896x896))