ViT-P (OneFormer, DiNAT-L, single-scale, 1280x1280)
Reported on 4 benchmarks across 4 tasks · 1 paper
Note: results are matched by exact model name. Different papers may use the same name for different model variants.
Computer Vision (2 results)
- AP · 2025-05-26 · 37.8 · best: 44.2 (OneFormer (InternImage-H, emb_dim=1024, single-scale, 896x896, COCO-Pretrained))
- PQ · 2025-05-26 · 51.9 · best: 54.5 (OneFormer (InternImage-H, emb_dim=256, single-scale, 896x896))
Medical (1 result)
- PQ · 2025-05-26 · 51.9 · best: 54.5 (OneFormer (InternImage-H, emb_dim=256, single-scale, 896x896))
Audio (1 result)
- PQ · 2025-05-26 · 51.9 · best: 54.5 (OneFormer (InternImage-H, emb_dim=256, single-scale, 896x896))