Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

ViTDet-L

Reported on 7 benchmarks across 6 tasks · 1 paper

Note: results are matched by exact model name. Different papers may use the same name for different model variants.
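As a rough illustration of what exact-name matching implies, the sketch below groups result records by the literal model string. The record layout, field names, and the `group_by_exact_name` helper are assumptions made for this example, not the site's actual implementation; the values are taken from the entries listed on this page.

```python
from collections import defaultdict

# Hypothetical records: field names are assumed, values come from this page.
results = [
    {"model": "ViTDet-L", "task": "Object Detection", "metric": "box AP", "value": 51.2},
    {"model": "ViTDet-L", "task": "Instance Segmentation", "metric": "mask AP", "value": 46.0},
    {"model": "ViTDet-L", "task": "Instance Segmentation", "metric": "mask APr", "value": 34.3},
    {"model": "Co-DETR (single-scale)", "task": "Object Detection", "metric": "box AP", "value": 68.0},
]


def group_by_exact_name(records):
    """Group records by the literal model string, with no normalization."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["model"]].append(record)
    return dict(grouped)


by_model = group_by_exact_name(results)

# Only the exact string "ViTDet-L" is matched.
vitdet_rows = by_model["ViTDet-L"]

# A differently cased or suffixed name would be treated as a separate model.
has_lowercase_variant = "vitdet-l" in by_model
```

Under this kind of matching, two papers that both report a model called "ViTDet-L" would be merged into one entry even if the underlying variants differ, while a renamed variant of the same model would be listed separately.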

Methodology (4 results)

  • 3D on LVIS v1.0 val
    box AP · 2022-03-30
    51.2
    best: 68 (Co-DETR (single-scale))
    Exploring Plain Vision Transformer Backbones for Object Detection, arXiv:2203.16527
  • 2D Classification on LVIS v1.0 val
    box AP · 2022-03-30
    51.2
    best: 68 (Co-DETR (single-scale))
    Exploring Plain Vision Transformer Backbones for Object Detection, arXiv:2203.16527
  • 2D Object Detection on LVIS v1.0 val
    box AP · 2022-03-30
    51.2
    best: 68 (Co-DETR (single-scale))
    Exploring Plain Vision Transformer Backbones for Object Detection, arXiv:2203.16527
  • 16k on LVIS v1.0 val
    box AP · 2022-03-30
    51.2
    best: 68 (Co-DETR (single-scale))
    Exploring Plain Vision Transformer Backbones for Object Detection, arXiv:2203.16527

Computer Vision (3 results)

  • Object Detection on LVIS v1.0 val
    box AP · 2022-03-30
    51.2
    best: 68 (Co-DETR (single-scale))
    Exploring Plain Vision Transformer Backbones for Object Detection, arXiv:2203.16527
  • Instance Segmentation on LVIS v1.0 val
    mask AP · 2022-03-30
    46
    best: 60.7 (Co-DETR (single-scale))
    Exploring Plain Vision Transformer Backbones for Object Detection, arXiv:2203.16527
  • Instance Segmentation on LVIS v1.0 val
    mask APr · 2022-03-30
    34.3
    best: 45.8 (DiverGen (Swin-L))
    Exploring Plain Vision Transformer Backbones for Object Detection, arXiv:2203.16527