Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

GLIPv2

Reported on 13 benchmarks across 8 tasks · 1 paper · 13 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.

Methodology · 8 results

  • 3D on LVIS v1.0 minival
    box AP · uses extra data · 2022-06-12
    59.8
    best: 72 (Co-DETR (single-scale))
    SOTA
    GLIPv2: Unifying Localization and Vision-Language Understanding · arXiv:2206.05836
  • 3D on ODinW Full-Shot 13 Tasks
    AP · 2022-06-12
    70.4
    best: 73.1 (CP-DETR-L(only optimize prompt))
    SOTA
    GLIPv2: Unifying Localization and Vision-Language Understanding · arXiv:2206.05836
  • 2D Classification on LVIS v1.0 minival
    box AP · uses extra data · 2022-06-12
    59.8
    best: 72 (Co-DETR (single-scale))
    SOTA
    GLIPv2: Unifying Localization and Vision-Language Understanding · arXiv:2206.05836
  • 2D Classification on ODinW Full-Shot 13 Tasks
    AP · 2022-06-12
    70.4
    best: 73.1 (CP-DETR-L(only optimize prompt))
    SOTA
    GLIPv2: Unifying Localization and Vision-Language Understanding · arXiv:2206.05836
  • 2D Object Detection on LVIS v1.0 minival
    box AP · uses extra data · 2022-06-12
    59.8
    best: 72 (Co-DETR (single-scale))
    SOTA
    GLIPv2: Unifying Localization and Vision-Language Understanding · arXiv:2206.05836
  • 2D Object Detection on ODinW Full-Shot 13 Tasks
    AP · 2022-06-12
    70.4
    best: 73.1 (CP-DETR-L(only optimize prompt))
    SOTA
    GLIPv2: Unifying Localization and Vision-Language Understanding · arXiv:2206.05836
  • 16k on LVIS v1.0 minival
    box AP · uses extra data · 2022-06-12
    59.8
    best: 72 (Co-DETR (single-scale))
    SOTA
    GLIPv2: Unifying Localization and Vision-Language Understanding · arXiv:2206.05836
  • 16k on ODinW Full-Shot 13 Tasks
    AP · 2022-06-12
    70.4
    best: 73.1 (CP-DETR-L(only optimize prompt))
    SOTA
    GLIPv2: Unifying Localization and Vision-Language Understanding · arXiv:2206.05836

Computer Vision · 4 results

  • Object Detection on LVIS v1.0 minival
    box AP · uses extra data · 2022-06-12
    59.8
    best: 72 (Co-DETR (single-scale))
    SOTA
    GLIPv2: Unifying Localization and Vision-Language Understanding · arXiv:2206.05836
  • Object Detection on ODinW Full-Shot 13 Tasks
    AP · 2022-06-12
    70.4
    best: 73.1 (CP-DETR-L(only optimize prompt))
    SOTA
    GLIPv2: Unifying Localization and Vision-Language Understanding · arXiv:2206.05836
  • Instance Segmentation on PhraseCut
    Mean IoU · 2022-06-12
    61.3
    SOTA
    GLIPv2: Unifying Localization and Vision-Language Understanding · arXiv:2206.05836
  • Referring Expression Segmentation on PhraseCut
    Mean IoU · 2022-06-12
    61.3
    SOTA
    GLIPv2: Unifying Localization and Vision-Language Understanding · arXiv:2206.05836

Natural Language Processing · 1 result

  • Phrase Grounding on Flickr30k Entities Test
    R@1 · uses extra data · 2022-06-12
    87.7
    SOTA
    GLIPv2: Unifying Localization and Vision-Language Understanding · arXiv:2206.05836