Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

VATEX

Reported on 26 benchmarks across 4 tasks · 1 paper · 18 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.
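As a rough illustration of the note above, the sketch below groups result records by exact, case-sensitive model name and flags names that appear under more than one paper. The record layout, the function names, and the third row (a made-up entry from a hypothetical second paper) are assumptions for illustration only; this is not the site's actual matching code.

```python
from collections import defaultdict

# Illustrative records: (model_name, paper_id, benchmark, metric, value).
# The first two rows come from this page; the third is a made-up entry from a
# hypothetical second paper that happens to reuse the name "VATEX".
RESULTS = [
    ("VATEX", "arXiv:2404.08590", "RefCOCO testA", "mIoU", 79.64),
    ("VATEX", "arXiv:2404.08590", "RefCOCO testB", "mIoU", 75.64),
    ("VATEX", "hypothetical-other-paper", "made-up benchmark", "score", 0.0),
]


def group_by_exact_name(records):
    """Group records by model name using exact, case-sensitive string equality."""
    groups = defaultdict(list)
    for record in records:
        groups[record[0]].append(record)
    return dict(groups)


def ambiguous_names(records):
    """Return names attached to results from more than one paper."""
    papers = defaultdict(set)
    for name, paper_id, *_ in records:
        papers[name].add(paper_id)
    return {name for name, ids in papers.items() if len(ids) > 1}


groups = group_by_exact_name(RESULTS)
print(len(groups["VATEX"]))      # all three rows land under the same name
print(ambiguous_names(RESULTS))  # names worth checking against the source paper
```

Under exact matching, a differently cased name such as "vatex" would form its own group, while two unrelated models that share a name would be merged, which is the caveat the note describes.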

Computer Vision: 26 results

  • Instance Segmentation on RefCOCO testA
    mIoU · 2024-04-12
    79.64
    SOTA
    Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding (arXiv:2404.08590)
  • Instance Segmentation on RefCoCo val
    mIoU · 2024-04-12
    78.16
    SOTA
    Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding (arXiv:2404.08590)
  • Instance Segmentation on RefCOCO testB
    mIoU · 2024-04-12
    75.64
    SOTA
    Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding (arXiv:2404.08590)
  • Instance Segmentation on RefCOCOg-test
    mIoU · 2024-04-12
    70.58
    SOTA
    Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding (arXiv:2404.08590)
  • Instance Segmentation on RefCOCO+ test B
    mIoU · 2024-04-12
    62.52
    SOTA
    Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding (arXiv:2404.08590)
  • Instance Segmentation on DAVIS 2017 (val)
    J&F score · 2024-04-12
    65.4
    SOTA
    Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding (arXiv:2404.08590)
  • Instance Segmentation on RefCOCO+ testA
    mIoU · 2024-04-12
    74.41
    SOTA
    Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding (arXiv:2404.08590)
  • Instance Segmentation on RefCOCOg-val
    IoU · 2024-04-12
    0.7554
    SOTA
    Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding (arXiv:2404.08590)
  • Instance Segmentation on RefCOCOg-val
    mIoU · 2024-04-12
    69.73
    SOTA
    Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding (arXiv:2404.08590)
  • Referring Expression Segmentation on RefCOCO testA
    mIoU · 2024-04-12
    79.64
    SOTA
    Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding (arXiv:2404.08590)
  • Referring Expression Segmentation on RefCoCo val
    mIoU · 2024-04-12
    78.16
    SOTA
    Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding (arXiv:2404.08590)
  • Referring Expression Segmentation on RefCOCO testB
    mIoU · 2024-04-12
    75.64
    SOTA
    Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding (arXiv:2404.08590)
  • Referring Expression Segmentation on RefCOCOg-test
    mIoU · 2024-04-12
    70.58
    SOTA
    Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding (arXiv:2404.08590)
  • Referring Expression Segmentation on RefCOCO+ test B
    mIoU · 2024-04-12
    62.52
    SOTA
    Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding (arXiv:2404.08590)
  • Referring Expression Segmentation on DAVIS 2017 (val)
    J&F score · 2024-04-12
    65.4
    SOTA
    Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding (arXiv:2404.08590)
  • Referring Expression Segmentation on RefCOCO+ testA
    mIoU · 2024-04-12
    74.41
    SOTA
    Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding (arXiv:2404.08590)
  • Referring Expression Segmentation on RefCOCOg-val
    IoU · 2024-04-12
    0.7554
    SOTA
    Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding (arXiv:2404.08590)
  • Referring Expression Segmentation on RefCOCOg-val
    mIoU · 2024-04-12
    69.73
    SOTA
    Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding (arXiv:2404.08590)
  • Video on Refer-YouTube-VOS
    F · 2024-04-12
    67.5
    best: 75.7 (FindTrack)
    Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding (arXiv:2404.08590)
  • Video on Refer-YouTube-VOS
    J · 2024-04-12
    63.3
    best: 71.8 (FindTrack)
    Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding (arXiv:2404.08590)
  • Video on Refer-YouTube-VOS
    J&F · 2024-04-12
    65.4
    best: 73.7 (FindTrack)
    Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding (arXiv:2404.08590)
  • Instance Segmentation on RefCOCO+ val
    Mean IoU · 2024-04-12
    70.02
    best: 81.28 (DeRIS-L)
    Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding (arXiv:2404.08590)
  • Video Object Segmentation on Refer-YouTube-VOS
    F · 2024-04-12
    67.5
    best: 75.7 (FindTrack)
    Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding (arXiv:2404.08590)
  • Video Object Segmentation on Refer-YouTube-VOS
    J · 2024-04-12
    63.3
    best: 71.8 (FindTrack)
    Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding (arXiv:2404.08590)
  • Video Object Segmentation on Refer-YouTube-VOS
    J&F · 2024-04-12
    65.4
    best: 73.7 (FindTrack)
    Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding (arXiv:2404.08590)
  • Referring Expression Segmentation on RefCOCO+ val
    Mean IoU · 2024-04-12
    70.02
    best: 81.28 (DeRIS-L)
    Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding (arXiv:2404.08590)