Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

DVIS(Swin-L)

Reported on 16 benchmarks across 4 tasks · 1 paper · 14 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.
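The matching rule in the note can be illustrated with a minimal sketch. It assumes results are stored as flat rows keyed by the literal model-name string; the row layout, the `group_by_exact_name` helper, the second paper id, and the name variant with a space are all hypothetical and are not taken from the site's actual code or data.

```python
from collections import defaultdict

# Each row: (model_name, paper_id, benchmark, metric, value).
# The first two rows come from this page; the last two are hypothetical,
# added only to show how exact-name matching behaves.
ROWS = [
    ("DVIS(Swin-L)", "arXiv:2306.03413", "YouTube-VIS 2021", "AP50", 83.0),
    ("DVIS(Swin-L)", "arXiv:2306.03413", "VIPSeg", "VPQ", 57.6),
    ("DVIS(Swin-L)", "hypothetical-other-paper", "VIPSeg", "VPQ", 50.0),
    ("DVIS (Swin-L)", "arXiv:2306.03413", "VIPSeg", "STQ", 55.3),
]


def group_by_exact_name(rows):
    """Group result rows under the literal model-name string, with no normalization."""
    groups = defaultdict(list)
    for name, paper, benchmark, metric, value in rows:
        groups[name].append((paper, benchmark, metric, value))
    return dict(groups)


groups = group_by_exact_name(ROWS)
papers = {paper for paper, _, _, _ in groups["DVIS(Swin-L)"]}
print(len(groups["DVIS(Swin-L)"]), sorted(papers))
print(sorted(groups))
```

Under this assumption, rows from two different papers end up on the same model page when the name strings are identical, while a spelling that differs by a single space lands on a separate page.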

Computer Vision · 12 results

  • Video Instance Segmentation on YouTube-VIS 2021
    AP50 · uses extra data · 2023-06-06
    83
    best: 87.3 (CAVIS(VIT-L, Offline))
    SOTA
    DVIS: Decoupled Video Instance Segmentation Framework · arXiv:2306.03413
  • Video Instance Segmentation on YouTube-VIS 2021
    AP75 · uses extra data · 2023-06-06
    68.4
    best: 73.2 (CAVIS(VIT-L, Offline))
    SOTA
    DVIS: Decoupled Video Instance Segmentation Framework · arXiv:2306.03413
  • Video Instance Segmentation on YouTube-VIS 2021
    AR10 · uses extra data · 2023-06-06
    65.7
    best: 70.7 (DVIS-DAQ(VIT-L, Offline))
    SOTA
    DVIS: Decoupled Video Instance Segmentation Framework · arXiv:2306.03413
  • Video Instance Segmentation on Youtube-VIS 2022 Validation
    AP50_L · uses extra data · 2023-06-06
    69
    best: 75.7 (DVIS++(VIT-L))
    SOTA
    DVIS: Decoupled Video Instance Segmentation Framework · arXiv:2306.03413
  • Video Instance Segmentation on Youtube-VIS 2022 Validation
    AP75_L · uses extra data · 2023-06-06
    48.8
    best: 52.8 (DVIS++(VIT-L))
    SOTA
    DVIS: Decoupled Video Instance Segmentation Framework · arXiv:2306.03413
  • Video Instance Segmentation on Youtube-VIS 2022 Validation
    AR10_L · uses extra data · 2023-06-06
    51.8
    best: 55.8 (DVIS++(VIT-L))
    SOTA
    DVIS: Decoupled Video Instance Segmentation Framework · arXiv:2306.03413
  • Video Instance Segmentation on Youtube-VIS 2022 Validation
    AR1_L · uses extra data · 2023-06-06
    37.2
    best: 40.6 (DVIS++(VIT-L))
    SOTA
    DVIS: Decoupled Video Instance Segmentation Framework · arXiv:2306.03413
  • Video Instance Segmentation on Youtube-VIS 2022 Validation
    mAP_L · uses extra data · 2023-06-06
    45.9
    best: 50.9 (DVIS++(VIT-L))
    SOTA
    DVIS: Decoupled Video Instance Segmentation Framework · arXiv:2306.03413
  • Panoptic Segmentation on VIPSeg
    STQ · 2023-06-06
    55.3
    best: 58.2 (UniVS(Swin-L))
    SOTA
    DVIS: Decoupled Video Instance Segmentation Framework · arXiv:2306.03413
  • Panoptic Segmentation on VIPSeg
    VPQ · 2023-06-06
    57.6
    best: 58.5 (CAVIS(VIT-L))
    SOTA
    DVIS: Decoupled Video Instance Segmentation Framework · arXiv:2306.03413
  • Video Instance Segmentation on YouTube-VIS 2021
    AR1 · uses extra data · 2023-06-06
    47.7
    best: 49.7 (CAVIS(VIT-L, Offline))
    DVIS: Decoupled Video Instance Segmentation Framework · arXiv:2306.03413
  • Video Instance Segmentation on YouTube-VIS 2021
    mask AP · uses extra data · 2023-06-06
    60.1
    best: 65.3 (CAVIS(VIT-L, Offline))
    DVIS: Decoupled Video Instance Segmentation Framework · arXiv:2306.03413

Medical · 2 results

  • Semantic Segmentation on VIPSeg
    STQ · 2023-06-06
    55.3
    best: 58.2 (UniVS(Swin-L))
    SOTA
    DVIS: Decoupled Video Instance Segmentation Framework · arXiv:2306.03413
  • Semantic Segmentation on VIPSeg
    VPQ · 2023-06-06
    57.6
    best: 58.5 (CAVIS(VIT-L))
    SOTA
    DVIS: Decoupled Video Instance Segmentation Framework · arXiv:2306.03413

Audio · 2 results

  • 10-shot image generation on VIPSeg
    STQ · 2023-06-06
    55.3
    best: 58.2 (UniVS(Swin-L))
    SOTA
    DVIS: Decoupled Video Instance Segmentation Framework · arXiv:2306.03413
  • 10-shot image generation on VIPSeg
    VPQ · 2023-06-06
    57.6
    best: 58.5 (CAVIS(VIT-L))
    SOTA
    DVIS: Decoupled Video Instance Segmentation Framework · arXiv:2306.03413