Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

SST

Reported on 45 benchmarks across 10 tasks · 4 papers · 21 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.
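
Because matching is by exact name, one model page can pool results from unrelated papers. The sketch below is a minimal illustration of that effect, assuming a hypothetical in-memory row format of (model name, source paper title, arXiv id); the rows are the four papers cited on this page, and the helper names are illustrative, not part of any site API.

```python
from collections import defaultdict

# Hypothetical row format: (model name, source paper title, arXiv id).
# The four rows are the papers cited on this page under the name "SST".
ROWS = [
    ("SST", "Structured Semantic Transfer for Multi-Label Recognition with Partial Labels", "2112.10941"),
    ("SST", "Embracing Single Stride 3D Object Detector with Sparse Transformer", "2112.06375"),
    ("SST", "BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation", "2208.01159"),
    ("SST", "SSTVOS: Sparse Spatiotemporal Transformers for Video Object Segmentation", "2101.08833"),
]


def papers_by_model_name(rows):
    """Group arXiv ids by exact (case-sensitive) model name."""
    grouped = defaultdict(set)
    for name, _title, arxiv_id in rows:
        grouped[name].add(arxiv_id)
    return grouped


def ambiguous_names(rows):
    """Return the names whose results come from more than one paper."""
    return {
        name: sorted(ids)
        for name, ids in papers_by_model_name(rows).items()
        if len(ids) > 1
    }


if __name__ == "__main__":
    for name, ids in ambiguous_names(ROWS).items():
        print(f"{name}: {len(ids)} papers -> {', '.join(ids)}")
```

A name that maps to several arXiv ids is a hint to check the source paper of each result before comparing scores.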

Computer Vision · 30 results

  • Multi-Label Image Classification on MS-COCO-2014
    Average mAP · 2021-12-21
    76.7
    best: 83.6 (DualCoOp+TaI-DPT)
    SOTA
    Structured Semantic Transfer for Multi-Label Recognition with Partial Labels · arXiv:2112.10941
  • Multi-Label Image Classification on PASCAL VOC 2007
    Average mAP · 2021-12-21
    90.4
    best: 94.8 (DualCoOp+TaI-DPT)
    SOTA
    Structured Semantic Transfer for Multi-Label Recognition with Partial Labels · arXiv:2112.10941
  • Multi-Label Image Classification on Visual Genome
    Average mAP · 2021-12-21
    41.8
    best: 46 (DSRB)
    SOTA
    Structured Semantic Transfer for Multi-Label Recognition with Partial Labels · arXiv:2112.10941
  • Image Classification on MS-COCO-2014
    Average mAP · 2021-12-21
    76.7
    best: 83.6 (DualCoOp+TaI-DPT)
    SOTA
    Structured Semantic Transfer for Multi-Label Recognition with Partial Labels · arXiv:2112.10941
  • Image Classification on PASCAL VOC 2007
    Average mAP · 2021-12-21
    90.4
    best: 94.8 (DualCoOp+TaI-DPT)
    SOTA
    Structured Semantic Transfer for Multi-Label Recognition with Partial Labels · arXiv:2112.10941
  • Image Classification on Visual Genome
    Average mAP · 2021-12-21
    41.8
    best: 46 (DSRB)
    SOTA
    Structured Semantic Transfer for Multi-Label Recognition with Partial Labels · arXiv:2112.10941
  • Object Detection on waymo cyclist
    APH/L2 · 2021-12-13
    72.17
    best: 78 (DSVT(val))
    SOTA
    Embracing Single Stride 3D Object Detector with Sparse Transformer · arXiv:2112.06375
  • Object Detection on waymo pedestrian
    APH/L2 · 2021-12-13
    73.51
    best: 76.4 (DSVT(val))
    SOTA
    Embracing Single Stride 3D Object Detector with Sparse Transformer · arXiv:2112.06375
  • 3D Object Detection on waymo cyclist
    APH/L2 · 2021-12-13
    72.17
    best: 78 (DSVT(val))
    SOTA
    Embracing Single Stride 3D Object Detector with Sparse Transformer · arXiv:2112.06375
  • 3D Object Detection on waymo pedestrian
    APH/L2 · 2021-12-13
    73.51
    best: 76.4 (DSVT(val))
    SOTA
    Embracing Single Stride 3D Object Detector with Sparse Transformer · arXiv:2112.06375
  • Video on YouTube-VOS 2018
    Jaccard (Seen) · 2022-08-01
    81.2
    best: 86.6 (Cutie+ (base, MEGA))
    BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation · arXiv:2208.01159
  • Video on YouTube-VOS 2018
    Jaccard (Unseen) · 2022-08-01
    76
    best: 82.2 (Cutie+ (base, MEGA))
    BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation · arXiv:2208.01159
  • Video on YouTube-VOS 2018
    Mean Jaccard & F-Measure · 2022-08-01
    81.7
    best: 86.9 (XMem (BL30K, MS))
    BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation · arXiv:2208.01159
  • Video on DAVIS 2017 (val)
    F-measure · 2022-08-01
    85.1
    best: 92.6 (XMem (BLK30K, MS))
    BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation · arXiv:2208.01159
  • Video on DAVIS 2017 (val)
    Jaccard · 2022-08-01
    79.9
    best: 86.3 (XMem (BLK30K, MS))
    BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation · arXiv:2208.01159
  • Video on DAVIS 2017 (val)
    Mean Jaccard & F-Measure · 2022-08-01
    82.5
    best: 89.5 (XMem (BLK30K, MS))
    BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation · arXiv:2208.01159
  • Video Object Segmentation on YouTube-VOS 2018
    Jaccard (Seen) · 2022-08-01
    81.2
    best: 86.6 (Cutie+ (base, MEGA))
    BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation · arXiv:2208.01159
  • Video Object Segmentation on YouTube-VOS 2018
    Jaccard (Unseen) · 2022-08-01
    76
    best: 82.2 (Cutie+ (base, MEGA))
    BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation · arXiv:2208.01159
  • Video Object Segmentation on YouTube-VOS 2018
    Mean Jaccard & F-Measure · 2022-08-01
    81.7
    best: 86.9 (XMem (BL30K, MS))
    BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation · arXiv:2208.01159
  • Video Object Segmentation on DAVIS 2017 (val)
    F-measure · 2022-08-01
    85.1
    best: 92.6 (XMem (BLK30K, MS))
    BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation · arXiv:2208.01159
  • Video Object Segmentation on DAVIS 2017 (val)
    Jaccard · 2022-08-01
    79.9
    best: 86.3 (XMem (BLK30K, MS))
    BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation · arXiv:2208.01159
  • Video Object Segmentation on DAVIS 2017 (val)
    Mean Jaccard & F-Measure · 2022-08-01
    82.5
    best: 89.5 (XMem (BLK30K, MS))
    BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation · arXiv:2208.01159
  • Object Detection on waymo vehicle
    APH/L2 · 2021-12-13
    72.74
    best: 75.76 (PillarNeXt)
    Embracing Single Stride 3D Object Detector with Sparse Transformer · arXiv:2112.06375
  • 3D Object Detection on waymo vehicle
    APH/L2 · 2021-12-13
    72.74
    best: 75.76 (PillarNeXt)
    Embracing Single Stride 3D Object Detector with Sparse Transformer · arXiv:2112.06375
  • Video on YouTube-VOS 2019
    Jaccard (Seen) · 2021-01-21
    80.9
    best: 86.3 (Cutie+ (base, MEGA))
    SSTVOS: Sparse Spatiotemporal Transformers for Video Object Segmentation · arXiv:2101.08833
  • Video on YouTube-VOS 2019
    Jaccard (Unseen) · 2021-01-21
    76.6
    best: 754.8 (R50-AOST (L'=1))
    SSTVOS: Sparse Spatiotemporal Transformers for Video Object Segmentation · arXiv:2101.08833
  • Video on YouTube-VOS 2019
    Mean Jaccard & F-Measure · 2021-01-21
    81.8
    best: 86.8 (XMem (BL30K,MS))
    SSTVOS: Sparse Spatiotemporal Transformers for Video Object Segmentation · arXiv:2101.08833
  • Video Object Segmentation on YouTube-VOS 2019
    Jaccard (Seen) · 2021-01-21
    80.9
    best: 86.3 (Cutie+ (base, MEGA))
    SSTVOS: Sparse Spatiotemporal Transformers for Video Object Segmentation · arXiv:2101.08833
  • Video Object Segmentation on YouTube-VOS 2019
    Jaccard (Unseen) · 2021-01-21
    76.6
    best: 754.8 (R50-AOST (L'=1))
    SSTVOS: Sparse Spatiotemporal Transformers for Video Object Segmentation · arXiv:2101.08833
  • Video Object Segmentation on YouTube-VOS 2019
    Mean Jaccard & F-Measure · 2021-01-21
    81.8
    best: 86.8 (XMem (BL30K,MS))
    SSTVOS: Sparse Spatiotemporal Transformers for Video Object Segmentation · arXiv:2101.08833
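
Each entry above pairs the SST score with the current best on the same benchmark. As a rough sketch, the gap can be computed as below, assuming every metric listed here (mAP, APH, Jaccard, F-measure) is reported as a percentage between 0 and 100; under that assumption a value such as 754.8 cannot be a valid score, so the sketch reports no gap for it rather than guessing a correction. The row format and function name are hypothetical.

```python
# Hypothetical row format: (benchmark, metric, SST score, best score, best model).
# Values are copied from three entries in the list above.
ROWS = [
    ("MS-COCO-2014", "Average mAP", 76.7, 83.6, "DualCoOp+TaI-DPT"),
    ("waymo vehicle", "APH/L2", 72.74, 75.76, "PillarNeXt"),
    ("YouTube-VOS 2019", "Jaccard (Unseen)", 76.6, 754.8, "R50-AOST (L'=1)"),
]


def gap_to_best(score, best):
    """Return best - score in points, or None if either value is not a valid percentage.

    Assumption: all metrics are percentages in the closed range [0, 100].
    """
    if not (0.0 <= score <= 100.0 and 0.0 <= best <= 100.0):
        return None
    return round(best - score, 2)


if __name__ == "__main__":
    for benchmark, metric, score, best, best_model in ROWS:
        gap = gap_to_best(score, best)
        if gap is None:
            print(f"{benchmark} / {metric}: value out of range, check the source")
        else:
            print(f"{benchmark} / {metric}: {gap} points behind {best_model}")
```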

Methodology · 15 results

  • 2D Classification on MS-COCO-2014
    Average mAP · 2021-12-21
    76.7
    best: 83.6 (DualCoOp+TaI-DPT)
    SOTA
    Structured Semantic Transfer for Multi-Label Recognition with Partial Labels · arXiv:2112.10941
  • 2D Classification on PASCAL VOC 2007
    Average mAP · 2021-12-21
    90.4
    best: 94.8 (DualCoOp+TaI-DPT)
    SOTA
    Structured Semantic Transfer for Multi-Label Recognition with Partial Labels · arXiv:2112.10941
  • 2D Classification on Visual Genome
    Average mAP · 2021-12-21
    41.8
    best: 46 (DSRB)
    SOTA
    Structured Semantic Transfer for Multi-Label Recognition with Partial Labels · arXiv:2112.10941
  • 3D on waymo cyclist
    APH/L2 · 2021-12-13
    72.17
    best: 78 (DSVT(val))
    SOTA
    Embracing Single Stride 3D Object Detector with Sparse Transformer · arXiv:2112.06375
  • 3D on waymo pedestrian
    APH/L2 · 2021-12-13
    73.51
    best: 76.4 (DSVT(val))
    SOTA
    Embracing Single Stride 3D Object Detector with Sparse Transformer · arXiv:2112.06375
  • 2D Classification on waymo cyclist
    APH/L2 · 2021-12-13
    72.17
    best: 78 (DSVT(val))
    SOTA
    Embracing Single Stride 3D Object Detector with Sparse Transformer · arXiv:2112.06375
  • 2D Classification on waymo pedestrian
    APH/L2 · 2021-12-13
    73.51
    best: 76.4 (DSVT(val))
    SOTA
    Embracing Single Stride 3D Object Detector with Sparse Transformer · arXiv:2112.06375
  • 2D Object Detection on waymo cyclist
    APH/L2 · 2021-12-13
    72.17
    best: 78 (DSVT(val))
    SOTA
    Embracing Single Stride 3D Object Detector with Sparse Transformer · arXiv:2112.06375
  • 2D Object Detection on waymo pedestrian
    APH/L2 · 2021-12-13
    73.51
    best: 76.4 (DSVT(val))
    SOTA
    Embracing Single Stride 3D Object Detector with Sparse Transformer · arXiv:2112.06375
  • 16k on waymo cyclist
    APH/L2 · 2021-12-13
    72.17
    best: 78 (DSVT(val))
    SOTA
    Embracing Single Stride 3D Object Detector with Sparse Transformer · arXiv:2112.06375
  • 16k on waymo pedestrian
    APH/L2 · 2021-12-13
    73.51
    best: 76.4 (DSVT(val))
    SOTA
    Embracing Single Stride 3D Object Detector with Sparse Transformer · arXiv:2112.06375
  • 3D on waymo vehicle
    APH/L2 · 2021-12-13
    72.74
    best: 75.76 (PillarNeXt)
    Embracing Single Stride 3D Object Detector with Sparse Transformer · arXiv:2112.06375
  • 2D Classification on waymo vehicle
    APH/L2 · 2021-12-13
    72.74
    best: 75.76 (PillarNeXt)
    Embracing Single Stride 3D Object Detector with Sparse Transformer · arXiv:2112.06375
  • 2D Object Detection on waymo vehicle
    APH/L2 · 2021-12-13
    72.74
    best: 75.76 (PillarNeXt)
    Embracing Single Stride 3D Object Detector with Sparse Transformer · arXiv:2112.06375
  • 16k on waymo vehicle
    APH/L2 · 2021-12-13
    72.74
    best: 75.76 (PillarNeXt)
    Embracing Single Stride 3D Object Detector with Sparse Transformer · arXiv:2112.06375