Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

SAN

Reported on 101 benchmarks across 20 tasks · 6 papers · 30 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.
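The effect of exact-name matching can be sketched in a few lines. This is only an illustration under stated assumptions: the `results` list and the `papers_by_model` helper are hypothetical, the paper titles are copied from the listing on this page, and nothing here describes how the site itself implements the match.

```python
from collections import defaultdict

# Hypothetical (model_name, paper_title) records; titles are taken from this page.
results = [
    ("SAN", "Side Adapter Network for Open-Vocabulary Semantic Segmentation"),
    ("SAN", "Syntax-Aware Network for Handwritten Mathematical Expression Recognition"),
    ("SAN", "Style Aggregated Network for Facial Landmark Detection"),
    ("SAN", "Stacked Attention Networks for Image Question Answering"),
    ("SAN", "Style Aggregated Network for Facial Landmark Detection"),
]


def papers_by_model(records):
    """Group paper titles under the exact (case-sensitive) model name."""
    grouped = defaultdict(set)
    for name, title in records:
        grouped[name].add(title)
    return dict(grouped)


grouped = papers_by_model(results)
# Unrelated models share the key "SAN" because only the name string is compared.
for title in sorted(grouped["SAN"]):
    print(title)
```

Because the key is the bare string, architectures that merely share an acronym end up on one page, which is why the per-entry paper title matters when reading the results below.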

Computer Vision · 49 results

  • Zero Shot Segmentation on Segmentation in the Wild
    Mean AP · 2023-02-23
    41.4
    best: 49.6 (Grounded HQ-SAM)
    SOTA
    Side Adapter Network for Open-Vocabulary Semantic Segmentation (arXiv:2302.12242)
  • Open Vocabulary Semantic Segmentation on ADE20K-847
    mIoU · 2023-02-23
    13.7
    best: 17.3 (UMG-CLIP-E/14)
    SOTA
    Side Adapter Network for Open-Vocabulary Semantic Segmentation (arXiv:2302.12242)
  • Unsupervised Semantic Segmentation on COCO-Stuff-3
    Pixel Accuracy · 2022-11-26
    80.3
    SOTA
    Rethinking Alignment and Uniformity in Unsupervised Semantic Segmentation (arXiv:2211.14513)
  • Handwritten Mathematical Expression Recognition on CROHME 2016
    ExpRate · 2022-03-03
    53.6
    best: 77.94 (Uni-MuMER)
    SOTA
    Syntax-Aware Network for Handwritten Mathematical Expression Recognition (arXiv:2203.01601)
  • Handwritten Mathematical Expression Recognition on HME100K
    ExpRate · 2022-03-03
    67.1
    best: 69.51 (PosFormer)
    SOTA
    Syntax-Aware Network for Handwritten Mathematical Expression Recognition (arXiv:2203.01601)
  • Facial Landmark Detection on AFLW-Front
    Mean NME · 2018-03-12
    1.85
    SOTA
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • Facial Landmark Detection on AFLW-Full
    Mean NME · 2018-03-12
    1.91
    best: 2.17 (DCFE (Box height Norm, 19 landmarks - no earlobs))
    SOTA
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • Face Reconstruction on AFLW-19
    AUC_box@0.07 (%, Full) · 2018-03-12
    54
    best: 81.8 (FiFA)
    SOTA
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • Face Reconstruction on AFLW-19
    NME_box (%, Full) · 2018-03-12
    4.04
    SOTA
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • Face Reconstruction on AFLW-19
    NME_diag (%, Frontal) · 2018-03-12
    1.85
    best: 2.68 (CFSS)
    SOTA
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • Face Reconstruction on AFLW-Front
    Mean NME · 2018-03-12
    1.85
    SOTA
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • 3D Face Reconstruction on AFLW-19
    AUC_box@0.07 (%, Full) · 2018-03-12
    54
    best: 81.8 (FiFA)
    SOTA
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • 3D Face Reconstruction on AFLW-19
    NME_box (%, Full) · 2018-03-12
    4.04
    SOTA
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • 3D Face Reconstruction on AFLW-19
    NME_diag (%, Frontal) · 2018-03-12
    1.85
    best: 2.68 (CFSS)
    SOTA
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • 3D Face Reconstruction on AFLW-Front
    Mean NME · 2018-03-12
    1.85
    SOTA
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • Universal Domain Adaptation on DomainNet
    H-Score · 2022-11-26
    52
    best: 73.28 (TASC)
    Rethinking Alignment and Uniformity in Unsupervised Semantic Segmentation (arXiv:2211.14513)
  • Unsupervised Semantic Segmentation on COCO-Stuff-27
    Clustering [Accuracy] · 2022-11-26
    52
    best: 81.1 (DynaSeg - FSF (ResNet-18 FPN))
    Rethinking Alignment and Uniformity in Unsupervised Semantic Segmentation (arXiv:2211.14513)
  • Handwritten Mathematical Expression Recognition on CROHME 2019
    ExpRate · 2022-03-03
    53.5
    best: 79.23 (Uni-MuMER)
    Syntax-Aware Network for Handwritten Mathematical Expression Recognition (arXiv:2203.01601)
  • Handwritten Mathematical Expression Recognition on CROHME 2014
    ExpRate · 2022-03-03
    56.2
    best: 82.05 (Uni-MuMER)
    Syntax-Aware Network for Handwritten Mathematical Expression Recognition (arXiv:2203.01601)
  • Sign Language Recognition on RWTH-PHOENIX-Weather 2014
    Word Error Rate (WER) · 2021-01-12
    29.7
    best: 18.3 (SlowFastSign)
    Context Matters: Self-Attention for Sign Language Recognition (arXiv:2101.04632)
  • Face Reconstruction on 300W
    NME_inter-ocular (%, Challenge) · 2018-03-12
    6.6
    best: 8.2 (ASMNet)
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • Face Reconstruction on 300W
    NME_inter-ocular (%, Common) · 2018-03-12
    3.34
    best: 5.09 (3DDFA)
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • Face Reconstruction on 300W
    NME_inter-ocular (%, Full) · 2018-03-12
    3.98
    best: 5.63 (3DDFA)
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • Face Reconstruction on AFLW-19
    NME_diag (%, Full) · 2018-03-12
    1.91
    best: 3.92 (CFSS)
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • Face Reconstruction on AFLW-Full
    Mean NME · 2018-03-12
    1.91
    best: 2.85 (Binary Face Alignment)
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • 3D Face Reconstruction on AFLW-19
    NME_diag (%, Full) · 2018-03-12
    1.91
    best: 3.92 (CFSS)
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • 3D Face Reconstruction on 300W
    NME_inter-ocular (%, Challenge) · 2018-03-12
    6.6
    best: 8.2 (ASMNet)
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • 3D Face Reconstruction on 300W
    NME_inter-ocular (%, Common) · 2018-03-12
    3.34
    best: 5.09 (3DDFA)
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • 3D Face Reconstruction on 300W
    NME_inter-ocular (%, Full) · 2018-03-12
    3.98
    best: 5.63 (3DDFA)
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • 3D Face Reconstruction on AFLW-Full
    Mean NME · 2018-03-12
    1.91
    best: 2.85 (Binary Face Alignment)
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • Image Super-Resolution on Set14 - 4x upscaling
    PSNR
    29.05
    best: 29.54 (DRCT-L)
  • Image Super-Resolution on Set14 - 4x upscaling
    SSIM
    0.7921
    best: 0.894 (Edge-informed SR)
  • Image Super-Resolution on Manga109 - 4x upscaling
    PSNR
    31.66
    best: 33.19 (HMA†)
  • Image Super-Resolution on Manga109 - 4x upscaling
    SSIM
    0.9222
    best: 0.9366 (Hi-IR-L)
  • Image Super-Resolution on Urban100 - 4x upscaling
    PSNR
    27.23
    best: 28.72 (Hi-IR-L)
  • Image Super-Resolution on Urban100 - 4x upscaling
    SSIM
    0.8169
    best: 0.9481 (SPSR)
  • Image Super-Resolution on BSD100 - 4x upscaling
    PSNR
    27.86
    best: 28.16 (DRCT-L)
  • Image Super-Resolution on BSD100 - 4x upscaling
    SSIM
    0.7457
    best: 0.851 (Edge-informed SR)
  • Universal Domain Adaptation on Office-31
    H-score
    91.8
    best: 95.95 (UniAM)
  • Universal Domain Adaptation on Office-Home
    H-Score
    75.9
    best: 89.44 (TASC)
  • Universal Domain Adaptation on VisDA2017
    H-score
    60.1
    best: 90.36 (TASC)
  • 3D Object Super-Resolution on Set14 - 4x upscaling
    PSNR
    29.05
    best: 29.54 (DRCT-L)
  • 3D Object Super-Resolution on Set14 - 4x upscaling
    SSIM
    0.7921
    best: 0.894 (Edge-informed SR)
  • 3D Object Super-Resolution on Manga109 - 4x upscaling
    PSNR
    31.66
    best: 33.19 (HMA†)
  • 3D Object Super-Resolution on Manga109 - 4x upscaling
    SSIM
    0.9222
    best: 0.9366 (Hi-IR-L)
  • 3D Object Super-Resolution on Urban100 - 4x upscaling
    PSNR
    27.23
    best: 28.72 (Hi-IR-L)
  • 3D Object Super-Resolution on Urban100 - 4x upscaling
    SSIM
    0.8169
    best: 0.9481 (SPSR)
  • 3D Object Super-Resolution on BSD100 - 4x upscaling
    PSNR
    27.86
    best: 28.16 (DRCT-L)
  • 3D Object Super-Resolution on BSD100 - 4x upscaling
    SSIM
    0.7457
    best: 0.851 (Edge-informed SR)

Methodology · 21 results

  • 3D on AFLW-19
    AUC_box@0.07 (%, Full) · 2018-03-12
    54
    best: 81.8 (FiFA)
    SOTA
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • 3D on AFLW-19
    NME_box (%, Full) · 2018-03-12
    4.04
    SOTA
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • 3D on AFLW-19
    NME_diag (%, Frontal) · 2018-03-12
    1.85
    best: 2.68 (CFSS)
    SOTA
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • 3D on AFLW-Front
    Mean NME · 2018-03-12
    1.85
    SOTA
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • Domain Adaptation on DomainNet
    H-Score · 2022-11-26
    52
    best: 73.28 (TASC)
    Rethinking Alignment and Uniformity in Unsupervised Semantic Segmentation (arXiv:2211.14513)
  • 3D on 300W
    NME_inter-ocular (%, Challenge) · 2018-03-12
    6.6
    best: 8.2 (ASMNet)
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • 3D on 300W
    NME_inter-ocular (%, Common) · 2018-03-12
    3.34
    best: 5.09 (3DDFA)
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • 3D on 300W
    NME_inter-ocular (%, Full) · 2018-03-12
    3.98
    best: 5.63 (3DDFA)
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • 3D on AFLW-19
    NME_diag (%, Full) · 2018-03-12
    1.91
    best: 3.92 (CFSS)
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • 3D on AFLW-Full
    Mean NME · 2018-03-12
    1.91
    best: 2.85 (Binary Face Alignment)
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • Domain Adaptation on Office-31
    H-score
    91.8
    best: 95.95 (UniAM)
  • Domain Adaptation on Office-Home
    H-Score
    75.9
    best: 89.44 (TASC)
  • Domain Adaptation on VisDA2017
    H-score
    60.1
    best: 90.36 (TASC)
  • 16k on Set14 - 4x upscaling
    PSNR
    29.05
    best: 29.54 (DRCT-L)
  • 16k on Set14 - 4x upscaling
    SSIM
    0.7921
    best: 0.894 (Edge-informed SR)
  • 16k on Manga109 - 4x upscaling
    PSNR
    31.66
    best: 33.19 (HMA†)
  • 16k on Manga109 - 4x upscaling
    SSIM
    0.9222
    best: 0.9366 (Hi-IR-L)
  • 16k on Urban100 - 4x upscaling
    PSNR
    27.23
    best: 28.72 (Hi-IR-L)
  • 16k on Urban100 - 4x upscaling
    SSIM
    0.8169
    best: 0.9481 (SPSR)
  • 16k on BSD100 - 4x upscaling
    PSNR
    27.86
    best: 28.16 (DRCT-L)
  • 16k on BSD100 - 4x upscaling
    SSIM
    0.7457
    best: 0.851 (Edge-informed SR)

Medical · 11 results

  • Semantic Segmentation on COCO-Stuff-3
    Pixel Accuracy · 2022-11-26
    80.3
    SOTA
    Rethinking Alignment and Uniformity in Unsupervised Semantic Segmentation (arXiv:2211.14513)
  • 3D Face Modelling on AFLW-19
    AUC_box@0.07 (%, Full) · 2018-03-12
    54
    best: 81.8 (FiFA)
    SOTA
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • 3D Face Modelling on AFLW-19
    NME_box (%, Full) · 2018-03-12
    4.04
    SOTA
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • 3D Face Modelling on AFLW-19
    NME_diag (%, Frontal) · 2018-03-12
    1.85
    best: 2.68 (CFSS)
    SOTA
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • 3D Face Modelling on AFLW-Front
    Mean NME · 2018-03-12
    1.85
    SOTA
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • Semantic Segmentation on COCO-Stuff-27
    Clustering [Accuracy] · 2022-11-26
    52
    best: 81.1 (DynaSeg - FSF (ResNet-18 FPN))
    Rethinking Alignment and Uniformity in Unsupervised Semantic Segmentation (arXiv:2211.14513)
  • 3D Face Modelling on AFLW-19
    NME_diag (%, Full) · 2018-03-12
    1.91
    best: 3.92 (CFSS)
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • 3D Face Modelling on 300W
    NME_inter-ocular (%, Challenge) · 2018-03-12
    6.6
    best: 8.2 (ASMNet)
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • 3D Face Modelling on 300W
    NME_inter-ocular (%, Common) · 2018-03-12
    3.34
    best: 5.09 (3DDFA)
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • 3D Face Modelling on 300W
    NME_inter-ocular (%, Full) · 2018-03-12
    3.98
    best: 5.63 (3DDFA)
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • 3D Face Modelling on AFLW-Full
    Mean NME · 2018-03-12
    1.91
    best: 2.85 (Binary Face Alignment)
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)

Music · 9 results

  • Facial Recognition and Modelling on AFLW-19
    AUC_box@0.07 (%, Full) · 2018-03-12
    54
    best: 81.8 (FiFA)
    SOTA
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • Facial Recognition and Modelling on AFLW-19
    NME_box (%, Full) · 2018-03-12
    4.04
    SOTA
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • Facial Recognition and Modelling on AFLW-19
    NME_diag (%, Frontal) · 2018-03-12
    1.85
    best: 2.68 (CFSS)
    SOTA
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • Facial Recognition and Modelling on AFLW-Front
    Mean NME · 2018-03-12
    1.85
    SOTA
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • Facial Recognition and Modelling on AFLW-19
    NME_diag (%, Full) · 2018-03-12
    1.91
    best: 3.92 (CFSS)
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • Facial Recognition and Modelling on 300W
    NME_inter-ocular (%, Challenge) · 2018-03-12
    6.6
    best: 8.2 (ASMNet)
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • Facial Recognition and Modelling on 300W
    NME_inter-ocular (%, Common) · 2018-03-12
    3.34
    best: 5.09 (3DDFA)
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • Facial Recognition and Modelling on 300W
    NME_inter-ocular (%, Full) · 2018-03-12
    3.98
    best: 5.63 (3DDFA)
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)
  • Facial Recognition and Modelling on AFLW-Full
    Mean NME · 2018-03-12
    1.91
    best: 2.85 (Binary Face Alignment)
    Style Aggregated Network for Facial Landmark Detection (arXiv:1803.04108)

Graphs · 8 results

  • Super-Resolution on Set14 - 4x upscaling
    PSNR
    29.05
    best: 29.54 (DRCT-L)
  • Super-Resolution on Set14 - 4x upscaling
    SSIM
    0.7921
    best: 0.894 (Edge-informed SR)
  • Super-Resolution on Manga109 - 4x upscaling
    PSNR
    31.66
    best: 33.19 (HMA†)
  • Super-Resolution on Manga109 - 4x upscaling
    SSIM
    0.9222
    best: 0.9366 (Hi-IR-L)
  • Super-Resolution on Urban100 - 4x upscaling
    PSNR
    27.23
    best: 28.72 (Hi-IR-L)
  • Super-Resolution on Urban100 - 4x upscaling
    SSIM
    0.8169
    best: 0.9481 (SPSR)
  • Super-Resolution on BSD100 - 4x upscaling
    PSNR
    27.86
    best: 28.16 (DRCT-L)
  • Super-Resolution on BSD100 - 4x upscaling
    SSIM
    0.7457
    best: 0.851 (Edge-informed SR)

Audio · 2 results

  • 10-shot image generation on COCO-Stuff-3
    Pixel Accuracy · 2022-11-26
    80.3
    SOTA
    Rethinking Alignment and Uniformity in Unsupervised Semantic Segmentation (arXiv:2211.14513)
  • 10-shot image generation on COCO-Stuff-27
    Clustering [Accuracy] · 2022-11-26
    52
    best: 81.1 (DynaSeg - FSF (ResNet-18 FPN))
    Rethinking Alignment and Uniformity in Unsupervised Semantic Segmentation (arXiv:2211.14513)

Natural Language Processing · 1 result

  • Visual Question Answering (VQA) on COCO Visual Question Answering (VQA) real images 1.0 open ended
    Percentage correct · 2015-11-07
    58.9
    best: 66.5 (MCB 7 att.)
    SOTA
    Stacked Attention Networks for Image Question Answering (arXiv:1511.02274)