Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

SAN

Reported on 101 benchmarks across 20 tasks · 6 papers · 30 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.
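As a rough illustration of what exact-name matching implies, the sketch below groups a few paper titles under their model name using a plain, case-sensitive string key. The record layout, the function name `group_by_exact_name`, and the `SAN-L` row are assumptions made for this example and are not taken from the site's actual implementation; the four `SAN` titles are ones listed on this page.

```python
from collections import defaultdict

# Illustrative records as (model_name, paper_title) pairs.
# The "SAN" titles appear on this page; the "SAN-L" row is a hypothetical
# variant added only to show that near-matches are kept separate.
RECORDS = [
    ("SAN", "Side Adapter Network for Open-Vocabulary Semantic Segmentation"),
    ("SAN", "Syntax-Aware Network for Handwritten Mathematical Expression Recognition"),
    ("SAN", "Style Aggregated Network for Facial Landmark Detection"),
    ("SAN", "Stacked Attention Networks for Image Question Answering"),
    ("SAN-L", "Hypothetical paper with a differently named variant"),
]


def group_by_exact_name(records):
    """Group paper titles under the exact, case-sensitive model name."""
    groups = defaultdict(set)
    for model_name, paper_title in records:
        groups[model_name].add(paper_title)
    return groups


groups = group_by_exact_name(RECORDS)
for title in sorted(groups["SAN"]):
    print(title)
```

Under this kind of matching, unrelated models that share the abbreviation end up in one group, while a variant with even a slightly different name is listed separately.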

Computer Vision · 49 results

Zero Shot Segmentation on Segmentation in the Wild
Mean AP · 2023-02-23
41.4
best: 49.6 (Grounded HQ-SAM)
SOTA
Side Adapter Network for Open-Vocabulary Semantic Segmentation arXiv:2302.12242
Open Vocabulary Semantic Segmentation on ADE20K-847
mIoU · 2023-02-23
13.7
best: 17.3 (UMG-CLIP-E/14)
SOTA
Side Adapter Network for Open-Vocabulary Semantic Segmentation arXiv:2302.12242
Unsupervised Semantic Segmentation on COCO-Stuff-3
Pixel Accuracy · 2022-11-26
80.3
SOTA
Rethinking Alignment and Uniformity in Unsupervised Semantic Segmentation arXiv:2211.14513
Handwritten Mathematical Expression Recognition on CROHME 2016
ExpRate · 2022-03-03
53.6
best: 77.94 (Uni-MuMER)
SOTA
Syntax-Aware Network for Handwritten Mathematical Expression Recognition arXiv:2203.01601
Handwritten Mathematical Expression Recognition on HME100K
ExpRate · 2022-03-03
67.1
best: 69.51 (PosFormer)
SOTA
Syntax-Aware Network for Handwritten Mathematical Expression Recognition arXiv:2203.01601
Facial Landmark Detection on AFLW-Front
Mean NME · 2018-03-12
1.85
SOTA
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
Facial Landmark Detection on AFLW-Full
Mean NME · 2018-03-12
1.91
best: 2.17 (DCFE (Box height Norm, 19 landmarks - no earlobs))
SOTA
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
Face Reconstruction on AFLW-19
AUC_box@0.07 (%, Full) · 2018-03-12
54
best: 81.8 (FiFA)
SOTA
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
Face Reconstruction on AFLW-19
NME_box (%, Full) · 2018-03-12
4.04
SOTA
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
Face Reconstruction on AFLW-19
NME_diag (%, Frontal) · 2018-03-12
1.85
best: 2.68 (CFSS)
SOTA
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
Face Reconstruction on AFLW-Front
Mean NME · 2018-03-12
1.85
SOTA
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
3D Face Reconstruction on AFLW-19
AUC_box@0.07 (%, Full) · 2018-03-12
54
best: 81.8 (FiFA)
SOTA
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
3D Face Reconstruction on AFLW-19
NME_box (%, Full) · 2018-03-12
4.04
SOTA
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
3D Face Reconstruction on AFLW-19
NME_diag (%, Frontal) · 2018-03-12
1.85
best: 2.68 (CFSS)
SOTA
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
3D Face Reconstruction on AFLW-Front
Mean NME · 2018-03-12
1.85
SOTA
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
Universal Domain Adaptation on DomainNet
H-Score · 2022-11-26
52
best: 73.28 (TASC)
Rethinking Alignment and Uniformity in Unsupervised Semantic Segmentation arXiv:2211.14513
Unsupervised Semantic Segmentation on COCO-Stuff-27
Clustering [Accuracy] · 2022-11-26
52
best: 81.1 (DynaSeg - FSF (ResNet-18 FPN))
Rethinking Alignment and Uniformity in Unsupervised Semantic Segmentation arXiv:2211.14513
Handwritten Mathematical Expression Recognition on CROHME 2019
ExpRate · 2022-03-03
53.5
best: 79.23 (Uni-MuMER)
Syntax-Aware Network for Handwritten Mathematical Expression Recognition arXiv:2203.01601
Handwritten Mathematical Expression Recognition on CROHME 2014
ExpRate · 2022-03-03
56.2
best: 82.05 (Uni-MuMER)
Syntax-Aware Network for Handwritten Mathematical Expression Recognition arXiv:2203.01601
Sign Language Recognition on RWTH-PHOENIX-Weather 2014
Word Error Rate (WER) · 2021-01-12
29.7
best: 18.3 (SlowFastSign)
Context Matters: Self-Attention for Sign Language Recognition arXiv:2101.04632
Face Reconstruction on 300W
NME_inter-ocular (%, Challenge) · 2018-03-12
6.6
best: 8.2 (ASMNet)
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
Face Reconstruction on 300W
NME_inter-ocular (%, Common) · 2018-03-12
3.34
best: 5.09 (3DDFA)
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
Face Reconstruction on 300W
NME_inter-ocular (%, Full) · 2018-03-12
3.98
best: 5.63 (3DDFA)
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
Face Reconstruction on AFLW-19
NME_diag (%, Full) · 2018-03-12
1.91
best: 3.92 (CFSS)
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
Face Reconstruction on AFLW-Full
Mean NME · 2018-03-12
1.91
best: 2.85 (Binary Face Alignment)
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
3D Face Reconstruction on AFLW-19
NME_diag (%, Full) · 2018-03-12
1.91
best: 3.92 (CFSS)
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
3D Face Reconstruction on 300W
NME_inter-ocular (%, Challenge) · 2018-03-12
6.6
best: 8.2 (ASMNet)
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
3D Face Reconstruction on 300W
NME_inter-ocular (%, Common) · 2018-03-12
3.34
best: 5.09 (3DDFA)
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
3D Face Reconstruction on 300W
NME_inter-ocular (%, Full) · 2018-03-12
3.98
best: 5.63 (3DDFA)
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
3D Face Reconstruction on AFLW-Full
Mean NME · 2018-03-12
1.91
best: 2.85 (Binary Face Alignment)
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
Image Super-Resolution on Set14 - 4x upscaling
PSNR
29.05
best: 29.54 (DRCT-L)
Image Super-Resolution on Set14 - 4x upscaling
SSIM
0.7921
best: 0.894 (Edge-informed SR)
Image Super-Resolution on Manga109 - 4x upscaling
PSNR
31.66
best: 33.19 (HMA†)
Image Super-Resolution on Manga109 - 4x upscaling
SSIM
0.9222
best: 0.9366 (Hi-IR-L)
Image Super-Resolution on Urban100 - 4x upscaling
PSNR
27.23
best: 28.72 (Hi-IR-L)
Image Super-Resolution on Urban100 - 4x upscaling
SSIM
0.8169
best: 0.9481 (SPSR)
Image Super-Resolution on BSD100 - 4x upscaling
PSNR
27.86
best: 28.16 (DRCT-L)
Image Super-Resolution on BSD100 - 4x upscaling
SSIM
0.7457
best: 0.851 (Edge-informed SR)
Universal Domain Adaptation on Office-31
H-score
91.8
best: 95.95 (UniAM)
Universal Domain Adaptation on Office-Home
H-Score
75.9
best: 89.44 (TASC)
Universal Domain Adaptation on VisDA2017
H-score
60.1
best: 90.36 (TASC)
3D Object Super-Resolution on Set14 - 4x upscaling
PSNR
29.05
best: 29.54 (DRCT-L)
3D Object Super-Resolution on Set14 - 4x upscaling
SSIM
0.7921
best: 0.894 (Edge-informed SR)
3D Object Super-Resolution on Manga109 - 4x upscaling
PSNR
31.66
best: 33.19 (HMA†)
3D Object Super-Resolution on Manga109 - 4x upscaling
SSIM
0.9222
best: 0.9366 (Hi-IR-L)
3D Object Super-Resolution on Urban100 - 4x upscaling
PSNR
27.23
best: 28.72 (Hi-IR-L)
3D Object Super-Resolution on Urban100 - 4x upscaling
SSIM
0.8169
best: 0.9481 (SPSR)
3D Object Super-Resolution on BSD100 - 4x upscaling
PSNR
27.86
best: 28.16 (DRCT-L)
3D Object Super-Resolution on BSD100 - 4x upscaling
SSIM
0.7457
best: 0.851 (Edge-informed SR)

Methodology · 21 results

3D on AFLW-19
AUC_box@0.07 (%, Full) · 2018-03-12
54
best: 81.8 (FiFA)
SOTA
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
3D on AFLW-19
NME_box (%, Full) · 2018-03-12
4.04
SOTA
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
3D on AFLW-19
NME_diag (%, Frontal) · 2018-03-12
1.85
best: 2.68 (CFSS)
SOTA
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
3D on AFLW-Front
Mean NME · 2018-03-12
1.85
SOTA
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
Domain Adaptation on DomainNet
H-Score · 2022-11-26
52
best: 73.28 (TASC)
Rethinking Alignment and Uniformity in Unsupervised Semantic Segmentation arXiv:2211.14513
3D on 300W
NME_inter-ocular (%, Challenge) · 2018-03-12
6.6
best: 8.2 (ASMNet)
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
3D on 300W
NME_inter-ocular (%, Common) · 2018-03-12
3.34
best: 5.09 (3DDFA)
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
3D on 300W
NME_inter-ocular (%, Full) · 2018-03-12
3.98
best: 5.63 (3DDFA)
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
3D on AFLW-19
NME_diag (%, Full) · 2018-03-12
1.91
best: 3.92 (CFSS)
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
3D on AFLW-Full
Mean NME · 2018-03-12
1.91
best: 2.85 (Binary Face Alignment)
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
Domain Adaptation on Office-31
H-score
91.8
best: 95.95 (UniAM)
Domain Adaptation on Office-Home
H-Score
75.9
best: 89.44 (TASC)
Domain Adaptation on VisDA2017
H-score
60.1
best: 90.36 (TASC)
16k on Set14 - 4x upscaling
PSNR
29.05
best: 29.54 (DRCT-L)
16k on Set14 - 4x upscaling
SSIM
0.7921
best: 0.894 (Edge-informed SR)
16k on Manga109 - 4x upscaling
PSNR
31.66
best: 33.19 (HMA†)
16k on Manga109 - 4x upscaling
SSIM
0.9222
best: 0.9366 (Hi-IR-L)
16k on Urban100 - 4x upscaling
PSNR
27.23
best: 28.72 (Hi-IR-L)
16k on Urban100 - 4x upscaling
SSIM
0.8169
best: 0.9481 (SPSR)
16k on BSD100 - 4x upscaling
PSNR
27.86
best: 28.16 (DRCT-L)
16k on BSD100 - 4x upscaling
SSIM
0.7457
best: 0.851 (Edge-informed SR)

Medical · 11 results

Semantic Segmentation on COCO-Stuff-3
Pixel Accuracy · 2022-11-26
80.3
SOTA
Rethinking Alignment and Uniformity in Unsupervised Semantic Segmentation arXiv:2211.14513
3D Face Modelling on AFLW-19
AUC_box@0.07 (%, Full) · 2018-03-12
54
best: 81.8 (FiFA)
SOTA
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
3D Face Modelling on AFLW-19
NME_box (%, Full) · 2018-03-12
4.04
SOTA
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
3D Face Modelling on AFLW-19
NME_diag (%, Frontal) · 2018-03-12
1.85
best: 2.68 (CFSS)
SOTA
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
3D Face Modelling on AFLW-Front
Mean NME · 2018-03-12
1.85
SOTA
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
Semantic Segmentation on COCO-Stuff-27
Clustering [Accuracy] · 2022-11-26
52
best: 81.1 (DynaSeg - FSF (ResNet-18 FPN))
Rethinking Alignment and Uniformity in Unsupervised Semantic Segmentation arXiv:2211.14513
3D Face Modelling on AFLW-19
NME_diag (%, Full) · 2018-03-12
1.91
best: 3.92 (CFSS)
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
3D Face Modelling on 300W
NME_inter-ocular (%, Challenge) · 2018-03-12
6.6
best: 8.2 (ASMNet)
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
3D Face Modelling on 300W
NME_inter-ocular (%, Common) · 2018-03-12
3.34
best: 5.09 (3DDFA)
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
3D Face Modelling on 300W
NME_inter-ocular (%, Full) · 2018-03-12
3.98
best: 5.63 (3DDFA)
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
3D Face Modelling on AFLW-Full
Mean NME · 2018-03-12
1.91
best: 2.85 (Binary Face Alignment)
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108

Music · 9 results

Facial Recognition and Modelling on AFLW-19
AUC_box@0.07 (%, Full) · 2018-03-12
54
best: 81.8 (FiFA)
SOTA
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
Facial Recognition and Modelling on AFLW-19
NME_box (%, Full) · 2018-03-12
4.04
SOTA
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
Facial Recognition and Modelling on AFLW-19
NME_diag (%, Frontal) · 2018-03-12
1.85
best: 2.68 (CFSS)
SOTA
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
Facial Recognition and Modelling on AFLW-Front
Mean NME · 2018-03-12
1.85
SOTA
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
Facial Recognition and Modelling on AFLW-19
NME_diag (%, Full) · 2018-03-12
1.91
best: 3.92 (CFSS)
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
Facial Recognition and Modelling on 300W
NME_inter-ocular (%, Challenge) · 2018-03-12
6.6
best: 8.2 (ASMNet)
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
Facial Recognition and Modelling on 300W
NME_inter-ocular (%, Common) · 2018-03-12
3.34
best: 5.09 (3DDFA)
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
Facial Recognition and Modelling on 300W
NME_inter-ocular (%, Full) · 2018-03-12
3.98
best: 5.63 (3DDFA)
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108
Facial Recognition and Modelling on AFLW-Full
Mean NME · 2018-03-12
1.91
best: 2.85 (Binary Face Alignment)
Style Aggregated Network for Facial Landmark Detection arXiv:1803.04108

Graphs · 8 results

Super-Resolution on Set14 - 4x upscaling
PSNR
29.05
best: 29.54 (DRCT-L)
Super-Resolution on Set14 - 4x upscaling
SSIM
0.7921
best: 0.894 (Edge-informed SR)
Super-Resolution on Manga109 - 4x upscaling
PSNR
31.66
best: 33.19 (HMA†)
Super-Resolution on Manga109 - 4x upscaling
SSIM
0.9222
best: 0.9366 (Hi-IR-L)
Super-Resolution on Urban100 - 4x upscaling
PSNR
27.23
best: 28.72 (Hi-IR-L)
Super-Resolution on Urban100 - 4x upscaling
SSIM
0.8169
best: 0.9481 (SPSR)
Super-Resolution on BSD100 - 4x upscaling
PSNR
27.86
best: 28.16 (DRCT-L)
Super-Resolution on BSD100 - 4x upscaling
SSIM
0.7457
best: 0.851 (Edge-informed SR)

Audio · 2 results

10-shot image generation on COCO-Stuff-3
Pixel Accuracy · 2022-11-26
80.3
SOTA
Rethinking Alignment and Uniformity in Unsupervised Semantic Segmentation arXiv:2211.14513
10-shot image generation on COCO-Stuff-27
Clustering [Accuracy] · 2022-11-26
52
best: 81.1 (DynaSeg - FSF (ResNet-18 FPN))
Rethinking Alignment and Uniformity in Unsupervised Semantic Segmentation arXiv:2211.14513

Natural Language Processing · 1 result

Visual Question Answering (VQA) on COCO Visual Question Answering (VQA) real images 1.0 open ended
Percentage correct · 2015-11-07
58.9
best: 66.5 (MCB 7 att.)
SOTA
Stacked Attention Networks for Image Question Answering arXiv:1511.02274