Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Transformer

Reported on 208 benchmarks across 26 tasks · 16 papers · 74 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.
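As a rough illustration of the note above, the sketch below groups result rows by exact model-name string. The record layout and the `group_by_exact_name` helper are hypothetical, not the site's actual matching code; the two "Transformer" rows reuse IWSLT2014 German-English values listed on this page, and the third row's paper is left as `None` because this page does not state it.

```python
from collections import defaultdict

# Hypothetical record layout: (model_name, paper, benchmark, metric, value).
records = [
    ("Transformer", "arXiv:1706.03762", "IWSLT2014 German-English", "BLEU score", 34.44),
    ("Transformer", "arXiv:2205.07260", "IWSLT2014 German-English", "BLEU score", 35.1385),
    # Paper not stated on this page for this row, so it is left as None.
    ("Transformer + BIFI", None, "GitHub-Python", "Accuracy (%)", 90.5),
]


def group_by_exact_name(rows):
    """Group rows by the exact model-name string (no normalisation, no fuzzy matching)."""
    grouped = defaultdict(list)
    for name, paper, benchmark, metric, value in rows:
        grouped[name].append((paper, benchmark, metric, value))
    return dict(grouped)


by_name = group_by_exact_name(records)

# Rows from different papers end up under the same "Transformer" key,
# even though the underlying model variants may differ.
transformer_papers = sorted({paper for paper, *_ in by_name["Transformer"]})
print(transformer_papers)
```

Under exact matching, "Transformer + BIFI" stays a separate entry, while both "Transformer" rows are merged regardless of which paper trained them.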

Time Series · 83 results

Trajectory Prediction on Vi-Fi Multi-modal Dataset
MSE-D · 2024-04-02
14.26
best: 13.42 (OOSTraj)
SOTA
OOSTraj: Out-of-Sight Trajectory Prediction With Vision-Positioning Denoising arXiv:2404.02227
Trajectory Prediction on Vi-Fi Multi-modal Dataset
MSE-P · 2024-04-02
14.08
best: 13.83 (OOSTraj)
SOTA
OOSTraj: Out-of-Sight Trajectory Prediction With Vision-Positioning Denoising arXiv:2404.02227
Trajectory Prediction on Vi-Fi Multi-modal Dataset
SUM · 2024-04-02
28.33
best: 200.9 (ViTag)
OOSTraj: Out-of-Sight Trajectory Prediction With Vision-Positioning Denoising arXiv:2404.02227
Time Series Forecasting on ETTh2 (336) Univariate
MAE · 2021-07-19
0.3805
best: 0.323 (Informer)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Forecasting on ETTh2 (336) Univariate
MSE · 2021-07-19
0.2191
best: 0.166 (PatchMixer)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Forecasting on ETTh2 (168) Univariate
MAE · 2021-07-19
0.3547
best: 0.306 (Informer)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Forecasting on ETTh2 (168) Univariate
MSE · 2021-07-19
0.1974
best: 0.154 (Informer)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Forecasting on ETTh1 (24) Multivariate
MAE · uses extra data · 2021-07-19
0.4788
best: 0.342 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Forecasting on ETTh1 (24) Multivariate
MSE · uses extra data · 2021-07-19
0.4496
best: 0.3 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Forecasting on ETTh2 (168) Multivariate
MAE · 2021-07-19
0.9726
best: 0.38 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Forecasting on ETTh2 (168) Multivariate
MSE · 2021-07-19
1.6225
best: 0.342 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Forecasting on ETTh1 (24) Univariate
MAE · 2021-07-19
0.183
best: 0.127 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Forecasting on ETTh1 (24) Univariate
MSE · 2021-07-19
0.0548
best: 0.029 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Forecasting on ETTh2 (720) Multivariate
MAE · 2021-07-19
1.3668
best: 0.418 (xPatch)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Forecasting on ETTh2 (720) Multivariate
MSE · 2021-07-19
3.1805
best: 0.372 (RLinear)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Forecasting on ETTh1 (720) Multivariate
MAE · 2021-07-19
0.8399
best: 0.447 (SegRNN)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Forecasting on ETTh1 (720) Multivariate
MSE · 2021-07-19
1.108
best: 0.409 (DiPE-Linear)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Forecasting on ETTh2 (336) Multivariate
MAE · 2021-07-19
1.2189
best: 0.36 (xPatch)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Forecasting on ETTh2 (336) Multivariate
MSE · 2021-07-19
2.6617
best: 0.312 (xPatch)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Forecasting on ETTh1 (720) Univariate
MAE · 2021-07-19
0.4213
best: 0.223 (AutoCon)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Forecasting on ETTh1 (720) Univariate
MSE · 2021-07-19
0.2501
best: 0.078 (AutoCon)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Forecasting on ETTh2 (24) Multivariate
MAE · 2021-07-19
0.5013
best: 0.263 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Forecasting on ETTh2 (24) Multivariate
MSE · 2021-07-19
0.4237
best: 0.18 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Forecasting on ETTh2 (24) Univariate
MAE · 2021-07-19
0.2479
best: 0.183 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Forecasting on ETTh2 (24) Univariate
MSE · 2021-07-19
0.0999
best: 0.065 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Forecasting on ETTh1 (336) Multivariate
MAE · 2021-07-19
0.7041
best: 0.2158 (DeformTime)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Forecasting on ETTh1 (336) Multivariate
MSE · 2021-07-19
0.8321
best: 0.374 (D-PAD)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Forecasting on ETTh1 (168) Multivariate
MAE · 2021-07-19
0.6325
best: 0.417 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Forecasting on ETTh1 (168) Multivariate
MSE · 2021-07-19
0.7146
best: 0.408 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Forecasting on ETTh1 (168) Univariate
MAE · 2021-07-19
0.2539
best: 0.21 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Forecasting on ETTh1 (168) Univariate
MSE · 2021-07-19
0.1049
best: 0.071 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Forecasting on ETTh1 (48) Multivariate
MAE · 2021-07-19
0.4968
best: 0.388 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Forecasting on ETTh1 (48) Multivariate
MSE · 2021-07-19
0.4668
best: 0.361 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Forecasting on ETTh2 (720) Univariate
MAE · 2021-07-19
0.434
best: 0.338 (Informer)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Forecasting on ETTh2 (720) Univariate
MSE · 2021-07-19
0.2853
best: 0.177 (AutoCon)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Forecasting on ETTh1 (48) Univariate
MAE · 2021-07-19
0.2144
best: 0.154 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Forecasting on ETTh1 (48) Univariate
MSE · 2021-07-19
0.074
best: 0.041 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Forecasting on ETTh1 (336) Univariate
MAE · 2021-07-19
0.3201
best: 0.215 (SegRNN)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Forecasting on ETTh1 (336) Univariate
MSE · 2021-07-19
0.1541
best: 0.073 (SegRNN)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Forecasting on ETTh2 (48) Multivariate
MAE · 2021-07-19
0.9488
best: 0.303 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Forecasting on ETTh2 (48) Multivariate
MSE · 2021-07-19
1.522
best: 0.23 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Forecasting on ETTh2 (48) Univariate
MAE · 2021-07-19
0.2763
best: 0.227 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Forecasting on ETTh2 (48) Univariate
MSE · 2021-07-19
0.1218
best: 0.093 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh2 (336) Univariate
MAE · 2021-07-19
0.3805
best: 0.323 (Informer)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh2 (336) Univariate
MSE · 2021-07-19
0.2191
best: 0.166 (PatchMixer)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh2 (168) Univariate
MAE · 2021-07-19
0.3547
best: 0.306 (Informer)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh2 (168) Univariate
MSE · 2021-07-19
0.1974
best: 0.154 (Informer)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh1 (24) Multivariate
MAE · uses extra data · 2021-07-19
0.4788
best: 0.342 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh1 (24) Multivariate
MSE · uses extra data · 2021-07-19
0.4496
best: 0.3 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh2 (168) Multivariate
MAE · 2021-07-19
0.9726
best: 0.38 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh2 (168) Multivariate
MSE · 2021-07-19
1.6225
best: 0.342 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh1 (24) Univariate
MAE · 2021-07-19
0.183
best: 0.127 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh1 (24) Univariate
MSE · 2021-07-19
0.0548
best: 0.029 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh2 (720) Multivariate
MAE · 2021-07-19
1.3668
best: 0.418 (xPatch)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh2 (720) Multivariate
MSE · 2021-07-19
3.1805
best: 0.372 (RLinear)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh1 (720) Multivariate
MAE · 2021-07-19
0.8399
best: 0.447 (SegRNN)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh1 (720) Multivariate
MSE · 2021-07-19
1.108
best: 0.409 (DiPE-Linear)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh2 (336) Multivariate
MAE · 2021-07-19
1.2189
best: 0.36 (xPatch)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh2 (336) Multivariate
MSE · 2021-07-19
2.6617
best: 0.312 (xPatch)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh1 (720) Univariate
MAE · 2021-07-19
0.4213
best: 0.223 (AutoCon)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh1 (720) Univariate
MSE · 2021-07-19
0.2501
best: 0.078 (AutoCon)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh2 (24) Multivariate
MAE · 2021-07-19
0.5013
best: 0.263 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh2 (24) Multivariate
MSE · 2021-07-19
0.4237
best: 0.18 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh2 (24) Univariate
MAE · 2021-07-19
0.2479
best: 0.183 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh2 (24) Univariate
MSE · 2021-07-19
0.0999
best: 0.065 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh1 (336) Multivariate
MAE · 2021-07-19
0.7041
best: 0.2158 (DeformTime)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh1 (336) Multivariate
MSE · 2021-07-19
0.8321
best: 0.374 (D-PAD)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh1 (168) Multivariate
MAE · 2021-07-19
0.6325
best: 0.417 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh1 (168) Multivariate
MSE · 2021-07-19
0.7146
best: 0.408 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh1 (168) Univariate
MAE · 2021-07-19
0.2539
best: 0.21 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh1 (168) Univariate
MSE · 2021-07-19
0.1049
best: 0.071 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh1 (48) Multivariate
MAE · 2021-07-19
0.4968
best: 0.388 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh1 (48) Multivariate
MSE · 2021-07-19
0.4668
best: 0.361 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh2 (720) Univariate
MAE · 2021-07-19
0.434
best: 0.338 (Informer)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh2 (720) Univariate
MSE · 2021-07-19
0.2853
best: 0.177 (AutoCon)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh1 (48) Univariate
MAE · 2021-07-19
0.2144
best: 0.154 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh1 (48) Univariate
MSE · 2021-07-19
0.074
best: 0.041 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh1 (336) Univariate
MAE · 2021-07-19
0.3201
best: 0.215 (SegRNN)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh1 (336) Univariate
MSE · 2021-07-19
0.1541
best: 0.073 (SegRNN)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh2 (48) Multivariate
MAE · 2021-07-19
0.9488
best: 0.303 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh2 (48) Multivariate
MSE · 2021-07-19
1.522
best: 0.23 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh2 (48) Univariate
MAE · 2021-07-19
0.2763
best: 0.227 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687
Time Series Analysis on ETTh2 (48) Univariate
MSE · 2021-07-19
0.1218
best: 0.093 (SCINet)
Long-term series forecasting with Query Selector -- efficient model of sparse attention arXiv:2107.08687

Natural Language Processing · 71 results

Source Code Summarization on CoDesc
BLEU-4 · 2021-05-29
45.89
SOTA
CoDesc: A Large Code-Description Parallel Dataset arXiv:2105.14220
Code Generation on CodeSearchNet - JavaScript
Smoothed BLEU-4 · 2020-02-19
25.61
SOTA
CodeBERT: A Pre-Trained Model for Programming and Natural Languages arXiv:2002.08155
Grammatical Error Correction on Falko-MERLIN
F0.5 · uses extra data · 2019-10-01
73.71
best: 76.75 (Llama + 1M BT + gold)
SOTA
Grammatical Error Correction in Low-Resource Scenarios arXiv:1910.00353
Grammatical Error Correction on Restricted
F0.5 · 2018-04-16
55.8
best: 56.52 (CNN Seq2Seq + Quality Estimation)
SOTA
Approaching Neural Grammatical Error Correction as a Low-Resource Machine Translation Task arXiv:1804.05940
Grammatical Error Correction on _Restricted_
GLEU · 2018-04-16
59.9
SOTA
Approaching Neural Grammatical Error Correction as a Low-Resource Machine Translation Task arXiv:1804.05940
Machine Translation on IWSLT2015 English-German
BLEU score · 2017-06-12
28.5
best: 30 (PS-KD)
SOTA
Attention Is All You Need arXiv:1706.03762
Machine Translation on IWSLT2014 German-English
BLEU score · 2017-06-12
34.44
best: 40.43 (PiNMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Machine Translation on Multi30K
BLEU (DE-EN) · 2017-06-12
29
best: 32.3 (PS-KD)
SOTA
Attention Is All You Need arXiv:1706.03762
Question Answering on Mathematics Dataset
Accuracy · 2017-06-12
0.76
best: 0.8192 (TP-Transformer)
SOTA
Attention Is All You Need arXiv:1706.03762
Data-to-Text Generation on LSMDC-E
BLEU-1 · 2017-06-12
15.35
best: 18.52 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Data-to-Text Generation on LSMDC-E
BLEU-3 · 2017-06-12
1.82
best: 2.51 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Data-to-Text Generation on LSMDC-E
BLEU-4 · 2017-06-12
0.76
best: 1.13 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Data-to-Text Generation on LSMDC-E
CIDEr · 2017-06-12
9.32
best: 12.41 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Data-to-Text Generation on LSMDC-E
METEOR · 2017-06-12
11.43
best: 12.87 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Data-to-Text Generation on VIST-E
BLEU-1 · 2017-06-12
17.18
best: 22.87 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Data-to-Text Generation on VIST-E
BLEU-2 · 2017-06-12
6.29
best: 8.68 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Data-to-Text Generation on VIST-E
BLEU-3 · 2017-06-12
3.07
best: 4.38 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Data-to-Text Generation on VIST-E
BLEU-4 · 2017-06-12
2.01
best: 2.61 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Data-to-Text Generation on VIST-E
CIDEr · 2017-06-12
12.75
best: 25.41 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Data-to-Text Generation on VIST-E
METEOR · 2017-06-12
6.91
best: 15.55 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Data-to-Text Generation on VIST-E
ROUGE-L · 2017-06-12
18.23
best: 23.61 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Multimodal Machine Translation on Multi30K
BLEU (DE-EN) · 2017-06-12
29
best: 32.3 (PS-KD)
SOTA
Attention Is All You Need arXiv:1706.03762
Visual Storytelling on LSMDC-E
BLEU-1 · 2017-06-12
15.35
best: 18.52 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Visual Storytelling on LSMDC-E
BLEU-3 · 2017-06-12
1.82
best: 2.51 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Visual Storytelling on LSMDC-E
BLEU-4 · 2017-06-12
0.76
best: 1.13 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Visual Storytelling on LSMDC-E
CIDEr · 2017-06-12
9.32
best: 12.41 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Visual Storytelling on LSMDC-E
METEOR · 2017-06-12
11.43
best: 12.87 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Visual Storytelling on VIST-E
BLEU-1 · 2017-06-12
17.18
best: 22.87 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Visual Storytelling on VIST-E
BLEU-2 · 2017-06-12
6.29
best: 8.68 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Visual Storytelling on VIST-E
BLEU-3 · 2017-06-12
3.07
best: 4.38 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Visual Storytelling on VIST-E
BLEU-4 · 2017-06-12
2.01
best: 2.61 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Visual Storytelling on VIST-E
CIDEr · 2017-06-12
12.75
best: 25.41 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Visual Storytelling on VIST-E
METEOR · 2017-06-12
6.91
best: 15.55 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Visual Storytelling on VIST-E
ROUGE-L · 2017-06-12
18.23
best: 23.61 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Story Generation on LSMDC-E
BLEU-1 · 2017-06-12
15.35
best: 18.52 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Story Generation on LSMDC-E
BLEU-3 · 2017-06-12
1.82
best: 2.51 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Story Generation on LSMDC-E
BLEU-4 · 2017-06-12
0.76
best: 1.13 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Story Generation on LSMDC-E
CIDEr · 2017-06-12
9.32
best: 12.41 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Story Generation on LSMDC-E
METEOR · 2017-06-12
11.43
best: 12.87 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Story Generation on VIST-E
BLEU-1 · 2017-06-12
17.18
best: 22.87 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Story Generation on VIST-E
BLEU-2 · 2017-06-12
6.29
best: 8.68 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Story Generation on VIST-E
BLEU-3 · 2017-06-12
3.07
best: 4.38 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Story Generation on VIST-E
BLEU-4 · 2017-06-12
2.01
best: 2.61 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Story Generation on VIST-E
CIDEr · 2017-06-12
12.75
best: 25.41 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Story Generation on VIST-E
METEOR · 2017-06-12
6.91
best: 15.55 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Story Generation on VIST-E
ROUGE-L · 2017-06-12
18.23
best: 23.61 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Machine Translation on IWSLT2014 German-English
BLEU score · 2022-05-15
35.1385
best: 40.43 (PiNMT)
Guidelines for the Regularization of Gammas in Batch Normalization for Deep Residual Networks arXiv:2205.07260
Code Generation on CodeSearchNet
Smoothed BLEU-4 · 2020-02-19
14.31
best: 15.99 (CodeBERT (MLM+RTD))
CodeBERT: A Pre-Trained Model for Programming and Natural Languages arXiv:2002.08155
Code Generation on CodeSearchNet - Python
Smoothed BLEU-4 · 2020-02-19
13.44
best: 20.39 (CodeTrans-MT-Base)
CodeBERT: A Pre-Trained Model for Programming and Natural Languages arXiv:2002.08155
Code Generation on CodeSearchNet - Php
Smoothed BLEU-4 · 2020-02-19
18.25
best: 26.23 (CodeTrans-MT-Base)
CodeBERT: A Pre-Trained Model for Programming and Natural Languages arXiv:2002.08155
Code Generation on CodeSearchNet - Java
Smoothed BLEU-4 · 2020-02-19
12.57
best: 21.87 (CodeTrans-MT-Large)
CodeBERT: A Pre-Trained Model for Programming and Natural Languages arXiv:2002.08155
Code Generation on CodeSearchNet - Ruby
Smoothed BLEU-4 · 2020-02-19
7.87
best: 15.26 (CodeTrans-MT-Base)
CodeBERT: A Pre-Trained Model for Programming and Natural Languages arXiv:2002.08155
Machine Translation on WMT2016 English-German
BLEU score · 2019-10-09
26.7
best: 40.68 (MADL)
On the adequacy of untuned warmup for adaptive optimization arXiv:1910.04209
Grammatical Error Correction on BEA-2019 (test)
F0.5 · 2019-07-02
69
best: 81.4 (Majority-voting ensemble on best 7 models)
A Neural Grammatical Error Correction System Built On Better Pre-training and Sequential Transfer Learning arXiv:1907.01256
Machine Translation on WMT2014 English-French
BLEU score · 2019-01-30
40.5
best: 46.4 (Transformer+BT (ADMIN init))
Memory-Efficient Adaptive Optimization arXiv:1901.11150
Grammatical Error Correction on CoNLL-2014 Shared Task
F0.5 · 2018-04-16
55.8
best: 72.8 (Ensembles of best 7 models + GRECO + GTP-rerank)
Approaching Neural Grammatical Error Correction as a Low-Resource Machine Translation Task arXiv:1804.05940
Grammatical Error Correction on JFLEG
GLEU · 2018-04-16
59.9
best: 62.1 (VERNet)
Approaching Neural Grammatical Error Correction as a Low-Resource Machine Translation Task arXiv:1804.05940
Constituency Parsing on Penn Treebank
F1 score · 2017-06-12
92.7
best: 96.43 (Hashing + XLNet)
Attention Is All You Need arXiv:1706.03762
Abstractive Text Summarization on CNN / Daily Mail
ROUGE-1 · 2017-06-12
39.5
best: 48.18 (Scrambled code + broken (alter))
Attention Is All You Need arXiv:1706.03762
Abstractive Text Summarization on CNN / Daily Mail
ROUGE-2 · 2017-06-12
16.06
best: 24.02 (Pegasus)
Attention Is All You Need arXiv:1706.03762
Abstractive Text Summarization on CNN / Daily Mail
ROUGE-L · 2017-06-12
36.63
best: 45.35 (Scrambled code + broken (alter))
Attention Is All You Need arXiv:1706.03762
Data-to-Text Generation on LSMDC-E
BLEU-2 · 2017-06-12
4.49
best: 5.99 (MMT)
Attention Is All You Need arXiv:1706.03762
Data-to-Text Generation on LSMDC-E
ROUGE-L · 2017-06-12
19.16
best: 20.99 (MMT)
Attention Is All You Need arXiv:1706.03762
Visual Storytelling on LSMDC-E
BLEU-2 · 2017-06-12
4.49
best: 5.99 (MMT)
Attention Is All You Need arXiv:1706.03762
Visual Storytelling on LSMDC-E
ROUGE-L · 2017-06-12
19.16
best: 20.99 (MMT)
Attention Is All You Need arXiv:1706.03762
Story Generation on LSMDC-E
BLEU-2 · 2017-06-12
4.49
best: 5.99 (MMT)
Attention Is All You Need arXiv:1706.03762
Story Generation on LSMDC-E
ROUGE-L · 2017-06-12
19.16
best: 20.99 (MMT)
Attention Is All You Need arXiv:1706.03762
Grammatical Error Correction on BEA-2019 (test)
F0.5
69.5
best: 81.4 (Majority-voting ensemble on best 7 models)
Abstractive Text Summarization on vietnews
Rouge-1
57.56
best: 67.8 (Kết quả nghiên cứu)
Abstractive Text Summarization on vietnews
Rouge-2
24.25
best: 34.24 (ViT5 large)
Abstractive Text Summarization on vietnews
Rouge-L
35.53
best: 43.55 (ViT5 large)

Adversarial · 20 results

Text Generation on CodeSearchNet - JavaScript
Smoothed BLEU-4 · 2020-02-19
25.61
SOTA
CodeBERT: A Pre-Trained Model for Programming and Natural Languages arXiv:2002.08155
Text Generation on LSMDC-E
BLEU-1 · 2017-06-12
15.35
best: 18.52 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Text Generation on LSMDC-E
BLEU-3 · 2017-06-12
1.82
best: 2.51 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Text Generation on LSMDC-E
BLEU-4 · 2017-06-12
0.76
best: 1.13 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Text Generation on LSMDC-E
CIDEr · 2017-06-12
9.32
best: 12.41 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Text Generation on LSMDC-E
METEOR · 2017-06-12
11.43
best: 12.87 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Text Generation on VIST-E
BLEU-1 · 2017-06-12
17.18
best: 22.87 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Text Generation on VIST-E
BLEU-2 · 2017-06-12
6.29
best: 8.68 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Text Generation on VIST-E
BLEU-3 · 2017-06-12
3.07
best: 4.38 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Text Generation on VIST-E
BLEU-4 · 2017-06-12
2.01
best: 2.61 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Text Generation on VIST-E
CIDEr · 2017-06-12
12.75
best: 25.41 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Text Generation on VIST-E
METEOR · 2017-06-12
6.91
best: 15.55 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Text Generation on VIST-E
ROUGE-L · 2017-06-12
18.23
best: 23.61 (MMT)
SOTA
Attention Is All You Need arXiv:1706.03762
Text Generation on CodeSearchNet
Smoothed BLEU-4 · 2020-02-19
14.31
best: 15.99 (CodeBERT (MLM+RTD))
CodeBERT: A Pre-Trained Model for Programming and Natural Languages arXiv:2002.08155
Text Generation on CodeSearchNet - Python
Smoothed BLEU-4 · 2020-02-19
13.44
best: 20.39 (CodeTrans-MT-Base)
CodeBERT: A Pre-Trained Model for Programming and Natural Languages arXiv:2002.08155
Text Generation on CodeSearchNet - Php
Smoothed BLEU-4 · 2020-02-19
18.25
best: 26.23 (CodeTrans-MT-Base)
CodeBERT: A Pre-Trained Model for Programming and Natural Languages arXiv:2002.08155
Text Generation on CodeSearchNet - Java
Smoothed BLEU-4 · 2020-02-19
12.57
best: 21.87 (CodeTrans-MT-Large)
CodeBERT: A Pre-Trained Model for Programming and Natural Languages arXiv:2002.08155
Text Generation on CodeSearchNet - Ruby
Smoothed BLEU-4 · 2020-02-19
7.87
best: 15.26 (CodeTrans-MT-Base)
CodeBERT: A Pre-Trained Model for Programming and Natural Languages arXiv:2002.08155
Text Generation on LSMDC-E
BLEU-2 · 2017-06-12
4.49
best: 5.99 (MMT)
Attention Is All You Need arXiv:1706.03762
Text Generation on LSMDC-E
ROUGE-L · 2017-06-12
19.16
best: 20.99 (MMT)
Attention Is All You Need arXiv:1706.03762

Knowledge Base · 9 results

Text Summarization on GigaWord
ROUGE-1 · 2017-06-12
37.57
best: 60.12 (OpenAI/o3-mini)
SOTA
Attention Is All You Need arXiv:1706.03762
Text Summarization on GigaWord
ROUGE-2 · 2017-06-12
18.9
best: 54.22 (OpenAI/o3-mini)
SOTA
Attention Is All You Need arXiv:1706.03762
Text Summarization on GigaWord
ROUGE-L · 2017-06-12
34.69
best: 60.29 (Riple/Saanvi-v0.1)
SOTA
Attention Is All You Need arXiv:1706.03762
Text Summarization on CNN / Daily Mail
ROUGE-1 · 2017-06-12
39.5
best: 48.18 (Scrambled code + broken (alter))
Attention Is All You Need arXiv:1706.03762
Text Summarization on CNN / Daily Mail
ROUGE-2 · 2017-06-12
16.06
best: 24.02 (Pegasus)
Attention Is All You Need arXiv:1706.03762
Text Summarization on CNN / Daily Mail
ROUGE-L · 2017-06-12
36.63
best: 45.35 (Scrambled code + broken (alter))
Attention Is All You Need arXiv:1706.03762
Text Summarization on vietnews
Rouge-1
57.56
best: 67.8 (Kết quả nghiên cứu)
Text Summarization on vietnews
Rouge-2
24.25
best: 34.24 (ViT5 large)
Text Summarization on vietnews
Rouge-L
35.53
best: 43.55 (ViT5 large)

Computer Vision · 9 results

Shape Representation Of 3D Point Clouds on ScanObjectNN
GFLOPs · 2017-06-12
4.8
best: 45 (PCM)
SOTA
Attention Is All You Need arXiv:1706.03762
Shape Representation Of 3D Point Clouds on ScanObjectNN
Number of params (M) · 2017-06-12
22.1
best: 34.2 (PCM)
SOTA
Attention Is All You Need arXiv:1706.03762
3D Point Cloud Classification on ScanObjectNN
GFLOPs · 2017-06-12
4.8
best: 45 (PCM)
SOTA
Attention Is All You Need arXiv:1706.03762
3D Point Cloud Classification on ScanObjectNN
Number of params (M) · 2017-06-12
22.1
best: 34.2 (PCM)
SOTA
Attention Is All You Need arXiv:1706.03762
3D Point Cloud Reconstruction on ScanObjectNN
GFLOPs · 2017-06-12
4.8
best: 45 (PCM)
SOTA
Attention Is All You Need arXiv:1706.03762
3D Point Cloud Reconstruction on ScanObjectNN
Number of params (M) · 2017-06-12
22.1
best: 34.2 (PCM)
SOTA
Attention Is All You Need arXiv:1706.03762
Shape Representation Of 3D Point Clouds on ScanObjectNN
Overall Accuracy (PB_T50_RS) · 2017-06-12
77.24
best: 92.64 (Mamba3D)
Attention Is All You Need arXiv:1706.03762
3D Point Cloud Classification on ScanObjectNN
Overall Accuracy (PB_T50_RS) · 2017-06-12
77.24
best: 92.64 (Mamba3D)
Attention Is All You Need arXiv:1706.03762
3D Point Cloud Reconstruction on ScanObjectNN
Overall Accuracy (PB_T50_RS) · 2017-06-12
77.24
best: 92.64 (Mamba3D)
Attention Is All You Need arXiv:1706.03762

Computer Code · 8 results

Code Documentation Generation on CodeSearchNet - JavaScript
Smoothed BLEU-4 · 2020-02-19
25.61
SOTA
CodeBERT: A Pre-Trained Model for Programming and Natural Languages arXiv:2002.08155
Program Synthesis on GitHub-Python
Accuracy (%) · 2021-06-11
62
best: 90.5 (Transformer + BIFI)
Break-It-Fix-It: Unsupervised Learning for Program Repair arXiv:2106.06600
Program Repair on GitHub-Python
Accuracy (%) · 2021-06-11
62
best: 90.5 (Transformer + BIFI)
Break-It-Fix-It: Unsupervised Learning for Program Repair arXiv:2106.06600
Code Documentation Generation on CodeSearchNet
Smoothed BLEU-4 · 2020-02-19
14.31
best: 15.99 (CodeBERT (MLM+RTD))
CodeBERT: A Pre-Trained Model for Programming and Natural Languages arXiv:2002.08155
Code Documentation Generation on CodeSearchNet - Python
Smoothed BLEU-4 · 2020-02-19
13.44
best: 20.39 (CodeTrans-MT-Base)
CodeBERT: A Pre-Trained Model for Programming and Natural Languages arXiv:2002.08155
Code Documentation Generation on CodeSearchNet - Php
Smoothed BLEU-4 · 2020-02-19
18.25
best: 26.23 (CodeTrans-MT-Base)
CodeBERT: A Pre-Trained Model for Programming and Natural Languages arXiv:2002.08155
Code Documentation Generation on CodeSearchNet - Java
Smoothed BLEU-4 · 2020-02-19
12.57
best: 21.87 (CodeTrans-MT-Large)
CodeBERT: A Pre-Trained Model for Programming and Natural Languages arXiv:2002.08155
Code Documentation Generation on CodeSearchNet - Ruby
Smoothed BLEU-4 · 2020-02-19
7.87
best: 15.26 (CodeTrans-MT-Base)
CodeBERT: A Pre-Trained Model for Programming and Natural Languages arXiv:2002.08155

Medical · 6 results

Language Modelling on LRA
Avg · 2020-11-08
54.39
best: 87.46 (S5)
SOTA
Long Range Arena: A Benchmark for Efficient Transformers arXiv:2011.04006
Language Modelling on LRA
ListOps · 2020-11-08
36.37
best: 62.15 (S5)
SOTA
Long Range Arena: A Benchmark for Efficient Transformers arXiv:2011.04006
Language Modelling on LRA
Image · 2020-11-08
42.44
best: 88.65 (S4)
Long Range Arena: A Benchmark for Efficient Transformers arXiv:2011.04006
Language Modelling on LRA
Pathfinder · 2020-11-08
71.4
best: 95.33 (S5)
Long Range Arena: A Benchmark for Efficient Transformers arXiv:2011.04006
Language Modelling on LRA
Retrieval · 2020-11-08
57.46
best: 91.4 (S5)
Long Range Arena: A Benchmark for Efficient Transformers arXiv:2011.04006
Language Modelling on LRA
Text · 2020-11-08
64.27
best: 89.31 (S5)
Long Range Arena: A Benchmark for Efficient Transformers arXiv:2011.04006

Audio · 2 results

Speech Recognition on LibriSpeech test-clean
Word Error Rate (WER) · uses extra data · 2019-09-13
2.6
best: 0.985 (United Med ASR)
A Comparative Study on Transformer vs RNN in Speech Applications arXiv:1909.06317
Speech Recognition on LibriSpeech test-other
Word Error Rate (WER) · uses extra data · 2019-09-13
5.7
best: 2.48 (SAMBA ASR)
A Comparative Study on Transformer vs RNN in Speech Applications arXiv:1909.06317

Miscellaneous · 1 result

Crop Yield Prediction on 2018 Syngenta (2016 val)
RMSE · 2019-02-07
9.28
SOTA
Crop Yield Prediction Using Deep Neural Networks arXiv:1902.02860

Music · 1 result

Music Modeling on Nottingham
NLL · 2019-07-12
3.34
best: 4.05 (RNN)
R-Transformer: Recurrent Neural Network Enhanced Transformer arXiv:1907.05572