Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Transformer

Reported on 208 benchmarks across 26 tasks · 16 papers · 74 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.

Time Series · 83 results

  • Trajectory Prediction on Vi-Fi Multi-modal Dataset
    MSE-D · 2024-04-02
    14.26
    best: 13.42 (OOSTraj)
    SOTA
    OOSTraj: Out-of-Sight Trajectory Prediction With Vision-Positioning Denoising · arXiv:2404.02227
  • Trajectory Prediction on Vi-Fi Multi-modal Dataset
    MSE-P · 2024-04-02
    14.08
    best: 13.83 (OOSTraj)
    SOTA
    OOSTraj: Out-of-Sight Trajectory Prediction With Vision-Positioning Denoising · arXiv:2404.02227
  • Trajectory Prediction on Vi-Fi Multi-modal Dataset
    SUM · 2024-04-02
    28.33
    best: 200.9 (ViTag)
    OOSTraj: Out-of-Sight Trajectory Prediction With Vision-Positioning Denoising · arXiv:2404.02227
  • Time Series Forecasting on ETTh2 (336) Univariate
    MAE · 2021-07-19
    0.3805
    best: 0.323 (Informer)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Forecasting on ETTh2 (336) Univariate
    MSE · 2021-07-19
    0.2191
    best: 0.166 (PatchMixer)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Forecasting on ETTh2 (168) Univariate
    MAE · 2021-07-19
    0.3547
    best: 0.306 (Informer)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Forecasting on ETTh2 (168) Univariate
    MSE · 2021-07-19
    0.1974
    best: 0.154 (Informer)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Forecasting on ETTh1 (24) Multivariate
    MAE · uses extra data · 2021-07-19
    0.4788
    best: 0.342 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Forecasting on ETTh1 (24) Multivariate
    MSE · uses extra data · 2021-07-19
    0.4496
    best: 0.3 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Forecasting on ETTh2 (168) Multivariate
    MAE · 2021-07-19
    0.9726
    best: 0.38 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Forecasting on ETTh2 (168) Multivariate
    MSE · 2021-07-19
    1.6225
    best: 0.342 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Forecasting on ETTh1 (24) Univariate
    MAE · 2021-07-19
    0.183
    best: 0.127 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Forecasting on ETTh1 (24) Univariate
    MSE · 2021-07-19
    0.0548
    best: 0.029 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Forecasting on ETTh2 (720) Multivariate
    MAE · 2021-07-19
    1.3668
    best: 0.418 (xPatch)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Forecasting on ETTh2 (720) Multivariate
    MSE · 2021-07-19
    3.1805
    best: 0.372 (RLinear)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Forecasting on ETTh1 (720) Multivariate
    MAE · 2021-07-19
    0.8399
    best: 0.447 (SegRNN)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Forecasting on ETTh1 (720) Multivariate
    MSE · 2021-07-19
    1.108
    best: 0.409 (DiPE-Linear)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Forecasting on ETTh2 (336) Multivariate
    MAE · 2021-07-19
    1.2189
    best: 0.36 (xPatch)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Forecasting on ETTh2 (336) Multivariate
    MSE · 2021-07-19
    2.6617
    best: 0.312 (xPatch)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Forecasting on ETTh1 (720) Univariate
    MAE · 2021-07-19
    0.4213
    best: 0.223 (AutoCon)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Forecasting on ETTh1 (720) Univariate
    MSE · 2021-07-19
    0.2501
    best: 0.078 (AutoCon)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Forecasting on ETTh2 (24) Multivariate
    MAE · 2021-07-19
    0.5013
    best: 0.263 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Forecasting on ETTh2 (24) Multivariate
    MSE · 2021-07-19
    0.4237
    best: 0.18 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Forecasting on ETTh2 (24) Univariate
    MAE · 2021-07-19
    0.2479
    best: 0.183 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Forecasting on ETTh2 (24) Univariate
    MSE · 2021-07-19
    0.0999
    best: 0.065 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Forecasting on ETTh1 (336) Multivariate
    MAE · 2021-07-19
    0.7041
    best: 0.2158 (DeformTime)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Forecasting on ETTh1 (336) Multivariate
    MSE · 2021-07-19
    0.8321
    best: 0.374 (D-PAD)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Forecasting on ETTh1 (168) Multivariate
    MAE · 2021-07-19
    0.6325
    best: 0.417 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Forecasting on ETTh1 (168) Multivariate
    MSE · 2021-07-19
    0.7146
    best: 0.408 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Forecasting on ETTh1 (168) Univariate
    MAE · 2021-07-19
    0.2539
    best: 0.21 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Forecasting on ETTh1 (168) Univariate
    MSE · 2021-07-19
    0.1049
    best: 0.071 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Forecasting on ETTh1 (48) Multivariate
    MAE · 2021-07-19
    0.4968
    best: 0.388 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Forecasting on ETTh1 (48) Multivariate
    MSE · 2021-07-19
    0.4668
    best: 0.361 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Forecasting on ETTh2 (720) Univariate
    MAE · 2021-07-19
    0.434
    best: 0.338 (Informer)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Forecasting on ETTh2 (720) Univariate
    MSE · 2021-07-19
    0.2853
    best: 0.177 (AutoCon)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Forecasting on ETTh1 (48) Univariate
    MAE · 2021-07-19
    0.2144
    best: 0.154 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Forecasting on ETTh1 (48) Univariate
    MSE · 2021-07-19
    0.074
    best: 0.041 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Forecasting on ETTh1 (336) Univariate
    MAE · 2021-07-19
    0.3201
    best: 0.215 (SegRNN)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Forecasting on ETTh1 (336) Univariate
    MSE · 2021-07-19
    0.1541
    best: 0.073 (SegRNN)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Forecasting on ETTh2 (48) Multivariate
    MAE · 2021-07-19
    0.9488
    best: 0.303 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Forecasting on ETTh2 (48) Multivariate
    MSE · 2021-07-19
    1.522
    best: 0.23 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Forecasting on ETTh2 (48) Univariate
    MAE · 2021-07-19
    0.2763
    best: 0.227 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Forecasting on ETTh2 (48) Univariate
    MSE · 2021-07-19
    0.1218
    best: 0.093 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh2 (336) Univariate
    MAE · 2021-07-19
    0.3805
    best: 0.323 (Informer)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh2 (336) Univariate
    MSE · 2021-07-19
    0.2191
    best: 0.166 (PatchMixer)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh2 (168) Univariate
    MAE · 2021-07-19
    0.3547
    best: 0.306 (Informer)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh2 (168) Univariate
    MSE · 2021-07-19
    0.1974
    best: 0.154 (Informer)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh1 (24) Multivariate
    MAE · uses extra data · 2021-07-19
    0.4788
    best: 0.342 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh1 (24) Multivariate
    MSE · uses extra data · 2021-07-19
    0.4496
    best: 0.3 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh2 (168) Multivariate
    MAE · 2021-07-19
    0.9726
    best: 0.38 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh2 (168) Multivariate
    MSE · 2021-07-19
    1.6225
    best: 0.342 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh1 (24) Univariate
    MAE · 2021-07-19
    0.183
    best: 0.127 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh1 (24) Univariate
    MSE · 2021-07-19
    0.0548
    best: 0.029 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh2 (720) Multivariate
    MAE · 2021-07-19
    1.3668
    best: 0.418 (xPatch)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh2 (720) Multivariate
    MSE · 2021-07-19
    3.1805
    best: 0.372 (RLinear)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh1 (720) Multivariate
    MAE · 2021-07-19
    0.8399
    best: 0.447 (SegRNN)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh1 (720) Multivariate
    MSE · 2021-07-19
    1.108
    best: 0.409 (DiPE-Linear)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh2 (336) Multivariate
    MAE · 2021-07-19
    1.2189
    best: 0.36 (xPatch)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh2 (336) Multivariate
    MSE · 2021-07-19
    2.6617
    best: 0.312 (xPatch)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh1 (720) Univariate
    MAE · 2021-07-19
    0.4213
    best: 0.223 (AutoCon)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh1 (720) Univariate
    MSE · 2021-07-19
    0.2501
    best: 0.078 (AutoCon)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh2 (24) Multivariate
    MAE · 2021-07-19
    0.5013
    best: 0.263 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh2 (24) Multivariate
    MSE · 2021-07-19
    0.4237
    best: 0.18 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh2 (24) Univariate
    MAE · 2021-07-19
    0.2479
    best: 0.183 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh2 (24) Univariate
    MSE · 2021-07-19
    0.0999
    best: 0.065 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh1 (336) Multivariate
    MAE · 2021-07-19
    0.7041
    best: 0.2158 (DeformTime)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh1 (336) Multivariate
    MSE · 2021-07-19
    0.8321
    best: 0.374 (D-PAD)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh1 (168) Multivariate
    MAE · 2021-07-19
    0.6325
    best: 0.417 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh1 (168) Multivariate
    MSE · 2021-07-19
    0.7146
    best: 0.408 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh1 (168) Univariate
    MAE · 2021-07-19
    0.2539
    best: 0.21 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh1 (168) Univariate
    MSE · 2021-07-19
    0.1049
    best: 0.071 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh1 (48) Multivariate
    MAE · 2021-07-19
    0.4968
    best: 0.388 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh1 (48) Multivariate
    MSE · 2021-07-19
    0.4668
    best: 0.361 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh2 (720) Univariate
    MAE · 2021-07-19
    0.434
    best: 0.338 (Informer)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh2 (720) Univariate
    MSE · 2021-07-19
    0.2853
    best: 0.177 (AutoCon)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh1 (48) Univariate
    MAE · 2021-07-19
    0.2144
    best: 0.154 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh1 (48) Univariate
    MSE · 2021-07-19
    0.074
    best: 0.041 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh1 (336) Univariate
    MAE · 2021-07-19
    0.3201
    best: 0.215 (SegRNN)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh1 (336) Univariate
    MSE · 2021-07-19
    0.1541
    best: 0.073 (SegRNN)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh2 (48) Multivariate
    MAE · 2021-07-19
    0.9488
    best: 0.303 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh2 (48) Multivariate
    MSE · 2021-07-19
    1.522
    best: 0.23 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh2 (48) Univariate
    MAE · 2021-07-19
    0.2763
    best: 0.227 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687
  • Time Series Analysis on ETTh2 (48) Univariate
    MSE · 2021-07-19
    0.1218
    best: 0.093 (SCINet)
    Long-term series forecasting with Query Selector -- efficient model of sparse attention · arXiv:2107.08687

Natural Language Processing · 71 results

  • Source Code Summarization on CoDesc
    BLEU-4 · 2021-05-29
    45.89
    SOTA
    CoDesc: A Large Code-Description Parallel Dataset · arXiv:2105.14220
  • Code Generation on CodeSearchNet - JavaScript
    Smoothed BLEU-4 · 2020-02-19
    25.61
    SOTA
    CodeBERT: A Pre-Trained Model for Programming and Natural Languages · arXiv:2002.08155
  • Grammatical Error Correction on Falko-MERLIN
    F0.5 · uses extra data · 2019-10-01
    73.71
    best: 76.75 (Llama + 1M BT + gold)
    SOTA
    Grammatical Error Correction in Low-Resource Scenarios · arXiv:1910.00353
  • Grammatical Error Correction on Restricted
    F0.5 · 2018-04-16
    55.8
    best: 56.52 (CNN Seq2Seq + Quality Estimation)
    SOTA
    Approaching Neural Grammatical Error Correction as a Low-Resource Machine Translation Task · arXiv:1804.05940
  • Grammatical Error Correction on _Restricted_
    GLEU · 2018-04-16
    59.9
    SOTA
    Approaching Neural Grammatical Error Correction as a Low-Resource Machine Translation Task · arXiv:1804.05940
  • Machine Translation on IWSLT2015 English-German
    BLEU score · 2017-06-12
    28.5
    best: 30 (PS-KD)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Machine Translation on IWSLT2014 German-English
    BLEU score · 2017-06-12
    34.44
    best: 40.43 (PiNMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Machine Translation on Multi30K
    BLEU (DE-EN) · 2017-06-12
    29
    best: 32.3 (PS-KD)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Question Answering on Mathematics Dataset
    Accuracy · 2017-06-12
    0.76
    best: 0.8192 (TP-Transformer)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Data-to-Text Generation on LSMDC-E
    BLEU-1 · 2017-06-12
    15.35
    best: 18.52 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Data-to-Text Generation on LSMDC-E
    BLEU-3 · 2017-06-12
    1.82
    best: 2.51 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Data-to-Text Generation on LSMDC-E
    BLEU-4 · 2017-06-12
    0.76
    best: 1.13 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Data-to-Text Generation on LSMDC-E
    CIDEr · 2017-06-12
    9.32
    best: 12.41 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Data-to-Text Generation on LSMDC-E
    METEOR · 2017-06-12
    11.43
    best: 12.87 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Data-to-Text Generation on VIST-E
    BLEU-1 · 2017-06-12
    17.18
    best: 22.87 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Data-to-Text Generation on VIST-E
    BLEU-2 · 2017-06-12
    6.29
    best: 8.68 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Data-to-Text Generation on VIST-E
    BLEU-3 · 2017-06-12
    3.07
    best: 4.38 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Data-to-Text Generation on VIST-E
    BLEU-4 · 2017-06-12
    2.01
    best: 2.61 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Data-to-Text Generation on VIST-E
    CIDEr · 2017-06-12
    12.75
    best: 25.41 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Data-to-Text Generation on VIST-E
    METEOR · 2017-06-12
    6.91
    best: 15.55 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Data-to-Text Generation on VIST-E
    ROUGE-L · 2017-06-12
    18.23
    best: 23.61 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Multimodal Machine Translation on Multi30K
    BLEU (DE-EN) · 2017-06-12
    29
    best: 32.3 (PS-KD)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Visual Storytelling on LSMDC-E
    BLEU-1 · 2017-06-12
    15.35
    best: 18.52 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Visual Storytelling on LSMDC-E
    BLEU-3 · 2017-06-12
    1.82
    best: 2.51 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Visual Storytelling on LSMDC-E
    BLEU-4 · 2017-06-12
    0.76
    best: 1.13 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Visual Storytelling on LSMDC-E
    CIDEr · 2017-06-12
    9.32
    best: 12.41 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Visual Storytelling on LSMDC-E
    METEOR · 2017-06-12
    11.43
    best: 12.87 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Visual Storytelling on VIST-E
    BLEU-1 · 2017-06-12
    17.18
    best: 22.87 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Visual Storytelling on VIST-E
    BLEU-2 · 2017-06-12
    6.29
    best: 8.68 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Visual Storytelling on VIST-E
    BLEU-3 · 2017-06-12
    3.07
    best: 4.38 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Visual Storytelling on VIST-E
    BLEU-4 · 2017-06-12
    2.01
    best: 2.61 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Visual Storytelling on VIST-E
    CIDEr · 2017-06-12
    12.75
    best: 25.41 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Visual Storytelling on VIST-E
    METEOR · 2017-06-12
    6.91
    best: 15.55 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Visual Storytelling on VIST-E
    ROUGE-L · 2017-06-12
    18.23
    best: 23.61 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Story Generation on LSMDC-E
    BLEU-1 · 2017-06-12
    15.35
    best: 18.52 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Story Generation on LSMDC-E
    BLEU-3 · 2017-06-12
    1.82
    best: 2.51 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Story Generation on LSMDC-E
    BLEU-4 · 2017-06-12
    0.76
    best: 1.13 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Story Generation on LSMDC-E
    CIDEr · 2017-06-12
    9.32
    best: 12.41 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Story Generation on LSMDC-E
    METEOR · 2017-06-12
    11.43
    best: 12.87 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Story Generation on VIST-E
    BLEU-1 · 2017-06-12
    17.18
    best: 22.87 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Story Generation on VIST-E
    BLEU-2 · 2017-06-12
    6.29
    best: 8.68 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Story Generation on VIST-E
    BLEU-3 · 2017-06-12
    3.07
    best: 4.38 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Story Generation on VIST-E
    BLEU-4 · 2017-06-12
    2.01
    best: 2.61 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Story Generation on VIST-E
    CIDEr · 2017-06-12
    12.75
    best: 25.41 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Story Generation on VIST-E
    METEOR · 2017-06-12
    6.91
    best: 15.55 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Story Generation on VIST-E
    ROUGE-L · 2017-06-12
    18.23
    best: 23.61 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Machine Translation on IWSLT2014 German-English
    BLEU score · 2022-05-15
    35.1385
    best: 40.43 (PiNMT)
    Guidelines for the Regularization of Gammas in Batch Normalization for Deep Residual Networks · arXiv:2205.07260
  • Code Generation on CodeSearchNet
    Smoothed BLEU-4 · 2020-02-19
    14.31
    best: 15.99 (CodeBERT (MLM+RTD))
    CodeBERT: A Pre-Trained Model for Programming and Natural Languages · arXiv:2002.08155
  • Code Generation on CodeSearchNet - Python
    Smoothed BLEU-4 · 2020-02-19
    13.44
    best: 20.39 (CodeTrans-MT-Base)
    CodeBERT: A Pre-Trained Model for Programming and Natural Languages · arXiv:2002.08155
  • Code Generation on CodeSearchNet - Php
    Smoothed BLEU-4 · 2020-02-19
    18.25
    best: 26.23 (CodeTrans-MT-Base)
    CodeBERT: A Pre-Trained Model for Programming and Natural Languages · arXiv:2002.08155
  • Code Generation on CodeSearchNet - Java
    Smoothed BLEU-4 · 2020-02-19
    12.57
    best: 21.87 (CodeTrans-MT-Large)
    CodeBERT: A Pre-Trained Model for Programming and Natural Languages · arXiv:2002.08155
  • Code Generation on CodeSearchNet - Ruby
    Smoothed BLEU-4 · 2020-02-19
    7.87
    best: 15.26 (CodeTrans-MT-Base)
    CodeBERT: A Pre-Trained Model for Programming and Natural Languages · arXiv:2002.08155
  • Machine Translation on WMT2016 English-German
    BLEU score · 2019-10-09
    26.7
    best: 40.68 (MADL)
    On the adequacy of untuned warmup for adaptive optimization · arXiv:1910.04209
  • Grammatical Error Correction on BEA-2019 (test)
    F0.5 · 2019-07-02
    69
    best: 81.4 (Majority-voting ensemble on best 7 models)
    A Neural Grammatical Error Correction System Built On Better Pre-training and Sequential Transfer Learning · arXiv:1907.01256
  • Machine Translation on WMT2014 English-French
    BLEU score · 2019-01-30
    40.5
    best: 46.4 (Transformer+BT (ADMIN init))
    Memory-Efficient Adaptive Optimization · arXiv:1901.11150
  • Grammatical Error Correction on CoNLL-2014 Shared Task
    F0.5 · 2018-04-16
    55.8
    best: 72.8 (Ensembles of best 7 models + GRECO + GTP-rerank)
    Approaching Neural Grammatical Error Correction as a Low-Resource Machine Translation Task · arXiv:1804.05940
  • Grammatical Error Correction on JFLEG
    GLEU · 2018-04-16
    59.9
    best: 62.1 (VERNet)
    Approaching Neural Grammatical Error Correction as a Low-Resource Machine Translation Task · arXiv:1804.05940
  • Constituency Parsing on Penn Treebank
    F1 score · 2017-06-12
    92.7
    best: 96.43 (Hashing + XLNet)
    Attention Is All You Need · arXiv:1706.03762
  • Abstractive Text Summarization on CNN / Daily Mail
    ROUGE-1 · 2017-06-12
    39.5
    best: 48.18 (Scrambled code + broken (alter))
    Attention Is All You Need · arXiv:1706.03762
  • Abstractive Text Summarization on CNN / Daily Mail
    ROUGE-2 · 2017-06-12
    16.06
    best: 24.02 (Pegasus)
    Attention Is All You Need · arXiv:1706.03762
  • Abstractive Text Summarization on CNN / Daily Mail
    ROUGE-L · 2017-06-12
    36.63
    best: 45.35 (Scrambled code + broken (alter))
    Attention Is All You Need · arXiv:1706.03762
  • Data-to-Text Generation on LSMDC-E
    BLEU-2 · 2017-06-12
    4.49
    best: 5.99 (MMT)
    Attention Is All You Need · arXiv:1706.03762
  • Data-to-Text Generation on LSMDC-E
    ROUGE-L · 2017-06-12
    19.16
    best: 20.99 (MMT)
    Attention Is All You Need · arXiv:1706.03762
  • Visual Storytelling on LSMDC-E
    BLEU-2 · 2017-06-12
    4.49
    best: 5.99 (MMT)
    Attention Is All You Need · arXiv:1706.03762
  • Visual Storytelling on LSMDC-E
    ROUGE-L · 2017-06-12
    19.16
    best: 20.99 (MMT)
    Attention Is All You Need · arXiv:1706.03762
  • Story Generation on LSMDC-E
    BLEU-2 · 2017-06-12
    4.49
    best: 5.99 (MMT)
    Attention Is All You Need · arXiv:1706.03762
  • Story Generation on LSMDC-E
    ROUGE-L · 2017-06-12
    19.16
    best: 20.99 (MMT)
    Attention Is All You Need · arXiv:1706.03762
  • Grammatical Error Correction on BEA-2019 (test)
    F0.5
    69.5
    best: 81.4 (Majority-voting ensemble on best 7 models)
  • Abstractive Text Summarization on vietnews
    Rouge-1
    57.56
    best: 67.8 (Kết quả nghiên cứu)
  • Abstractive Text Summarization on vietnews
    Rouge-2
    24.25
    best: 34.24 (ViT5 large)
  • Abstractive Text Summarization on vietnews
    Rouge-L
    35.53
    best: 43.55 (ViT5 large)

Adversarial · 20 results

  • Text Generation on CodeSearchNet - JavaScript
    Smoothed BLEU-4 · 2020-02-19
    25.61
    SOTA
    CodeBERT: A Pre-Trained Model for Programming and Natural Languages · arXiv:2002.08155
  • Text Generation on LSMDC-E
    BLEU-1 · 2017-06-12
    15.35
    best: 18.52 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Text Generation on LSMDC-E
    BLEU-3 · 2017-06-12
    1.82
    best: 2.51 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Text Generation on LSMDC-E
    BLEU-4 · 2017-06-12
    0.76
    best: 1.13 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Text Generation on LSMDC-E
    CIDEr · 2017-06-12
    9.32
    best: 12.41 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Text Generation on LSMDC-E
    METEOR · 2017-06-12
    11.43
    best: 12.87 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Text Generation on VIST-E
    BLEU-1 · 2017-06-12
    17.18
    best: 22.87 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Text Generation on VIST-E
    BLEU-2 · 2017-06-12
    6.29
    best: 8.68 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Text Generation on VIST-E
    BLEU-3 · 2017-06-12
    3.07
    best: 4.38 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Text Generation on VIST-E
    BLEU-4 · 2017-06-12
    2.01
    best: 2.61 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Text Generation on VIST-E
    CIDEr · 2017-06-12
    12.75
    best: 25.41 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Text Generation on VIST-E
    METEOR · 2017-06-12
    6.91
    best: 15.55 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Text Generation on VIST-E
    ROUGE-L · 2017-06-12
    18.23
    best: 23.61 (MMT)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Text GenerationonCodeSearchNet
    Smoothed BLEU-4· 2020-02-19
    14.31
    best: 15.99 (CodeBERT (MLM+RTD))
    CodeBERT: A Pre-Trained Model for Programming and Natural LanguagesarXiv:2002.08155
  • Text GenerationonCodeSearchNet - Python
    Smoothed BLEU-4· 2020-02-19
    13.44
    best: 20.39 (CodeTrans-MT-Base)
    CodeBERT: A Pre-Trained Model for Programming and Natural Languages · arXiv:2002.08155
  • Text Generation on CodeSearchNet - Php
    Smoothed BLEU-4· 2020-02-19
    18.25
    best: 26.23 (CodeTrans-MT-Base)
    CodeBERT: A Pre-Trained Model for Programming and Natural Languages · arXiv:2002.08155
  • Text Generation on CodeSearchNet - Java
    Smoothed BLEU-4· 2020-02-19
    12.57
    best: 21.87 (CodeTrans-MT-Large)
    CodeBERT: A Pre-Trained Model for Programming and Natural Languages · arXiv:2002.08155
  • Text Generation on CodeSearchNet - Ruby
    Smoothed BLEU-4· 2020-02-19
    7.87
    best: 15.26 (CodeTrans-MT-Base)
    CodeBERT: A Pre-Trained Model for Programming and Natural Languages · arXiv:2002.08155
  • Text Generation on LSMDC-E
    BLEU-2· 2017-06-12
    4.49
    best: 5.99 (MMT)
    Attention Is All You Need · arXiv:1706.03762
  • Text Generation on LSMDC-E
    ROUGE-L· 2017-06-12
    19.16
    best: 20.99 (MMT)
    Attention Is All You Need · arXiv:1706.03762

Knowledge Base · 9 results

  • Text Summarization on GigaWord
    ROUGE-1· 2017-06-12
    37.57
    best: 60.12 (OpenAI/o3-mini)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Text Summarization on GigaWord
    ROUGE-2· 2017-06-12
    18.9
    best: 54.22 (OpenAI/o3-mini)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Text Summarization on GigaWord
    ROUGE-L· 2017-06-12
    34.69
    best: 60.29 (Riple/Saanvi-v0.1)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Text Summarization on CNN / Daily Mail
    ROUGE-1· 2017-06-12
    39.5
    best: 48.18 (Scrambled code + broken (alter))
    Attention Is All You Need · arXiv:1706.03762
  • Text Summarization on CNN / Daily Mail
    ROUGE-2· 2017-06-12
    16.06
    best: 24.02 (Pegasus)
    Attention Is All You Need · arXiv:1706.03762
  • Text Summarization on CNN / Daily Mail
    ROUGE-L· 2017-06-12
    36.63
    best: 45.35 (Scrambled code + broken (alter))
    Attention Is All You Need · arXiv:1706.03762
  • Text Summarization on vietnews
    Rouge-1
    57.56
    best: 67.8 (Kết quả nghiên cứu)
  • Text Summarization on vietnews
    Rouge-2
    24.25
    best: 34.24 (ViT5 large)
  • Text Summarization on vietnews
    Rouge-L
    35.53
    best: 43.55 (ViT5 large)

Computer Vision · 9 results

  • Shape Representation Of 3D Point Clouds on ScanObjectNN
    GFLOPs· 2017-06-12
    4.8
    best: 45 (PCM)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Shape Representation Of 3D Point Clouds on ScanObjectNN
    Number of params (M)· 2017-06-12
    22.1
    best: 34.2 (PCM)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • 3D Point Cloud Classification on ScanObjectNN
    GFLOPs· 2017-06-12
    4.8
    best: 45 (PCM)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • 3D Point Cloud Classification on ScanObjectNN
    Number of params (M)· 2017-06-12
    22.1
    best: 34.2 (PCM)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • 3D Point Cloud Reconstruction on ScanObjectNN
    GFLOPs· 2017-06-12
    4.8
    best: 45 (PCM)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • 3D Point Cloud Reconstruction on ScanObjectNN
    Number of params (M)· 2017-06-12
    22.1
    best: 34.2 (PCM)
    SOTA
    Attention Is All You Need · arXiv:1706.03762
  • Shape Representation Of 3D Point Clouds on ScanObjectNN
    Overall Accuracy (PB_T50_RS)· 2017-06-12
    77.24
    best: 92.64 (Mamba3D)
    Attention Is All You Need · arXiv:1706.03762
  • 3D Point Cloud Classification on ScanObjectNN
    Overall Accuracy (PB_T50_RS)· 2017-06-12
    77.24
    best: 92.64 (Mamba3D)
    Attention Is All You Need · arXiv:1706.03762
  • 3D Point Cloud Reconstruction on ScanObjectNN
    Overall Accuracy (PB_T50_RS)· 2017-06-12
    77.24
    best: 92.64 (Mamba3D)
    Attention Is All You Need · arXiv:1706.03762

Computer Code · 8 results

  • Code Documentation Generation on CodeSearchNet - JavaScript
    Smoothed BLEU-4· 2020-02-19
    25.61
    SOTA
    CodeBERT: A Pre-Trained Model for Programming and Natural Languages · arXiv:2002.08155
  • Program Synthesis on GitHub-Python
    Accuracy (%)· 2021-06-11
    62
    best: 90.5 (Transformer + BIFI)
    Break-It-Fix-It: Unsupervised Learning for Program Repair · arXiv:2106.06600
  • Program Repair on GitHub-Python
    Accuracy (%)· 2021-06-11
    62
    best: 90.5 (Transformer + BIFI)
    Break-It-Fix-It: Unsupervised Learning for Program Repair · arXiv:2106.06600
  • Code Documentation Generation on CodeSearchNet
    Smoothed BLEU-4· 2020-02-19
    14.31
    best: 15.99 (CodeBERT (MLM+RTD))
    CodeBERT: A Pre-Trained Model for Programming and Natural Languages · arXiv:2002.08155
  • Code Documentation Generation on CodeSearchNet - Python
    Smoothed BLEU-4· 2020-02-19
    13.44
    best: 20.39 (CodeTrans-MT-Base)
    CodeBERT: A Pre-Trained Model for Programming and Natural Languages · arXiv:2002.08155
  • Code Documentation Generation on CodeSearchNet - Php
    Smoothed BLEU-4· 2020-02-19
    18.25
    best: 26.23 (CodeTrans-MT-Base)
    CodeBERT: A Pre-Trained Model for Programming and Natural Languages · arXiv:2002.08155
  • Code Documentation Generation on CodeSearchNet - Java
    Smoothed BLEU-4· 2020-02-19
    12.57
    best: 21.87 (CodeTrans-MT-Large)
    CodeBERT: A Pre-Trained Model for Programming and Natural Languages · arXiv:2002.08155
  • Code Documentation Generation on CodeSearchNet - Ruby
    Smoothed BLEU-4· 2020-02-19
    7.87
    best: 15.26 (CodeTrans-MT-Base)
    CodeBERT: A Pre-Trained Model for Programming and Natural Languages · arXiv:2002.08155

Medical · 6 results

  • Language Modelling on LRA
    Avg· 2020-11-08
    54.39
    best: 87.46 (S5)
    SOTA
    Long Range Arena: A Benchmark for Efficient Transformers · arXiv:2011.04006
  • Language Modelling on LRA
    ListOps· 2020-11-08
    36.37
    best: 62.15 (S5)
    SOTA
    Long Range Arena: A Benchmark for Efficient TransformersarXiv:2011.04006
  • Language ModellingonLRA
    Image· 2020-11-08
    42.44
    best: 88.65 (S4)
    Long Range Arena: A Benchmark for Efficient Transformers · arXiv:2011.04006
  • Language Modelling on LRA
    Pathfinder· 2020-11-08
    71.4
    best: 95.33 (S5)
    Long Range Arena: A Benchmark for Efficient Transformers · arXiv:2011.04006
  • Language Modelling on LRA
    Retrieval· 2020-11-08
    57.46
    best: 91.4 (S5)
    Long Range Arena: A Benchmark for Efficient Transformers · arXiv:2011.04006
  • Language Modelling on LRA
    Text· 2020-11-08
    64.27
    best: 89.31 (S5)
    Long Range Arena: A Benchmark for Efficient Transformers · arXiv:2011.04006

Audio · 2 results

  • Speech Recognition on LibriSpeech test-clean
    Word Error Rate (WER)· uses extra data· 2019-09-13
    2.6
    best: 0.985 (United Med ASR)
    A Comparative Study on Transformer vs RNN in Speech Applications · arXiv:1909.06317
  • Speech Recognition on LibriSpeech test-other
    Word Error Rate (WER)· uses extra data· 2019-09-13
    5.7
    best: 2.48 (SAMBA ASR)
    A Comparative Study on Transformer vs RNN in Speech Applications · arXiv:1909.06317

Miscellaneous · 1 result

  • Crop Yield Prediction on 2018 Syngenta (2016 val)
    RMSE· 2019-02-07
    9.28
    SOTA
    Crop Yield Prediction Using Deep Neural Networks · arXiv:1902.02860

Music · 1 result

  • Music Modeling on Nottingham
    NLL· 2019-07-12
    3.34
    best: 4.05 (RNN)
    R-Transformer: Recurrent Neural Network Enhanced Transformer · arXiv:1907.05572