Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Machine Translation on WMT2014 English-French

Metric: BLEU score (higher is better)
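BLEU scores the n-gram overlap between a system translation and a reference, with a brevity penalty for short outputs. A minimal sentence-level sketch for illustration only (the leaderboard numbers are corpus-level BLEU, and tokenization, smoothing, and tooling such as sacreBLEU affect the exact value):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(hypothesis, reference, max_n=4):
    """Sentence-level BLEU (0-100): geometric mean of 1..max_n-gram
    precisions, times a brevity penalty. Unsmoothed, single reference."""
    hyp, ref = hypothesis.split(), reference.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_counts, ref_counts = ngrams(hyp, n), ngrams(ref, n)
        overlap = sum((hyp_counts & ref_counts).values())  # clipped matches
        total = max(sum(hyp_counts.values()), 1)
        if overlap == 0:
            return 0.0  # any zero precision zeroes the geometric mean
        log_precisions.append(math.log(overlap / total))
    bp = min(1.0, math.exp(1 - len(ref) / len(hyp)))  # brevity penalty
    return 100 * bp * math.exp(sum(log_precisions) / max_n)

print(bleu("the cat sat on the mat", "the cat sat on the mat"))  # 100.0
```

A perfect match scores 100; real systems on this benchmark land in the 14-46 range shown below.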


Results

| # | Model | BLEU score | Extra Data | Paper | Date | Code |
|---|-------|------------|------------|-------|------|------|
| 1 | Transformer+BT (ADMIN init) | 46.4 | Yes | Very Deep Transformers for Neural Machine Transl... | 2020-08-18 | Code |
| 2 | Noisy back-translation | 45.6 | Yes | Understanding Back-Translation at Scale | 2018-08-28 | Code |
| 3 | mRASP+Fine-Tune | 44.3 | Yes | Pre-training Multilingual Neural Machine Transla... | 2020-10-07 | Code |
| 4 | Transformer + R-Drop | 43.95 | No | R-Drop: Regularized Dropout for Neural Networks | 2021-06-28 | Code |
| 5 | Transformer (ADMIN init) | 43.8 | No | Very Deep Transformers for Neural Machine Transl... | 2020-08-18 | Code |
| 6 | Admin | 43.8 | No | Understanding the Difficulty of Training Transfo... | 2020-04-17 | Code |
| 7 | BERT-fused NMT | 43.78 | Yes | Incorporating BERT into Neural Machine Translation | 2020-02-17 | Code |
| 8 | MUSE (Parallel Multi-scale Attention) | 43.5 | No | MUSE: Parallel Multi-Scale Attention for Sequenc... | 2019-11-17 | Code |
| 9 | T5 | 43.4 | Yes | Exploring the Limits of Transfer Learning with a... | 2019-10-23 | Code |
| 10 | Local Joint Self-attention | 43.3 | No | Joint Source-Target Self Attention with Locality... | 2019-05-16 | Code |
| 11 | Depth Growing | 43.27 | No | Depth Growing for Neural Machine Translation | 2019-07-03 | Code |
| 12 | Transformer Big | 43.2 | No | Scaling Neural Machine Translation | 2018-06-01 | Code |
| 13 | DynamicConv | 43.2 | No | Pay Less Attention with Lightweight and Dynamic ... | 2019-01-29 | Code |
| 14 | TaLK Convolutions | 43.2 | No | Time-aware Large Kernel Convolutions | 2020-02-08 | Code |
| 15 | LightConv | 43.1 | No | Pay Less Attention with Lightweight and Dynamic ... | 2019-01-29 | Code |
| 16 | FLOATER-large | 42.7 | No | Learning to Encode Position for Transformer with... | 2020-03-13 | Code |
| 17 | OmniNetP | 42.6 | No | OmniNet: Omnidirectional Representations from Tr... | 2021-03-01 | Code |
| 18 | Transformer Big + MoS | 42.1 | No | Fast and Simple Mixture of Softmaxes with BPE an... | 2018-09-25 | Code |
| 19 | T2R + Pretrain | 42.1 | No | Finetuning Pretrained Transformers into RNNs | 2021-03-24 | Code |
| 20 | Synthesizer (Random + Vanilla) | 41.85 | No | Synthesizer: Rethinking Self-Attention in Transf... | 2020-05-02 | Code |
| 21 | Hardware Aware Transformer | 41.8 | No | HAT: Hardware-Aware Transformers for Efficient N... | 2020-05-28 | Code |
| 22 | Transformer (big) + Relative Position Representations | 41.5 | No | Self-Attention with Relative Position Representa... | 2018-03-06 | Code |
| 23 | Stack 4-layer RNNSearch + Dual Learning + Deliberation Network | 41.5 | No | - | - | - |
| 24 | Weighted Transformer (large) | 41.4 | No | Weighted Transformer Network for Machine Transla... | 2017-11-06 | Code |
| 25 | ConvS2S (ensemble) | 41.3 | No | Convolutional Sequence to Sequence Learning | 2017-05-08 | Code |
| 26 | Evolved Transformer Big | 41.3 | No | The Evolved Transformer | 2019-01-30 | Code |
| 27 | RNMT+ | 41 | No | The Best of Both Worlds: Combining Recent Advanc... | 2018-04-26 | Code |
| 28 | Transformer Big | 41 | Yes | Attention Is All You Need | 2017-06-12 | Code |
| 29 | Evolved Transformer Base | 40.6 | No | The Evolved Transformer | 2019-01-30 | Code |
| 30 | ResMLP-12 | 40.6 | No | ResMLP: Feedforward networks for image classific... | 2021-05-07 | Code |
| 31 | MoE | 40.56 | No | Outrageously Large Neural Networks: The Sparsely... | 2017-01-23 | Code |
| 32 | Transformer | 40.5 | No | Memory-Efficient Adaptive Optimization | 2019-01-30 | Code |
| 33 | ConvS2S | 40.46 | No | Convolutional Sequence to Sequence Learning | 2017-05-08 | Code |
| 34 | ResMLP-6 | 40.3 | No | ResMLP: Feedforward networks for image classific... | 2021-05-07 | Code |
| 35 | TransformerBase + AutoDropout | 40 | No | AutoDropout: Learning Dropout Patterns to Regula... | 2021-01-05 | Code |
| 36 | GNMT+RL | 39.9 | No | Google's Neural Machine Translation System: Brid... | 2016-09-26 | Code |
| 37 | Lite Transformer | 39.6 | No | Lite Transformer with Long-Short Range Attention | 2020-04-24 | Code |
| 38 | Deep-Att + PosUnk | 39.2 | No | Deep Recurrent Models with Fast-Forward Connecti... | 2016-06-14 | Code |
| 39 | Rfa-Gate-arccos | 39.2 | No | Random Feature Attention | 2021-03-03 | - |
| 40 | Transformer Base | 38.1 | No | Attention Is All You Need | 2017-06-12 | Code |
| 41 | LSTM6 + PosUnk | 37.5 | No | Addressing the Rare Word Problem in Neural Machi... | 2014-10-30 | Code |
| 42 | PBMT | 37 | No | - | - | - |
| 43 | SMT+LSTM5 | 36.5 | No | Sequence to Sequence Learning with Neural Networks | 2014-09-10 | Code |
| 44 | RNN-search50* | 36.2 | No | Neural Machine Translation by Jointly Learning t... | 2014-09-01 | Code |
| 45 | Deep-Att | 35.9 | No | Deep Recurrent Models with Fast-Forward Connecti... | 2016-06-14 | Code |
| 46 | Deep Convolutional Encoder; single-layer decoder | 35.7 | No | A Convolutional Encoder Model for Neural Machine... | 2016-11-07 | Code |
| 47 | LSTM | 34.8 | No | Sequence to Sequence Learning with Neural Networks | 2014-09-10 | Code |
| 48 | CSLM + RNN + WP | 34.54 | No | Learning Phrase Representations using RNN Encode... | 2014-06-03 | Code |
| 49 | FLAN 137B (zero-shot) | 33.9 | No | Finetuned Language Models Are Zero-Shot Learners | 2021-09-03 | Code |
| 50 | FLAN 137B (few-shot, k=9) | 33.8 | No | Finetuned Language Models Are Zero-Shot Learners | 2021-09-03 | Code |
| 51 | Regularized LSTM | 29.03 | No | Recurrent Neural Network Regularization | 2014-09-08 | Code |
| 52 | Unsupervised PBSMT | 28.11 | No | Phrase-Based & Neural Unsupervised Machine Trans... | 2018-04-20 | Code |
| 53 | PBSMT + NMT | 27.6 | No | Phrase-Based & Neural Unsupervised Machine Trans... | 2018-04-20 | Code |
| 54 | GRU+Attention | 26.4 | No | Can Active Memory Replace Attention? | 2016-10-27 | Code |
| 55 | SMT + iterative backtranslation (unsupervised) | 26.22 | No | Unsupervised Statistical Machine Translation | 2018-09-04 | Code |
| 56 | Unsupervised NMT + Transformer | 25.14 | No | Phrase-Based & Neural Unsupervised Machine Trans... | 2018-04-20 | Code |
| 57 | Unsupervised attentional encoder-decoder + BPE | 14.36 | No | Unsupervised Neural Machine Translation | 2017-10-30 | Code |