Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

The University of Sydney's Machine Translation System for WMT19

Liang Ding, DaCheng Tao

2019-06-30 | WS 2019 8 | Machine Translation, Reranking, Data Augmentation, Translation

Abstract

This paper describes the University of Sydney's submission to the WMT 2019 shared news translation task. We participated in the Finnish$\rightarrow$English direction and achieved the best BLEU score (33.0) among all participants. Our system is based on self-attentional Transformer networks, into which we integrated the most recent effective strategies from academic research (e.g., BPE, back translation, multi-feature data selection, data augmentation, greedy model ensemble, reranking, ConMBR system combination, and post-processing). Furthermore, we propose a novel augmentation method, $Cycle Translation$, and a data mixture strategy, $Big$/$Small$ parallel construction, to fully exploit the synthetic corpus. Extensive experiments show that adding these techniques yields continuous improvements in BLEU, and the best result outperforms the baseline (a Transformer ensemble model trained on the original parallel corpus) by approximately 5.3 BLEU points, achieving state-of-the-art performance.

Results

| Task | Dataset | Metric | Value | Model |
| --- | --- | --- | --- | --- |
| Machine Translation | WMT2019 Finnish-English | BLEU | 34.1 | CT+B/S construction |
| Machine Translation | WMT2016 Finnish-English | BLEU | 32.4 | CT+B/S construction |
| Machine Translation | WMT2017 Finnish-English | BLEU | 35.5 | CT+B/S construction |
| Machine Translation | WMT 2018 Finnish-English | BLEU | 26.5 | CT+B/S construction |
