Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


English Intermediate-Task Training Improves Zero-Shot Cross-Lingual Transfer Too

Jason Phang, Iacer Calixto, Phu Mon Htut, Yada Pruksachatkun, Haokun Liu, Clara Vania, Katharina Kann, Samuel R. Bowman

2020-05-26 · Asian Chapter of the Association for Computational Linguistics 2020
Tasks: Question Answering · Sentence Retrieval · Cross-Lingual Transfer · XLM-R · Retrieval · Zero-Shot Cross-Lingual Transfer
Paper · PDF

Abstract

Intermediate-task training---fine-tuning a pretrained model on an intermediate task before fine-tuning again on the target task---often improves model performance substantially on language understanding tasks in monolingual English settings. We investigate whether English intermediate-task training is still helpful on non-English target tasks. Using nine intermediate language-understanding tasks, we evaluate intermediate-task transfer in a zero-shot cross-lingual setting on the XTREME benchmark. We see large improvements from intermediate training on the BUCC and Tatoeba sentence retrieval tasks and moderate improvements on question-answering target tasks. MNLI, SQuAD and HellaSwag achieve the best overall results as intermediate tasks, while multi-task intermediate training offers small additional improvements. Using our best intermediate-task models for each target task, we obtain a 5.4 point improvement over XLM-R Large on the XTREME benchmark, setting the state of the art as of June 2020. We also investigate continuing multilingual MLM during intermediate-task training and using machine-translated intermediate-task data, but neither consistently outperforms simply performing English intermediate-task training.
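The recipe the abstract describes has two fine-tuning stages on one shared encoder: first on an English intermediate task (e.g. MNLI), then, with a fresh task head, on the English target task, before evaluating zero-shot on non-English target data. The following is a minimal, self-contained sketch of that pattern, not the paper's code: a tiny NumPy "encoder" and synthetic data stand in for XLM-R and the real tasks, and the two task rules are purely hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class SharedEncoderModel:
    """A linear 'encoder' shared across tasks plus a per-task linear head."""
    def __init__(self, dim_in, dim_enc):
        self.dim_enc = dim_enc
        self.W = rng.normal(scale=0.1, size=(dim_in, dim_enc))  # shared encoder
        self.head = None

    def new_head(self):
        # Fresh task-specific head; the encoder weights W carry over.
        self.head = rng.normal(scale=0.1, size=self.dim_enc)

    def forward(self, X):
        H = np.tanh(X @ self.W)            # encoder representation
        return sigmoid(H @ self.head), H   # task probability, features

    def fit(self, X, y, lr=0.5, steps=500):
        # Joint gradient descent on encoder and head (binary cross-entropy).
        for _ in range(steps):
            p, H = self.forward(X)
            err = p - y                        # dL/dz for BCE + sigmoid
            grad_head = H.T @ err / len(y)
            dH = np.outer(err, self.head) * (1 - H ** 2)  # tanh' = 1 - tanh^2
            grad_W = X.T @ dH / len(y)
            self.head -= lr * grad_head
            self.W -= lr * grad_W

    def accuracy(self, X, y):
        p, _ = self.forward(X)
        return float(np.mean((p > 0.5) == y))

def make_data(n, dim, rule):
    X = rng.normal(size=(n, dim))
    return X, (rule(X) > 0).astype(float)

dim = 8
inter_rule = lambda X: X[:, 0]             # hypothetical intermediate task
target_rule = lambda X: X[:, 0] + X[:, 1]  # hypothetical (related) target task

model = SharedEncoderModel(dim, 4)

# Stage 1: intermediate-task fine-tuning (MNLI-like role in the paper).
Xi, yi = make_data(400, dim, inter_rule)
model.new_head()
model.fit(Xi, yi)

# Stage 2: target-task fine-tuning on "English" data, with a fresh head.
Xt, yt = make_data(400, dim, target_rule)
model.new_head()
model.fit(Xt, yt)

# Zero-shot evaluation on held-out target data, standing in for the
# non-English test sets (the real transfer relies on XLM-R's shared
# multilingual representation, which this toy setup only gestures at).
Xe, ye = make_data(400, dim, target_rule)
zero_shot_acc = model.accuracy(Xe, ye)
```

The key design point the sketch mirrors is that only the task head is reset between stages; the shared encoder keeps whatever the intermediate task taught it, which is where the reported gains come from.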

Results

Task                    Dataset  Metric                        Value  Model
Cross-Lingual Transfer  XTREME   Avg                           73.5   X-STILTs
Cross-Lingual Transfer  XTREME   Question Answering            67.2   X-STILTs
Cross-Lingual Transfer  XTREME   Sentence Retrieval            76.5   X-STILTs
Cross-Lingual Transfer  XTREME   Sentence-pair Classification  83.9   X-STILTs
Cross-Lingual Transfer  XTREME   Structured Prediction         69.4   X-STILTs

Related Papers

- From Roots to Rewards: Dynamic Tree Reasoning with RL (2025-07-17)
- Enter the Mind Palace: Reasoning and Planning for Long-term Active Embodied Question Answering (2025-07-17)
- Vision-and-Language Training Helps Deploy Taxonomic Knowledge but Does Not Fundamentally Alter It (2025-07-17)
- City-VLM: Towards Multidomain Perception Scene Understanding via Multimodal Incomplete Learning (2025-07-17)
- Enhancing Cross-task Transfer of Large Language Models via Activation Steering (2025-07-17)
- HapticCap: A Multimodal Dataset and Task for Understanding User Experience of Vibration Haptic Signals (2025-07-17)
- A Survey of Context Engineering for Large Language Models (2025-07-17)
- MCoT-RE: Multi-Faceted Chain-of-Thought and Re-Ranking for Training-Free Zero-Shot Composed Image Retrieval (2025-07-17)