
InfoXLM: An Information-Theoretic Framework for Cross-Lingual Language Model Pre-Training

Zewen Chi, Li Dong, Furu Wei, Nan Yang, Saksham Singhal, Wenhui Wang, Xia Song, Xian-Ling Mao, He-Yan Huang, Ming Zhou

2020-07-15 · NAACL 2021 · Tasks: Cross-Lingual Transfer, Contrastive Learning, Language Modelling

Abstract

In this work, we present an information-theoretic framework that formulates cross-lingual language model pre-training as maximizing mutual information between multilingual multi-granularity texts. The unified view helps us to better understand the existing methods for learning cross-lingual representations. More importantly, inspired by the framework, we propose a new pre-training task based on contrastive learning. Specifically, we regard a bilingual sentence pair as two views of the same meaning and encourage their encoded representations to be more similar than the negative examples. By leveraging both monolingual and parallel corpora, we jointly train the pretext tasks to improve the cross-lingual transferability of pre-trained models. Experimental results on several benchmarks show that our approach achieves considerably better performance. The code and pre-trained models are available at https://aka.ms/infoxlm.
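The contrastive pre-training task sketched in the abstract, treating the two sides of a bilingual sentence pair as positive views and other sentences in the batch as negatives, follows the standard InfoNCE pattern. Below is a minimal PyTorch sketch of such a loss with in-batch negatives; the function name, temperature value, and embedding source are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of an InfoNCE-style cross-lingual contrastive loss with
# in-batch negatives (assumption for illustration, not the InfoXLM code).
import torch
import torch.nn.functional as F

def info_nce_loss(src_emb: torch.Tensor, tgt_emb: torch.Tensor,
                  temperature: float = 0.1) -> torch.Tensor:
    """src_emb, tgt_emb: (batch, dim) encodings of the two sides of
    bilingual sentence pairs; row i of each tensor is a translation pair."""
    # Normalize so the dot product becomes cosine similarity.
    src = F.normalize(src_emb, dim=-1)
    tgt = F.normalize(tgt_emb, dim=-1)
    # Similarity of every source sentence to every target sentence.
    logits = src @ tgt.t() / temperature            # (batch, batch)
    # The matching translation sits on the diagonal; every other sentence
    # in the batch serves as a negative example.
    labels = torch.arange(src.size(0), device=src.device)
    return F.cross_entropy(logits, labels)

if __name__ == "__main__":
    # In practice the embeddings would come from a shared multilingual
    # encoder (e.g. a sentence-level representation of an XLM-style model).
    src = torch.randn(8, 768)   # e.g. English sentences
    tgt = torch.randn(8, 768)   # their translations
    print(info_nce_loss(src, tgt).item())
```

Minimizing this cross-entropy pushes each sentence's representation closer to its translation than to the in-batch negatives, which is one common way to realize the mutual-information maximization view described above.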

Results

Task | Dataset | Metric | Value | Model
Cross-Lingual Transfer | XTREME | Avg | 80.7 | T-ULRv2 + StableTune
Cross-Lingual Transfer | XTREME | Question Answering | 72.9 | T-ULRv2 + StableTune
Cross-Lingual Transfer | XTREME | Sentence Retrieval | 89.3 | T-ULRv2 + StableTune
Cross-Lingual Transfer | XTREME | Sentence-pair Classification | 88.8 | T-ULRv2 + StableTune
Cross-Lingual Transfer | XTREME | Structured Prediction | 75.4 | T-ULRv2 + StableTune

Related Papers

Visual-Language Model Knowledge Distillation Method for Image Quality Assessment (2025-07-21)
Enhancing Cross-task Transfer of Large Language Models via Activation Steering (2025-07-17)
SemCSE: Semantic Contrastive Sentence Embeddings Using LLM-Generated Summaries For Scientific Abstracts (2025-07-17)
HapticCap: A Multimodal Dataset and Task for Understanding User Experience of Vibration Haptic Signals (2025-07-17)
Overview of the TalentCLEF 2025: Skill and Job Title Intelligence for Human Capital Management (2025-07-17)
SGCL: Unifying Self-Supervised and Supervised Learning for Graph Recommendation (2025-07-17)
Making Language Model a Hierarchical Classifier and Generator (2025-07-17)
VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning (2025-07-17)