
Multi-Source Cross-Lingual Model Transfer: Learning What to Share

Xilun Chen, Ahmed Hassan Awadallah, Hany Hassan, Wei Wang, Claire Cardie

Published: 2018-10-08 · ACL 2019
Tasks: Text Classification · Cross-Lingual Transfer · Transfer Learning · Cross-Lingual NER
Links: Paper · PDF · Code (official)

Abstract

Modern NLP applications have enjoyed a great boost utilizing neural network models. Such deep neural models, however, are not applicable to most human languages due to the lack of annotated training data for various NLP tasks. Cross-lingual transfer learning (CLTL) is a viable method for building NLP models for a low-resource target language by leveraging labeled data from other (source) languages. In this work, we focus on the multilingual transfer setting where training data in multiple source languages is leveraged to further boost target language performance. Unlike most existing methods that rely only on language-invariant features for CLTL, our approach coherently utilizes both language-invariant and language-specific features at the instance level. Our model leverages adversarial networks to learn language-invariant features, and mixture-of-experts models to dynamically exploit the similarity between the target language and each individual source language. This enables our model to effectively learn what to share between the various languages in the multilingual setup. Moreover, when coupled with unsupervised multilingual embeddings, our model can operate in a zero-resource setting where neither target language training data nor cross-lingual resources are available. Our model achieves significant performance gains over prior art, as shown in an extensive set of experiments over multiple text classification and sequence tagging tasks including a large-scale industry dataset.
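To make the two ingredients of the abstract concrete, the sketch below (not the authors' released code) shows how a shared encoder can be made language-invariant via a language discriminator behind a gradient-reversal layer, while a private mixture-of-experts branch weighs one expert per source language for each instance. All module names, dimensions, and the toy training step are illustrative assumptions.

```python
# Hedged sketch of adversarial shared features + per-instance mixture-of-experts.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GradReverse(torch.autograd.Function):
    """Identity in the forward pass, negated gradient in the backward pass."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None


class MultiSourceTransferSketch(nn.Module):
    def __init__(self, emb_dim=300, hidden=128, n_sources=3, n_classes=2, lambd=0.1):
        super().__init__()
        self.lambd = lambd
        # Shared (language-invariant) encoder and one private expert per source language.
        self.shared = nn.Sequential(nn.Linear(emb_dim, hidden), nn.ReLU())
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(emb_dim, hidden), nn.ReLU()) for _ in range(n_sources)
        )
        self.gate = nn.Linear(emb_dim, n_sources)            # instance-level expert weights
        self.discriminator = nn.Linear(hidden, n_sources)    # predicts the source language
        self.classifier = nn.Linear(2 * hidden, n_classes)   # task head on [shared; private]

    def forward(self, x):
        shared = self.shared(x)
        # Adversarial branch: reversed gradients push `shared` toward language invariance.
        lang_logits = self.discriminator(GradReverse.apply(shared, self.lambd))
        # Mixture-of-experts branch: soft, per-instance combination of source experts.
        gate_w = F.softmax(self.gate(x), dim=-1)                        # (B, n_sources)
        expert_out = torch.stack([e(x) for e in self.experts], dim=1)   # (B, n_sources, H)
        private = (gate_w.unsqueeze(-1) * expert_out).sum(dim=1)        # (B, H)
        task_logits = self.classifier(torch.cat([shared, private], dim=-1))
        return task_logits, lang_logits


if __name__ == "__main__":
    # Toy batch of multilingual embedding inputs with task and language labels.
    model = MultiSourceTransferSketch()
    x = torch.randn(8, 300)
    y_task = torch.randint(0, 2, (8,))
    y_lang = torch.randint(0, 3, (8,))
    task_logits, lang_logits = model(x)
    # Joint objective: task loss plus language-discrimination loss; the gradient-reversal
    # layer turns the latter into an adversarial signal for the shared encoder.
    loss = F.cross_entropy(task_logits, y_task) + F.cross_entropy(lang_logits, y_lang)
    loss.backward()
    print(loss.item())
```

In the zero-resource setting described in the abstract, the inputs here would come from unsupervised multilingual embeddings, so no target-language labels or cross-lingual resources are needed at training time.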

Results

Task                   | Dataset       | Metric | Value | Model
Cross-Lingual          | CoNLL Dutch   | F1     | 72.4  | MAN-MoE+CharCNN+UMWE
Cross-Lingual          | CoNLL German  | F1     | 56.0  | MAN-MoE+CharCNN+UMWE
Cross-Lingual          | CoNLL Spanish | F1     | 73.5  | MAN-MoE+CharCNN+UMWE
Cross-Lingual Transfer | CoNLL Dutch   | F1     | 72.4  | MAN-MoE+CharCNN+UMWE
Cross-Lingual Transfer | CoNLL German  | F1     | 56.0  | MAN-MoE+CharCNN+UMWE
Cross-Lingual Transfer | CoNLL Spanish | F1     | 73.5  | MAN-MoE+CharCNN+UMWE

Related Papers

RaMen: Multi-Strategy Multi-Modal Learning for Bundle Construction (2025-07-18)
Making Language Model a Hierarchical Classifier and Generator (2025-07-17)
Enhancing Cross-task Transfer of Large Language Models via Activation Steering (2025-07-17)
Disentangling coincident cell events using deep transfer learning and compressive sensing (2025-07-17)
Best Practices for Large-Scale, Pixel-Wise Crop Mapping and Transfer Learning Workflows (2025-07-16)
HanjaBridge: Resolving Semantic Ambiguity in Korean LLMs via Hanja-Augmented Pre-Training (2025-07-15)
Robust-Multi-Task Gradient Boosting (2025-07-15)
Calibrated and Robust Foundation Models for Vision-Language and Medical Image Tasks Under Distribution Shift (2025-07-12)