


Beto, Bentz, Becas: The Surprising Cross-Lingual Effectiveness of BERT

Shijie Wu, Mark Dredze

2019-04-19 | IJCNLP 2019 11
Tasks: POS, Natural Language Inference, Cross-Lingual Transfer, NER, Cross-Lingual NER, Document Classification, POS Tagging, Dependency Parsing, Zero-Shot Cross-Lingual Transfer

Abstract

Pretrained contextual representation models (Peters et al., 2018; Devlin et al., 2018) have pushed forward the state of the art on many NLP tasks. A new release of BERT (Devlin, 2018) includes a model simultaneously pretrained on 104 languages with impressive performance for zero-shot cross-lingual transfer on a natural language inference task. This paper explores the broader cross-lingual potential of mBERT (multilingual BERT) as a zero-shot language transfer model on 5 NLP tasks covering a total of 39 languages from various language families: NLI, document classification, NER, POS tagging, and dependency parsing. We compare mBERT with the best published methods for zero-shot cross-lingual transfer and find mBERT competitive on each task. Additionally, we investigate the most effective strategy for utilizing mBERT in this manner, determine to what extent mBERT generalizes away from language-specific features, and measure factors that influence cross-lingual transfer.

Results

Task | Dataset | Metric | Value | Model
Cross-Lingual | CoNLL Dutch | F1 | 77.57 | mBERT
Cross-Lingual | CoNLL German | F1 | 69.56 | mBERT
Cross-Lingual | CoNLL Spanish | F1 | 74.96 | mBERT
Cross-Lingual Transfer | CoNLL Dutch | F1 | 77.57 | mBERT
Cross-Lingual Transfer | CoNLL German | F1 | 69.56 | mBERT
Cross-Lingual Transfer | CoNLL Spanish | F1 | 74.96 | mBERT
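
The F1 values above are reported on CoNLL datasets, where NER is conventionally scored at the entity level: a predicted entity counts only if both its boundaries and its type match the gold annotation. The following is a minimal sketch of that metric, assuming BIO-style tags. The function names and the toy tag sequences are illustrative and do not come from the paper or its released code, and stray `I-` tags are simply ignored here, which is stricter than some official scorers.

```python
def extract_spans(tags):
    """Return the set of (start, end, type) entity spans in a BIO tag sequence."""
    spans = set()
    start, etype = None, None
    # A trailing "O" sentinel closes any span that runs to the end of the sequence.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, etype))  # end index is exclusive
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def span_f1(gold_tags, pred_tags):
    """Entity-level F1: a prediction is correct only on an exact span and type match."""
    gold, pred = extract_spans(gold_tags), extract_spans(pred_tags)
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example (made-up tags): one of two entities is recovered with the right type.
gold = ["B-PER", "I-PER", "O", "B-LOC"]
pred = ["B-PER", "I-PER", "O", "B-ORG"]
score = span_f1(gold, pred)
```

In this toy case precision and recall are both 1/2, so `score` is 0.5. The table reports the same kind of quantity scaled to a percentage.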
