TasksSotADatasetsPapersMethodsSubmitAbout
Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Explore

Notable BenchmarksAll SotADatasetsPapersMethods

Community

Submit ResultsAbout

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Papers/XNLI: Evaluating Cross-lingual Sentence Representations

XNLI: Evaluating Cross-lingual Sentence Representations

Alexis Conneau, Guillaume Lample, Ruty Rinott, Adina Williams, Samuel R. Bowman, Holger Schwenk, Veselin Stoyanov

2018-09-13EMNLP 2018 10Machine TranslationNatural Language InferenceTranslationCross-Lingual Natural Language Inference
PaperPDFCodeCodeCodeCodeCodeCodeCodeCodeCode

Abstract

State-of-the-art natural language processing systems rely on supervision in the form of annotated data to learn competent models. These models are generally trained on data in a single language (usually English), and cannot be directly used beyond that language. Since collecting data in every language is not realistic, there has been a growing interest in cross-lingual language understanding (XLU) and low-resource cross-language transfer. In this work, we construct an evaluation set for XLU by extending the development and test sets of the Multi-Genre Natural Language Inference Corpus (MultiNLI) to 15 languages, including low-resource languages such as Swahili and Urdu. We hope that our dataset, dubbed XNLI, will catalyze research in cross-lingual sentence understanding by providing an informative standard evaluation task. In addition, we provide several baselines for multilingual sentence understanding, including two based on machine translation systems, and two that use parallel data to train aligned multilingual bag-of-words and LSTM encoders. We find that XNLI represents a practical and challenging evaluation suite, and that directly translating the test data yields the best performance among available baselines.

Results

TaskDatasetMetricValueModel
Natural Language InferenceXNLI FrenchAccuracy68.3BiLSTM-max

Related Papers

A Translation of Probabilistic Event Calculus into Markov Decision Processes2025-07-17LRCTI: A Large Language Model-Based Framework for Multi-Step Evidence Retrieval and Reasoning in Cyber Threat Intelligence Credibility Verification2025-07-15Function-to-Style Guidance of LLMs for Code Translation2025-07-15Speak2Sign3D: A Multi-modal Pipeline for English Speech to American Sign Language Animation2025-07-09Pun Intended: Multi-Agent Translation of Wordplay with Contrastive Learning and Phonetic-Semantic Embeddings2025-07-09DS@GT at CheckThat! 2025: Evaluating Context and Tokenization Strategies for Numerical Fact Verification2025-07-08Unconditional Diffusion for Generative Sequential Recommendation2025-07-08GRAFT: A Graph-based Flow-aware Agentic Framework for Document-level Machine Translation2025-07-04