
Frustratingly Easy Label Projection for Cross-lingual Transfer

Yang Chen, Chao Jiang, Alan Ritter, Wei Xu

2022-11-28 | Question Answering, Cross-Lingual Transfer, Event Extraction, Translation, NER, Cross-Lingual NER, Word Alignment

Abstract

Translating training data into many languages has emerged as a practical solution for improving cross-lingual transfer. For tasks that involve span-level annotations, such as information extraction or question answering, an additional label projection step is required to map annotated spans onto the translated texts. Recently, a few efforts have utilized a simple mark-then-translate method to jointly perform translation and projection by inserting special markers around the labeled spans in the original sentence. However, as far as we are aware, no empirical analysis has been conducted on how this approach compares to traditional annotation projection based on word alignment. In this paper, we present an extensive empirical study across 57 languages and three tasks (QA, NER, and Event Extraction) to evaluate the effectiveness and limitations of both methods, filling an important gap in the literature. Experimental results show that our optimized version of mark-then-translate, which we call EasyProject, is easily applied to many languages and works surprisingly well, outperforming the more complex word alignment-based methods. We analyze several key factors that affect the end-task performance, and show EasyProject works well because it can accurately preserve label span boundaries after translation. We will publicly release all our code and data.
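The mark-then-translate idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' EasyProject implementation. The square-bracket markers, the function names, the `toy_translate` stand-in for a real machine translation system, and the rule of assigning labels by order of appearance are all assumptions made for illustration. A real translation system may reorder or drop markers, and this sketch does not handle that case or overlapping spans.

```python
import re
from typing import Callable, List, Tuple

# (start, end, label) with character offsets, end exclusive.
Span = Tuple[int, int, str]


def insert_markers(text: str, spans: List[Span]) -> str:
    """Wrap each labeled span in square brackets (marker choice is an assumption)."""
    pieces = []
    prev = 0
    for start, end, _ in sorted(spans):
        pieces.append(text[prev:start])
        pieces.append("[" + text[start:end] + "]")
        prev = end
    pieces.append(text[prev:])
    return "".join(pieces)


def extract_spans(marked: str, labels: List[str]) -> Tuple[str, List[Span]]:
    """Strip markers from a (translated) string and recover span offsets.

    Labels are paired with bracketed spans by order of appearance, which is a
    simplifying assumption: translation can change the order of spans.
    """
    pieces = []
    spans = []
    pos = 0   # length of the marker-free text built so far
    prev = 0  # position in `marked` just after the previous match
    for i, match in enumerate(re.finditer(r"\[(.*?)\]", marked)):
        before = marked[prev:match.start()]
        pieces.append(before)
        pos += len(before)
        inner = match.group(1)
        if i < len(labels):
            spans.append((pos, pos + len(inner), labels[i]))
        pieces.append(inner)
        pos += len(inner)
        prev = match.end()
    pieces.append(marked[prev:])
    return "".join(pieces), spans


def mark_then_translate(
    text: str, spans: List[Span], translate: Callable[[str], str]
) -> Tuple[str, List[Span]]:
    """Insert markers, translate the marked text, then project the labels."""
    ordered = sorted(spans)
    marked = insert_markers(text, ordered)
    translated = translate(marked)
    return extract_spans(translated, [label for _, _, label in ordered])


def toy_translate(marked_text: str) -> str:
    """Hypothetical stand-in for an MT system; it only rewrites one phrase."""
    return marked_text.replace("works in", "trabaja en")


source = "Alan Ritter works in Atlanta"
source_spans = [(0, 11, "PER"), (21, 28, "LOC")]
target, target_spans = mark_then_translate(source, source_spans, toy_translate)
print(target)
print(target_spans)
```

In this sketch the translation and the projection happen in a single pass, which is the property the abstract contrasts with alignment-based projection, where spans are mapped after translation using a separate word aligner.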

Results

Task | Dataset | Metric | Value | Model
Cross-Lingual | MasakhaNER2.0 | Akan/Twi | 65.3 | EasyProject
Cross-Lingual | MasakhaNER2.0 | Bambara | 45.8 | EasyProject
Cross-Lingual | MasakhaNER2.0 | Chichewa | 75.3 | EasyProject
Cross-Lingual | MasakhaNER2.0 | Ewe | 78.5 | EasyProject
Cross-Lingual | MasakhaNER2.0 | Fon | 61.4 | EasyProject
Cross-Lingual | MasakhaNER2.0 | Hausa | 72.2 | EasyProject
Cross-Lingual | MasakhaNER2.0 | Igbo | 65.6 | EasyProject
Cross-Lingual | MasakhaNER2.0 | Kinyarwanda | 71 | EasyProject
Cross-Lingual | MasakhaNER2.0 | Kiswahili | 83.6 | EasyProject
Cross-Lingual | MasakhaNER2.0 | Luganda | 76.7 | EasyProject
Cross-Lingual | MasakhaNER2.0 | Luo | 50.2 | EasyProject
Cross-Lingual | MasakhaNER2.0 | Mossi | 53.1 | EasyProject
Cross-Lingual | MasakhaNER2.0 | Setswana | 74 | EasyProject
Cross-Lingual | MasakhaNER2.0 | Wolof | 58.9 | EasyProject
Cross-Lingual | MasakhaNER2.0 | Yoruba | 36.8 | EasyProject
Cross-Lingual | MasakhaNER2.0 | chiShona | 55.9 | EasyProject
Cross-Lingual | MasakhaNER2.0 | isiXhosa | 71.1 | EasyProject
Cross-Lingual | MasakhaNER2.0 | isiZulu | 73 | EasyProject
Cross-Lingual Transfer | MasakhaNER2.0 | Akan/Twi | 65.3 | EasyProject
Cross-Lingual Transfer | MasakhaNER2.0 | Bambara | 45.8 | EasyProject
Cross-Lingual Transfer | MasakhaNER2.0 | Chichewa | 75.3 | EasyProject
Cross-Lingual Transfer | MasakhaNER2.0 | Ewe | 78.5 | EasyProject
Cross-Lingual Transfer | MasakhaNER2.0 | Fon | 61.4 | EasyProject
Cross-Lingual Transfer | MasakhaNER2.0 | Hausa | 72.2 | EasyProject
Cross-Lingual Transfer | MasakhaNER2.0 | Igbo | 65.6 | EasyProject
Cross-Lingual Transfer | MasakhaNER2.0 | Kinyarwanda | 71 | EasyProject
Cross-Lingual Transfer | MasakhaNER2.0 | Kiswahili | 83.6 | EasyProject
Cross-Lingual Transfer | MasakhaNER2.0 | Luganda | 76.7 | EasyProject
Cross-Lingual Transfer | MasakhaNER2.0 | Luo | 50.2 | EasyProject
Cross-Lingual Transfer | MasakhaNER2.0 | Mossi | 53.1 | EasyProject
Cross-Lingual Transfer | MasakhaNER2.0 | Setswana | 74 | EasyProject
Cross-Lingual Transfer | MasakhaNER2.0 | Wolof | 58.9 | EasyProject
Cross-Lingual Transfer | MasakhaNER2.0 | Yoruba | 36.8 | EasyProject
Cross-Lingual Transfer | MasakhaNER2.0 | chiShona | 55.9 | EasyProject
Cross-Lingual Transfer | MasakhaNER2.0 | isiXhosa | 71.1 | EasyProject
Cross-Lingual Transfer | MasakhaNER2.0 | isiZulu | 73 | EasyProject
