TasksSotADatasetsPapersMethodsSubmitAbout
Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Explore

Notable BenchmarksAll SotADatasetsPapersMethods

Community

Submit ResultsAbout

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Papers/Technical Report: Impact of Position Bias on Language Mode...

Technical Report: Impact of Position Bias on Language Models in Token Classification

Mehdi Ben Amor, Michael Granitzer, Jelena Mitrović

2023-04-26Token ClassificationPOSPart-Of-Speech Taggingnamed-entity-recognitionNamed Entity RecognitionNERNamed Entity Recognition (NER)POS Tagging
PaperPDFCode(official)Code(official)

Abstract

Language Models (LMs) have shown state-of-the-art performance in Natural Language Processing (NLP) tasks. Downstream tasks such as Named Entity Recognition (NER) or Part-of-Speech (POS) tagging are known to suffer from data imbalance issues, particularly regarding the ratio of positive to negative examples and class disparities. This paper investigates an often-overlooked issue of encoder models, specifically the position bias of positive examples in token classification tasks. For completeness, we also include decoders in the evaluation. We evaluate the impact of position bias using different position embedding techniques, focusing on BERT with Absolute Position Embedding (APE), Relative Position Embedding (RPE), and Rotary Position Embedding (RoPE). Therefore, we conduct an in-depth evaluation of the impact of position bias on the performance of LMs when fine-tuned on token classification benchmarks. Our study includes CoNLL03 and OntoNote5.0 for NER, English Tree Bank UD\_en, and TweeBank for POS tagging. We propose an evaluation approach to investigate position bias in transformer models. We show that LMs can suffer from this bias with an average drop ranging from 3\% to 9\% in their performance. To mitigate this effect, we propose two methods: Random Position Shifting and Context Perturbation, that we apply on batches during the training process. The results show an improvement of $\approx$ 2\% in the performance of the model on CoNLL03, UD\_en, and TweeBank.

Related Papers

Flippi: End To End GenAI Assistant for E-Commerce2025-07-08Selecting and Merging: Towards Adaptable and Scalable Named Entity Recognition with Large Language Models2025-06-28LingoLoop Attack: Trapping MLLMs via Linguistic Context and State Entrapment into Endless Loops2025-06-17Hybrid Meta-learners for Estimating Heterogeneous Treatment Effects2025-06-16Improving Named Entity Transcription with Contextual LLM-based Revision2025-06-12Step-by-step Instructions and a Simple Tabular Output Format Improve the Dependency Parsing Accuracy of LLMs2025-06-11Better Semi-supervised Learning for Multi-domain ASR Through Incremental Retraining and Data Filtering2025-06-05Dissecting Bias in LLMs: A Mechanistic Interpretability Perspective2025-06-05