Deep Reinforcement Learning for Mention-Ranking Coreference Models

Kevin Clark, Christopher D. Manning

2016-09-27EMNLP 2016 11coreference-resolution Reinforcement Learning Coreference Resolution reinforcement-learning

Abstract

Coreference resolution systems are typically trained with heuristic loss functions that require careful tuning. In this paper we instead apply reinforcement learning to directly optimize a neural mention-ranking model for coreference evaluation metrics. We experiment with two approaches: the REINFORCE policy gradient algorithm and a reward-rescaled max-margin objective. We find the latter to be more effective, resulting in significant improvements over the current state-of-the-art on the English and Chinese portions of the CoNLL 2012 Shared Task.

Results

Task	Dataset	Metric	Value	Model
Coreference Resolution	OntoNotes	F1	65.73	Reward Rescaling

Related Papers

CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning2025-07-18 VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning2025-07-17 Spectral Bellman Method: Unifying Representation and Exploration in RL2025-07-17 Aligning Humans and Robots via Reinforcement Learning from Implicit Human Feedback2025-07-17 VAR-MATH: Probing True Mathematical Reasoning in Large Language Models via Symbolic Multi-Instance Benchmarks2025-07-17 QuestA: Expanding Reasoning Capacity in LLMs via Question Augmentation2025-07-17 Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities2025-07-17 Autonomous Resource Management in Microservice Systems via Reinforcement Learning2025-07-17