SemBleu: A Robust Metric for AMR Parsing Evaluation

Linfeng Song, Daniel Gildea

2019-05-26 | ACL 2019 7 | Graph Matching, AMR Parsing

Abstract

Evaluating AMR parsing accuracy involves comparing pairs of AMR graphs. The major evaluation metric, SMATCH (Cai and Knight, 2013), searches for one-to-one mappings between the nodes of two AMRs with a greedy hill-climbing algorithm, which leads to search errors. We propose SEMBLEU, a robust metric that extends BLEU (Papineni et al., 2002) to AMRs. It does not suffer from search errors and considers non-local correspondences in addition to local ones. SEMBLEU is fully content-driven and punishes situations where a system's output does not preserve most information from the input. Preliminary experiments on both sentence and corpus levels show that SEMBLEU has slightly higher consistency with human judgments than SMATCH. Our code is available at http://github.com/freesunshine0316/sembleu.
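To make the idea of extending BLEU to graphs concrete, here is a minimal sketch, not the authors' implementation (that lives in the repository linked above). It assumes an AMR is given as a dictionary of node labels plus a list of labeled edges, that an "n-gram" is a directed path of n node labels with the relation labels between them, and that the brevity penalty compares graph sizes counted as nodes plus edges. The names `extract_ngrams` and `sembleu_sketch` are hypothetical, and no smoothing is applied, so any n-gram order with zero matches drives the score to 0.

```python
import math
from collections import Counter


def extract_ngrams(nodes, edges, max_n=3):
    """Collect path n-grams of order 1..max_n from a labeled graph.

    nodes: dict mapping node id -> concept label
    edges: list of (source id, relation label, target id)
    Returns a list of Counters, one per n-gram order.
    """
    outgoing = {}
    for src, rel, tgt in edges:
        outgoing.setdefault(src, []).append((rel, tgt))

    counts = [Counter() for _ in range(max_n)]

    def walk(node_id, path, order):
        counts[order - 1][path] += 1
        if order == max_n:
            return  # bounded depth, so cycles cannot loop forever
        for rel, tgt in outgoing.get(node_id, []):
            walk(tgt, path + (rel, nodes[tgt]), order + 1)

    # Every node is a possible starting point, so no node alignment is needed.
    for node_id, label in nodes.items():
        walk(node_id, (label,), 1)
    return counts


def sembleu_sketch(hyp, ref, max_n=3):
    """BLEU-style score between two graphs given as (nodes, edges) pairs."""
    hyp_nodes, hyp_edges = hyp
    ref_nodes, ref_edges = ref
    hyp_size = len(hyp_nodes) + len(hyp_edges)
    ref_size = len(ref_nodes) + len(ref_edges)
    if hyp_size == 0:
        return 0.0

    hyp_counts = extract_ngrams(hyp_nodes, hyp_edges, max_n)
    ref_counts = extract_ngrams(ref_nodes, ref_edges, max_n)

    log_precision = 0.0
    for h, r in zip(hyp_counts, ref_counts):
        total = sum(h.values())
        # Clipped matches: a hypothesis n-gram counts at most as often
        # as it occurs in the reference.
        matched = sum(min(c, r[gram]) for gram, c in h.items())
        if total == 0 or matched == 0:
            return 0.0  # unsmoothed, as assumed in this sketch
        log_precision += math.log(matched / total) / max_n

    # Brevity penalty on graph size (assumption: nodes + edges).
    brevity = min(1.0, math.exp(1.0 - ref_size / hyp_size))
    return brevity * math.exp(log_precision)


# "The boy wants to go" as a toy reference graph.
ref_graph = (
    {"w": "want-01", "b": "boy", "g": "go-01"},
    [("w", ":ARG0", "b"), ("w", ":ARG1", "g"), ("g", ":ARG0", "b")],
)
# Hypothesis with one wrong concept ("girl" instead of "boy").
hyp_graph = (
    {"w": "want-01", "b": "girl", "g": "go-01"},
    [("w", ":ARG0", "b"), ("w", ":ARG1", "g"), ("g", ":ARG0", "b")],
)

print(sembleu_sketch(ref_graph, ref_graph))
print(sembleu_sketch(hyp_graph, ref_graph, max_n=2))
```

Because n-grams are enumerated deterministically from every node, scoring involves no search over node mappings, which is the property the abstract contrasts with SMATCH's hill-climbing. Paths longer than a single edge are what capture the non-local correspondences.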

Results

Task	Dataset	Metric	Value	Model
Graph Matching	RARE	Spearman Correlation	94.83	SemBleu

Related Papers

Probing Neural Topology of Large Language Models (2025-06-01)
PackHero: A Scalable Graph-based Approach for Efficient Packer Identification (2025-05-31)
Learning without Isolation: Pathway Protection for Continual Learning (2025-05-24)
Improving Chemical Understanding of LLMs via SMILES Parsing (2025-05-22)
Cross-modal Knowledge Transfer Learning as Graph Matching Based on Optimal Transport for ASR (2025-05-19)
Graph-Reward-SQL: Execution-Free Reinforcement Learning for Text-to-SQL via Graph Matching and Stepwise Reward (2025-05-18)
Reassessing Graph Linearization for Sequence-to-sequence AMR Parsing: On the Advantages and Limitations of Triple-Based Encoding (2025-05-13)
Tempo: Application-aware LLM Serving with Mixed SLO Requirements (2025-04-24)