Comparing and combining some popular NER approaches on Biomedical tasks

Harsh Verma, Sabine Bergler, Narjesossadat Tahaei

2023-05-30

Nested Named Entity Recognition, Named Entity Recognition (NER)

Abstract

We compare three simple and popular approaches for NER: 1) SEQ (sequence labeling with a linear token classifier), 2) SeqCRF (sequence labeling with Conditional Random Fields), and 3) SpanPred (span prediction with boundary token embeddings). We compare the approaches on four biomedical NER tasks: GENIA, NCBI-Disease, LivingNER (Spanish), and SocialDisNER (Spanish). The SpanPred model demonstrates state-of-the-art performance on LivingNER and SocialDisNER, improving F1 by 1.3 and 0.6 respectively. The SeqCRF model also demonstrates state-of-the-art performance on LivingNER and SocialDisNER, improving F1 by 0.2 and 0.7 respectively. The SEQ model is competitive with the state of the art on the LivingNER dataset. We explore some simple ways of combining the three approaches. We find that majority voting consistently gives high precision and high F1 across all four datasets. Lastly, we implement a system that learns to combine the predictions of SEQ and SpanPred, generating systems that consistently give high recall and high F1 across all four datasets. On the GENIA dataset, we find that our learned combiner system significantly boosts F1 (+1.2) and recall (+2.1) over the systems being combined. We release all the well-documented code necessary to reproduce all systems at https://github.com/flyingmothman/bionlp.
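The majority-voting combination mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' released implementation. It assumes each system's output has already been converted to a set of `(start, end, label)` spans, and the names `Span` and `majority_vote`, as well as the toy predictions, are hypothetical.

```python
from collections import Counter
from typing import List, Set, Tuple

# Assumed representation of one predicted entity:
# (start token index, exclusive end token index, entity label).
Span = Tuple[int, int, str]


def majority_vote(system_predictions: List[Set[Span]]) -> Set[Span]:
    """Keep every span predicted by more than half of the systems."""
    # Each system contributes at most one vote per span because its
    # predictions are stored as a set.
    votes = Counter(span for preds in system_predictions for span in preds)
    threshold = len(system_predictions) / 2
    return {span for span, count in votes.items() if count > threshold}


# Toy predictions for one sentence (illustrative values, not real model output).
seq = {(0, 2, "Disease"), (5, 6, "Disease")}
seq_crf = {(0, 2, "Disease")}
span_pred = {(0, 2, "Disease"), (5, 6, "Disease"), (8, 10, "Disease")}

combined = majority_vote([seq, seq_crf, span_pred])
print(sorted(combined))  # [(0, 2, 'Disease'), (5, 6, 'Disease')]
```

In this toy case the span `(8, 10, "Disease")` is dropped because only one of the three systems proposes it. Requiring agreement in this way trades some recall for precision, which is consistent with the high precision the abstract reports for majority voting.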

Results

Task	Dataset	Metric	Value	Model
Named Entity Recognition (NER)	NCBI-disease	F1	89.6	SpanModel + SequenceLabelingModel
Named Entity Recognition (NER)	GENIA	F1	78.3	SpanModel + SequenceLabelingModel
