Hansel
Texts · CC-BY-SA · Introduced 2022-07-26
Hansel is a human-annotated Chinese entity linking (EL) dataset, focusing on tail entities and emerging entities:
- The test set has 10K examples in few-shot (FS) and zero-shot (ZS) slices and uses Wikidata as its knowledge base. It is useful for testing how well Chinese and multilingual EL systems generalize to tail and emerging entities.
- The training and validation sets are drawn from Wikipedia hyperlinks and are useful for large-scale pretraining of Chinese EL systems.
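
Because the test set is divided into FS and ZS slices, results are naturally reported per slice. The following is a minimal sketch of per-slice accuracy, not the dataset's official evaluation script. The field names `slice`, `gold_qid`, and `pred_qid` are hypothetical and do not reflect the actual file schema, and the toy records are invented for illustration only.

```python
from collections import defaultdict


def accuracy_by_slice(examples):
    """Return {slice_name: accuracy} for a list of example dicts.

    Each example is assumed (hypothetically) to carry:
      - "slice":    the slice label, e.g. "FS" or "ZS"
      - "gold_qid": the annotated Wikidata ID
      - "pred_qid": the ID predicted by the EL system
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for ex in examples:
        total[ex["slice"]] += 1
        if ex["pred_qid"] == ex["gold_qid"]:
            correct[ex["slice"]] += 1
    return {name: correct[name] / total[name] for name in total}


# Toy records, invented for illustration only (not taken from Hansel).
toy_examples = [
    {"slice": "FS", "gold_qid": "Q1", "pred_qid": "Q1"},
    {"slice": "FS", "gold_qid": "Q2", "pred_qid": "Q3"},
    {"slice": "ZS", "gold_qid": "Q4", "pred_qid": "Q4"},
]
scores = accuracy_by_slice(toy_examples)
```

Keeping the two slices separate avoids averaging away a gap between few-shot and zero-shot performance, which is the generalization behavior the test set is meant to expose.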