CiteWorth

Texts | CC BY-NC 2.0 | Introduced 2021-05-23

CiteWorth is a large, contextualized, rigorously cleaned, labelled dataset for cite-worthiness detection, built from a massive corpus of extracted plain-text scientific documents.