SciHTC

Texts · Introduced 2022-11-05

SciHTC is a dataset for hierarchical multi-label text classification (HMLTC) of scientific papers. It contains 186,160 papers and 1,233 categories drawn from the ACM Computing Classification System (CCS) tree.

Source: Hierarchical Multi-Label Classification of Scientific Documents

Image Source: https://arxiv.org/pdf/2211.02810v1.pdf
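
As a rough illustration of what hierarchical multi-label targets can look like, the sketch below expands a paper's category into its full root-to-leaf path and encodes it as a multi-hot vector. The tiny `PARENT` tree, the category names in it, and the helpers `with_ancestors` and `multi_hot` are hypothetical, chosen only for illustration; they are not taken from the SciHTC files or from the authors' code, and the actual data format may differ.

```python
from typing import Dict, List, Optional, Set

# Hypothetical toy hierarchy (child -> parent), NOT the real SciHTC label tree.
PARENT: Dict[str, Optional[str]] = {
    "Computing methodologies": None,
    "Machine learning": "Computing methodologies",
    "Supervised learning": "Machine learning",
    "Information systems": None,
    "Information retrieval": "Information systems",
}

# Fixed category order used for the multi-hot encoding.
VOCAB: List[str] = sorted(PARENT)


def with_ancestors(label: str) -> List[str]:
    """Return the path from the root category down to `label`."""
    path: List[str] = []
    node: Optional[str] = label
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path[::-1]


def multi_hot(labels: List[str], vocab: List[str] = VOCAB) -> List[int]:
    """Encode labels plus all of their ancestors as a 0/1 vector over `vocab`."""
    active: Set[str] = set()
    for label in labels:
        active.update(with_ancestors(label))
    return [1 if category in active else 0 for category in vocab]


if __name__ == "__main__":
    print(with_ancestors("Supervised learning"))
    print(multi_hot(["Supervised learning"]))
```

Marking every ancestor as active keeps the target consistent with the tree: a paper tagged with a leaf category is also a positive example for each of that leaf's parent categories.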