NText

Texts
Introduced 2020-03-30

NText is an eight-million-word dataset extracted and preprocessed from nuclear research papers and theses.

Source: NukeBERT: A Pre-trained language model for Low Resource Nuclear Domain