T-REx

Creative Commons Attribution-ShareAlike 4.0 International License

A large-scale dataset aligning Wikipedia abstracts with Wikidata triples. T-REx consists of 11 million triples aligned with 3.09 million Wikipedia abstracts (6.2 million sentences).
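
To make the idea of an alignment concrete, here is a minimal sketch of how one aligned abstract could be modeled in memory. It is an illustration only, not the released file format: the class and field names (`Triple`, `AlignedSentence`, `AlignedAbstract`, `triple_count`) are hypothetical, and the sample sentences are invented for this example rather than taken from the dataset. The only assumption about the data itself is that a triple is a (subject, predicate, object) set of Wikidata identifiers attached to the sentence that expresses it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Triple:
    """A knowledge-base triple using Wikidata identifiers (hypothetical layout)."""
    subject: str    # entity ID, e.g. "Q42" (Douglas Adams)
    predicate: str  # property ID, e.g. "P106" (occupation)
    obj: str        # entity ID, e.g. "Q36180" (writer)


@dataclass
class AlignedSentence:
    """One sentence of an abstract plus the triples aligned with it."""
    text: str
    triples: List[Triple] = field(default_factory=list)


@dataclass
class AlignedAbstract:
    """One Wikipedia abstract, split into sentences (hypothetical layout)."""
    title: str
    sentences: List[AlignedSentence] = field(default_factory=list)

    def triple_count(self) -> int:
        # Total number of triples aligned anywhere in this abstract.
        return sum(len(s.triples) for s in self.sentences)


# Invented example record; not drawn from T-REx itself.
abstract = AlignedAbstract(
    title="Douglas Adams",
    sentences=[
        AlignedSentence(
            text="Douglas Adams was an English author.",
            triples=[Triple("Q42", "P106", "Q36180")],
        ),
        # A sentence may have no aligned triple at all.
        AlignedSentence(text="An illustrative sentence with no aligned triple."),
    ],
)

print(abstract.triple_count())
```

Keeping triples at the sentence level, rather than only on the abstract, mirrors the way the dataset reports its size in triples, abstracts, and sentences.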

Source: T-REx: A Large Scale Alignment of Natural Language with Knowledge Base Triples