TAC 2010

Texts

TAC 2010 is a summarization dataset of 44 topics, each associated with a set of 10 documents. These 44 topics make up the test dataset and are divided into five categories: Accidents and Natural Disasters, Attacks, Health and Safety, Endangered Resources, and Investigations and Trials.
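
The layout described above can be sketched as a small data structure. This is only an illustration under stated assumptions: the `Topic` class, its field names, and the `total_documents` helper are hypothetical and do not reflect the official TAC file format or any distributed loader.

```python
from dataclasses import dataclass, field
from typing import List

# The five topic categories named in the description above.
CATEGORIES = (
    "Accidents and Natural Disasters",
    "Attacks",
    "Health and Safety",
    "Endangered Resources",
    "Investigations and Trials",
)


@dataclass
class Topic:
    """One summarization topic (hypothetical container, not the TAC format)."""

    topic_id: str  # placeholder identifier, not an official TAC topic ID
    category: str
    documents: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject categories that are not among the five listed above.
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category!r}")


def total_documents(topics: List[Topic]) -> int:
    """Count the documents across all topics."""
    return sum(len(topic.documents) for topic in topics)
```

With 44 topics of 10 documents each, `total_documents` would count 440 documents in total.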

Source: Better Summarization Evaluation with Word Embeddings for ROUGE
Image Source: https://tac.nist.gov//2010/Summarization/Guided-Summ.2010.guidelines.html