SummScreen
Introduced 2021-04-14
SummScreen is a dataset for abstractive screenplay summarization. It consists of pairs of TV series transcripts and human-written recaps. This dataset provides a challenging testbed for abstractive summarization for several reasons:
- Plot details are often expressed indirectly in character dialogue and may be scattered across the entire transcript.
- These details must be found and integrated to form the succinct plot descriptions in the recaps.
- TV scripts contain content that does not directly pertain to the central plot but instead serves to develop characters or provide comic relief. Such content rarely appears in recaps.
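
To make the transcript-recap pairing concrete, here is a minimal sketch of how one example might be represented in memory. The class name `EpisodePair`, its fields, the helper `compression_ratio`, and the sample dialogue are all hypothetical illustrations and are not taken from the dataset's actual schema or contents.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EpisodePair:
    """Hypothetical container for one transcript and its human-written recap."""
    transcript: List[str]  # one entry per transcript line (assumed layout)
    recap: str             # the short plot description


def compression_ratio(pair: EpisodePair) -> float:
    """Return transcript length divided by recap length, in whitespace tokens."""
    transcript_words = sum(len(line.split()) for line in pair.transcript)
    recap_words = len(pair.recap.split())
    if recap_words == 0:
        raise ValueError("recap must contain at least one word")
    return transcript_words / recap_words


# Invented toy example: the plot fact is spread over several lines of dialogue,
# and one remark ("Nice hat") is comic relief that the recap omits.
example = EpisodePair(
    transcript=[
        "ANNA: You said the shipment would be here by noon.",
        "BEN: I said a lot of things. Nice hat, by the way.",
        "ANNA: Ben. Where is it?",
        "BEN: Fine. It's still at the docks.",
    ],
    recap="Anna learns from Ben that the shipment never left the docks.",
)

if __name__ == "__main__":
    print(f"transcript/recap word ratio: {compression_ratio(example):.2f}")
```

Even in this toy case the recap states a fact that no single line of dialogue states outright, which is the kind of integration the points above describe.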