Perseus
Introduced 2022-12-01
Perseus is a Cross-Lingual Summarization (CLS) dataset of about 94K Chinese scientific documents, each paired with an English summary. Documents in Perseus average more than two thousand tokens in length.
Source: Long-Document Cross-Lingual Summarization
Image Source: https://arxiv.org/pdf/2212.00586v1.pdf