Wen Xiao, Iz Beltagy, Giuseppe Carenini, Arman Cohan
We introduce PRIMERA, a pre-trained model for multi-document representation with a focus on summarization that reduces the need for dataset-specific architectures and large amounts of labeled fine-tuning data. PRIMERA uses our newly proposed pre-training objective, designed to teach the model to connect and aggregate information across documents. It also uses efficient encoder-decoder transformers to simplify the processing of concatenated input documents. In extensive experiments on 6 multi-document summarization datasets from 3 different domains, covering zero-shot, few-shot, and fully-supervised settings, PRIMERA outperforms current state-of-the-art dataset-specific and pre-trained models in most of these settings by large margins. The code and pre-trained models can be found at \url{https://github.com/allenai/PRIMER}.
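A minimal sketch of the concatenated-input setup described above: multiple source documents are joined into a single sequence with separator tokens before being fed to a long-input encoder-decoder. The helper name `concat_docs`, the `<doc-sep>` string, and the equal per-document truncation budget are illustrative assumptions; PRIMERA's actual preprocessing lives in the linked repository and uses its tokenizer's special tokens.

```python
def concat_docs(docs, doc_sep="<doc-sep>", max_tokens=4096):
    """Join multiple source documents into one sequence with separator tokens.

    Hypothetical helper for illustration only: whitespace tokenization and a
    naive equal length budget per document, not PRIMERA's real preprocessing.
    """
    budget = max_tokens // len(docs)  # split the length budget evenly
    truncated = [" ".join(d.split()[:budget]) for d in docs]
    return f" {doc_sep} ".join(truncated)
```

For example, `concat_docs(["first article text", "second article text"])` yields one string with the two articles separated by `<doc-sep>`, ready for a single forward pass.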
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Text Summarization | arXiv Summarization Dataset | ROUGE-1 | 47.6 | PRIMER |
| Text Summarization | arXiv Summarization Dataset | ROUGE-2 | 20.8 | PRIMER |
| Text Summarization | arXiv Summarization Dataset | ROUGE-L | 42.6 | PRIMER |
| Text Summarization | Multi-News | ROUGE-1 | 49.9 | PRIMER |
| Text Summarization | Multi-News | ROUGE-2 | 21.1 | PRIMER |
| Text Summarization | Multi-News | ROUGE-L | 25.9 | PRIMER |
| Text Summarization | WCEP | ROUGE-1 | 46.1 | PRIMER |
| Text Summarization | WCEP | ROUGE-2 | 25.2 | PRIMER |
| Text Summarization | WCEP | ROUGE-L | 37.9 | PRIMER |
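The table reports ROUGE scores, which measure n-gram overlap between a generated summary and a reference. As a simplified sketch of what ROUGE-1 computes (unigram-overlap F1; official implementations add stemming and other normalization, and ROUGE-2/ROUGE-L use bigrams and longest common subsequences instead):

```python
from collections import Counter

def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap ROUGE-1 F1 between a reference and a candidate summary.

    Simplified sketch: whitespace tokenization, no stemming. Official ROUGE
    implementations differ in these details, so scores will not match exactly.
    """
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    overlap = sum((ref & cand).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

For instance, `rouge1_f1("the cat sat", "the cat sat")` returns 1.0, while a candidate sharing half its unigrams with the reference scores 0.5.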