
ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training

Weizhen Qi, Yu Yan, Yeyun Gong, Dayiheng Liu, Nan Duan, Jiusheng Chen, Ruofei Zhang, Ming Zhou

2020-01-13 · Tasks: Abstractive Text Summarization, Text Summarization, Prediction, Question Generation

Abstract

This paper presents a new sequence-to-sequence pre-training model called ProphetNet, which introduces a novel self-supervised objective named future n-gram prediction together with a corresponding n-stream self-attention mechanism. Instead of optimizing one-step-ahead prediction as in traditional sequence-to-sequence models, ProphetNet is optimized with n-step-ahead prediction, which predicts the next n tokens simultaneously from the previous context tokens at each time step. Future n-gram prediction explicitly encourages the model to plan for future tokens and prevents overfitting on strong local correlations. We pre-train ProphetNet on a base-scale dataset (16GB) and a large-scale dataset (160GB), respectively, and then conduct experiments on the CNN/DailyMail, Gigaword, and SQuAD 1.1 benchmarks for abstractive summarization and question generation. Experimental results show that ProphetNet achieves new state-of-the-art results on all of these datasets compared to models pre-trained on a corpus of the same scale.
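To make the objective concrete, below is a minimal sketch of future n-gram prediction, assuming PyTorch. It is not the paper's implementation: ProphetNet realizes the n predicting streams inside an n-stream self-attention decoder, whereas this toy approximates each stream with its own linear head over a shared decoder state. The helper name `future_ngram_loss` and the per-offset heads are hypothetical simplifications for illustration.

```python
# Minimal sketch of the future n-gram prediction objective (assumption:
# one linear head per look-ahead offset stands in for the paper's
# n-stream self-attention; this is NOT the official ProphetNet code).
import torch
import torch.nn.functional as F

def future_ngram_loss(hidden, targets, heads, n=2, pad_id=0):
    """Average cross-entropy of predicting tokens t+1..t+n from step t.

    hidden:  (batch, seq_len, dim) decoder states
    targets: (batch, seq_len)      gold token ids
    heads:   list of n torch.nn.Linear(dim, vocab) heads, one per offset
    """
    losses = []
    for k in range(1, n + 1):
        logits = heads[k - 1](hidden[:, :-k, :])  # state at t predicts token t+k
        labels = targets[:, k:]                   # gold tokens shifted by k
        losses.append(F.cross_entropy(
            logits.reshape(-1, logits.size(-1)),
            labels.reshape(-1),
            ignore_index=pad_id,
        ))
    return sum(losses) / n

# Toy usage with random tensors.
dim, vocab, n = 32, 100, 2
heads = [torch.nn.Linear(dim, vocab) for _ in range(n)]
states = torch.randn(4, 10, dim)
gold = torch.randint(1, vocab, (4, 10))
print(future_ngram_loss(states, gold, heads, n=n))
```

In the paper, the future-token streams are used only during pre-training; at inference the model falls back to standard next-token decoding, so the extra supervision shapes training without changing generation.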

Results

Task                           | Dataset          | Metric  | Value | Model
Text Summarization             | GigaWord         | ROUGE-1 | 39.51 | ProphetNet
Text Summarization             | GigaWord         | ROUGE-2 | 20.42 | ProphetNet
Text Summarization             | GigaWord         | ROUGE-L | 36.69 | ProphetNet
Text Summarization             | CNN / Daily Mail | ROUGE-1 | 44.2  | ProphetNet
Text Summarization             | CNN / Daily Mail | ROUGE-2 | 21.17 | ProphetNet
Text Summarization             | CNN / Daily Mail | ROUGE-L | 41.3  | ProphetNet
Abstractive Text Summarization | CNN / Daily Mail | ROUGE-1 | 44.2  | ProphetNet
Abstractive Text Summarization | CNN / Daily Mail | ROUGE-2 | 21.17 | ProphetNet
Abstractive Text Summarization | CNN / Daily Mail | ROUGE-L | 41.3  | ProphetNet
Question Generation            | SQuAD 1.1        | BLEU-4  | 23.91 | ProphetNet
Question Generation            | SQuAD 1.1        | METEOR  | 26.6  | ProphetNet
Question Generation            | SQuAD 1.1        | ROUGE-L | 52.3  | ProphetNet
