Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Abstractive Summarization of Spoken and Written Instructions with BERT

Alexandra Savelieva, Bryan Au-Yeung, Vasanth Ramani

2020-08-21 · Abstractive Text Summarization · Transfer Learning · Sentence Segmentation

Paper · PDF · Code (official)

Abstract

Summarization of speech is a difficult problem due to the spontaneity of the flow, disfluencies, and other issues that are not usually encountered in written texts. Our work presents the first application of the BERTSum model to conversational language. We generate abstractive summaries of narrated instructional videos across a wide variety of topics, from gardening and cooking to software configuration and sports. To enrich the vocabulary, we use transfer learning and pretrain the model on a few large cross-domain datasets in both written and spoken English. We also preprocess transcripts to restore sentence segmentation and punctuation in the output of an ASR system. The results are evaluated with ROUGE and Content-F1 scoring for the How2 and WikiHow datasets. We engage human judges to score a set of summaries randomly selected from a dataset curated from HowTo100M and YouTube. Based on blind evaluation, we achieve a level of textual fluency and utility close to that of summaries written by human content creators. The model beats the current SOTA when applied to WikiHow articles that vary widely in style and topic, while showing no performance regression on the canonical CNN/DailyMail dataset. Due to the high generalizability of the model across different styles and domains, it has great potential to improve the accessibility and discoverability of internet content. We envision this integrated as a feature in intelligent virtual assistants, enabling them to summarize both written and spoken instructional content upon request.
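The ROUGE scores reported below are n-gram overlap metrics between a candidate summary and a reference. As a minimal illustration only (not the authors' evaluation code, which in practice would use a standard ROUGE package), ROUGE-1 F1 can be sketched in plain Python:

```python
from collections import Counter

def rouge_1_f1(candidate: str, reference: str) -> float:
    """ROUGE-1 F1: clipped unigram overlap between candidate and reference.

    Simplified sketch: whitespace tokenization, lowercasing, no stemming.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # each shared unigram counted at most min(cand, ref) times
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# Example: full precision (1.0), half recall (0.5) -> F1 = 2/3
score = rouge_1_f1("the cat sat", "the cat sat on the mat")
```

ROUGE-2 and ROUGE-L follow the same precision/recall/F1 pattern but match bigrams and longest common subsequences, respectively.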

Results

Task                           | Dataset | Metric     | Value | Model
Text Summarization             | WikiHow | Content F1 | 29.8  | BertSum
Text Summarization             | WikiHow | ROUGE-1    | 35.91 | BertSum
Text Summarization             | WikiHow | ROUGE-2    | 13.9  | BertSum
Text Summarization             | WikiHow | ROUGE-L    | 34.82 | BertSum
Text Summarization             | How2    | Content F1 | 36.4  | BertSum
Text Summarization             | How2    | ROUGE-1    | 48.26 | BertSum
Text Summarization             | How2    | ROUGE-L    | 44.02 | BertSum
Abstractive Text Summarization | WikiHow | Content F1 | 29.8  | BertSum
Abstractive Text Summarization | WikiHow | ROUGE-1    | 35.91 | BertSum
Abstractive Text Summarization | WikiHow | ROUGE-2    | 13.9  | BertSum
Abstractive Text Summarization | WikiHow | ROUGE-L    | 34.82 | BertSum

Related Papers

RaMen: Multi-Strategy Multi-Modal Learning for Bundle Construction (2025-07-18)
Disentangling coincident cell events using deep transfer learning and compressive sensing (2025-07-17)
Best Practices for Large-Scale, Pixel-Wise Crop Mapping and Transfer Learning Workflows (2025-07-16)
Robust-Multi-Task Gradient Boosting (2025-07-15)
Calibrated and Robust Foundation Models for Vision-Language and Medical Image Tasks Under Distribution Shift (2025-07-12)
The Bayesian Approach to Continual Learning: An Overview (2025-07-11)
Contrastive and Transfer Learning for Effective Audio Fingerprinting through a Real-World Evaluation Protocol (2025-07-08)
Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving (2025-07-08)