
Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

HowSumm: A Multi-Document Summarization Dataset Derived from WikiHow Articles

Odellia Boni, Guy Feigenblat, Guy Lev, Michal Shmueli-Scheuer, Benjamin Sznajder, David Konopnicki

2021-10-07 | Multi-Document Summarization, Abstractive Text Summarization, Document Summarization

Abstract

We present HowSumm, a novel large-scale dataset for the task of query-focused multi-document summarization (qMDS), which targets the use-case of generating actionable instructions from a set of sources. This use-case is different from the use-cases covered in existing multi-document summarization (MDS) datasets and is applicable to educational and industrial scenarios. We employed automatic methods, and leveraged statistics from existing human-crafted qMDS datasets, to create HowSumm from wikiHow website articles and the sources they cite. We describe the creation of the dataset and discuss the unique features that distinguish it from other summarization corpora. Automatic and human evaluations of both extractive and abstractive summarization models on the dataset reveal that there is room for improvement.

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Text Summarization | HowSumm-Method | ROUGE-1 | 53.5 | LexRank (query: method + article + steps titles) |
| Text Summarization | HowSumm-Method | ROUGE-1 | 52.2 | CES (query: method + article + steps titles) |
| Text Summarization | HowSumm-Method | ROUGE-1 | 48.6 | GreedyRel (query: method + article + steps titles) |
| Text Summarization | HowSumm-Method | ROUGE-1 | 48.4 | CES (query: method title) |
| Text Summarization | HowSumm-Method | ROUGE-1 | 48.3 | CES (query: method + article titles) |
| Text Summarization | HowSumm-Method | ROUGE-1 | 47.7 | LexRank (query: method title) |
| Text Summarization | HowSumm-Method | ROUGE-1 | 47.1 | LexRank (query: method + article titles) |
| Text Summarization | HowSumm-Method | ROUGE-1 | 43.4 | GreedyRel (query: method title) |
| Text Summarization | HowSumm-Method | ROUGE-1 | 42.3 | GreedyRel (query: method + article titles) |
| Text Summarization | HowSumm-Step | ROUGE-1 | 39.6 | LexRank (query: step title) |
| Text Summarization | HowSumm-Step | ROUGE-1 | 39.3 | CES (query: step title) |
| Text Summarization | HowSumm-Step | ROUGE-1 | 38.3 | CES (query: step + method titles) |
| Text Summarization | HowSumm-Step | ROUGE-1 | 38.2 | LexRank (query: step + method titles) |
| Text Summarization | HowSumm-Step | ROUGE-1 | 37 | CES (query: step + method + article titles) |
| Text Summarization | HowSumm-Step | ROUGE-1 | 36.3 | LexRank (query: step + method + article titles) |
| Text Summarization | HowSumm-Step | ROUGE-1 | 30.3 | GreedyRel (query: step + method titles) |
| Text Summarization | HowSumm-Step | ROUGE-1 | 30.1 | GreedyRel (query: step title) |
| Text Summarization | HowSumm-Step | ROUGE-1 | 23 | BM25-HierSumm (query: step + method titles) |
| Text Summarization | HowSumm-Step | ROUGE-1 | 22.3 | BM25-HierSumm (query: step title) |
| Text Summarization | HowSumm-Step | ROUGE-1 | 21.9 | BM25-HierSumm (query: step + method + article titles) |
| Document Summarization | HowSumm-Method | ROUGE-1 | 53.5 | LexRank (query: method + article + steps titles) |
| Document Summarization | HowSumm-Method | ROUGE-1 | 52.2 | CES (query: method + article + steps titles) |
| Document Summarization | HowSumm-Method | ROUGE-1 | 48.6 | GreedyRel (query: method + article + steps titles) |
| Document Summarization | HowSumm-Method | ROUGE-1 | 48.4 | CES (query: method title) |
| Document Summarization | HowSumm-Method | ROUGE-1 | 48.3 | CES (query: method + article titles) |
| Document Summarization | HowSumm-Method | ROUGE-1 | 47.7 | LexRank (query: method title) |
| Document Summarization | HowSumm-Method | ROUGE-1 | 47.1 | LexRank (query: method + article titles) |
| Document Summarization | HowSumm-Method | ROUGE-1 | 43.4 | GreedyRel (query: method title) |
| Document Summarization | HowSumm-Method | ROUGE-1 | 42.3 | GreedyRel (query: method + article titles) |
| Document Summarization | HowSumm-Step | ROUGE-1 | 39.6 | LexRank (query: step title) |
| Document Summarization | HowSumm-Step | ROUGE-1 | 39.3 | CES (query: step title) |
| Document Summarization | HowSumm-Step | ROUGE-1 | 38.3 | CES (query: step + method titles) |
| Document Summarization | HowSumm-Step | ROUGE-1 | 38.2 | LexRank (query: step + method titles) |
| Document Summarization | HowSumm-Step | ROUGE-1 | 37 | CES (query: step + method + article titles) |
| Document Summarization | HowSumm-Step | ROUGE-1 | 36.3 | LexRank (query: step + method + article titles) |
| Document Summarization | HowSumm-Step | ROUGE-1 | 30.3 | GreedyRel (query: step + method titles) |
| Document Summarization | HowSumm-Step | ROUGE-1 | 30.1 | GreedyRel (query: step title) |
| Document Summarization | HowSumm-Step | ROUGE-1 | 23 | BM25-HierSumm (query: step + method titles) |
| Document Summarization | HowSumm-Step | ROUGE-1 | 22.3 | BM25-HierSumm (query: step title) |
| Document Summarization | HowSumm-Step | ROUGE-1 | 21.9 | BM25-HierSumm (query: step + method + article titles) |
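
For readers unfamiliar with the metric reported above, ROUGE-1 measures unigram overlap between a generated summary and a reference summary. The following is a minimal sketch, not the evaluation setup behind these results: it assumes lowercased whitespace tokenization with no stemming or stopword removal, and the function name `rouge1` and the example sentences are purely illustrative. This page does not state whether the listed values are recall or F1, so the sketch returns both; the listed values appear to be on a 0-100 scale, whereas the sketch returns fractions in [0, 1].

```python
from collections import Counter


def rouge1(candidate: str, reference: str) -> dict:
    """Simplified ROUGE-1: unigram-overlap precision, recall, and F1.

    Assumption: lowercased whitespace tokenization, no stemming.
    Official ROUGE tooling may tokenize and normalize differently.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count per token (clipped matches).
    overlap = sum((cand & ref).values())
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Illustrative strings only; they are not drawn from HowSumm.
scores = rouge1("mix the flour and water", "mix flour with water")
print(scores)
```

Here three unigrams ("mix", "flour", "water") are shared, so precision is 3/5 and recall is 3/4.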

Related Papers

GenerationPrograms: Fine-grained Attribution with Executable Programs (2025-06-17)
Arctic Long Sequence Training: Scalable And Efficient Training For Multi-Million Token Sequences (2025-06-16)
Improving Fairness of Large Language Models in Multi-document Summarization (2025-06-09)
Advancing Decoding Strategies: Enhancements in Locally Typical Sampling for LLMs (2025-06-03)
ARC: Argument Representation and Coverage Analysis for Zero-Shot Long Document Summarization with Instruction Following LLMs (2025-05-29)
Ask, Retrieve, Summarize: A Modular Pipeline for Scientific Literature Summarization (2025-05-22)
Power-Law Decay Loss for Large Language Model Finetuning: Focusing on Information Sparsity to Enhance Generation Quality (2025-05-22)
Hallucinate at the Last in Long Response Generation: A Case Study on Long Document Summarization (2025-05-21)