DebateSum: A large-scale argument mining and summarization dataset

Allen Roush, Arvind Balaji

2020-11-14 | COLING (ArgMining) 2020 12
Tasks: Abstractive Text Summarization, Extractive Text Summarization, Text Summarization, Document Summarization, Information Retrieval, Query-Based Extractive Summarization, Argument Mining

Abstract

Prior work in Argument Mining frequently alludes to its potential applications in automatic debating systems. Despite this focus, almost no datasets or models exist which apply natural language processing techniques to problems found within competitive formal debate. To remedy this, we present the DebateSum dataset. DebateSum consists of 187,386 unique pieces of evidence with corresponding argument and extractive summaries. DebateSum was made using data compiled by competitors within the National Speech and Debate Association over a 7-year period. We train several transformer summarization models to benchmark summarization performance on DebateSum. We also introduce a set of fasttext word-vectors trained on DebateSum called debate2vec. Finally, we present a search engine for this dataset which is utilized extensively by members of the National Speech and Debate Association today. The DebateSum search engine is available to the public here: http://www.debate.cards

Results

Task                           Dataset    Metric   Value  Model
Text Summarization             DebateSum  ROUGE-L  57.21  Longformer-Base
Text Summarization             DebateSum  ROUGE-L  53.23  GPT2-Medium
Text Summarization             DebateSum  ROUGE-L  49.98  BERT-Large
Extractive Text Summarization  DebateSum  ROUGE-L  57.21  Longformer-Base
Extractive Text Summarization  DebateSum  ROUGE-L  53.23  GPT2-Medium
Extractive Text Summarization  DebateSum  ROUGE-L  49.98  BERT-Large
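
For readers unfamiliar with the metric in the results, ROUGE-L scores a candidate summary by the longest common subsequence (LCS) of tokens it shares with a reference summary. The following is a minimal sketch only: this page does not say which ROUGE implementation or tokenizer the authors used, so the lowercase whitespace tokenization, the function names `lcs_length` and `rouge_l`, the default `beta=1.0`, and the 0 to 100 scaling are all assumptions made for illustration.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                # Extend the LCS that ended just before both tokens.
                curr.append(prev[j - 1] + 1)
            else:
                # Carry forward the best LCS seen so far.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(reference, candidate, beta=1.0):
    """Sentence-level ROUGE-L F-measure on a 0-100 scale.

    Assumption: lowercase whitespace tokenization; published scores
    typically come from a dedicated ROUGE package with its own rules.
    """
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    f = (1 + beta**2) * precision * recall / (recall + beta**2 * precision)
    return 100 * f


if __name__ == "__main__":
    reference = "the resolution should be rejected because costs outweigh benefits"
    candidate = "the resolution should be rejected"
    print(round(rouge_l(reference, candidate), 2))
```

Because an extractive summary is built from spans of the source text, its tokens appear in the same order as in the evidence, which is one reason an order-sensitive measure like LCS suits this task. Scores from this sketch are not expected to match the table exactly.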
