MentSum: A Resource for Exploring Summarization of Mental Health Online Posts

Sajad Sotudeh, Nazli Goharian, Zachary Young

2022-06-02 | LREC 2022 6 | Text Summarization

Abstract

Mental health remains a significant public health challenge worldwide. With the increasing popularity of online platforms, many people use them to share their mental health conditions, express their feelings, and seek help from the community and counselors. Some of these platforms, such as Reachout, are dedicated forums where users register to seek help. Others, such as Reddit, provide subreddits where users post about their mental health distress publicly but anonymously. Because posts vary in length, a short but informative summary helps counselors process them quickly. To facilitate research on summarizing mental health online posts, we introduce the Mental Health Summarization dataset, MentSum, containing over 24k carefully selected user posts from Reddit, each paired with its short user-written summary (called TLDR), in English, from 43 mental health subreddits. This domain-specific dataset could be of interest not only for generating short summaries on Reddit, but also for generating summaries of posts on dedicated mental health forums such as Reachout. We further evaluate state-of-the-art extractive and abstractive summarization baselines in terms of Rouge scores, and finally conduct an in-depth human evaluation study of both user-written and system-generated summaries, highlighting challenges in this research.

Results

Task	Dataset	Metric	Value	Model
Text Summarization	MentSum	Rouge-1	29.13	BART
Text Summarization	MentSum	Rouge-2	7.98	BART
Text Summarization	MentSum	Rouge-L	20.27	BART
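
For readers unfamiliar with the metrics in the table, the following is a minimal sketch of how Rouge-1, Rouge-2, and Rouge-L F1 scores can be computed for one system summary against one user-written TLDR. It is an illustration only, not the evaluation code behind the reported numbers: it assumes lowercased whitespace tokenization with no stemming, returns scores on a 0 to 1 scale (the table values appear to be on a 0 to 100 scale), and the function names are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, cand_total, ref_total):
    """Harmonic mean of precision (overlap/cand) and recall (overlap/ref)."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n=1):
    """Rouge-N F1 from clipped n-gram overlap (assumed tokenization: split on whitespace)."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())  # Counter '&' takes the minimum count per n-gram
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """Rouge-L F1 from the longest common subsequence of tokens."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    return f1(lcs_length(cand, ref), len(cand), len(ref))
```

Corpus-level figures such as those in the table are typically averages of per-example scores, and published implementations usually add stemming and other normalization, so this sketch would not reproduce the reported values exactly.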
