TasksSotADatasetsPapersMethodsSubmitAbout
Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Explore

Notable BenchmarksAll SotADatasetsPapersMethods

Community

Submit ResultsAbout

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Papers/Dataset for Automatic Summarization of Russian News

Dataset for Automatic Summarization of Russian News

Ilya Gusev

2020-06-19Text Summarization
PaperPDFCode(official)Code(official)

Abstract

Automatic text summarization has been studied in a variety of domains and languages. However, this does not hold for the Russian language. To overcome this issue, we present Gazeta, the first dataset for summarization of Russian news. We describe the properties of this dataset and benchmark several extractive and abstractive models. We demonstrate that the dataset is a valid task for methods of text summarization for Russian. Additionally, we prove the pretrained mBART model to be useful for Russian text summarization.

Results

TaskDatasetMetricValueModel
Text SummarizationGazetaBLEU12.4Finetuned mBART
Text SummarizationGazetaMeteor25.7Finetuned mBART
Text SummarizationGazetaROUGE-132.1Finetuned mBART
Text SummarizationGazetaROUGE-214.2Finetuned mBART
Text SummarizationGazetaROUGE-L27.9Finetuned mBART

Related Papers

LRCTI: A Large Language Model-Based Framework for Multi-Step Evidence Retrieval and Reasoning in Cyber Threat Intelligence Credibility Verification2025-07-15On-the-Fly Adaptive Distillation of Transformer to Dual-State Linear Attention2025-06-11Improving large language models with concept-aware fine-tuning2025-06-09MaCP: Minimal yet Mighty Adaptation via Hierarchical Cosine Projection2025-05-29APE: A Data-Centric Benchmark for Efficient LLM Adaptation in Text Summarization2025-05-26FiLLM -- A Filipino-optimized Large Language Model based on Southeast Asia Large Language Model (SEALLM)2025-05-25Scaling Up Biomedical Vision-Language Models: Fine-Tuning, Instruction Tuning, and Multi-Modal Learning2025-05-23A Structured Literature Review on Traditional Approaches in Current Natural Language Processing2025-05-19