BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese

Nguyen Luong Tran, Duong Minh Le, Dat Quoc Nguyen

2021-09-20

Denoising, Punctuation Restoration, Abstractive Text Summarization, Text Summarization

Abstract

We present BARTpho in two versions, BARTpho-syllable and BARTpho-word, which are the first public large-scale monolingual sequence-to-sequence models pre-trained for Vietnamese. BARTpho uses the "large" architecture and the pre-training scheme of the sequence-to-sequence denoising autoencoder BART, and is therefore especially suitable for generative NLP tasks. We compare BARTpho with its competitor mBART on the downstream task of Vietnamese text summarization and show that, in both automatic and human evaluations, BARTpho outperforms the strong baseline mBART and improves the state of the art. We further compare BARTpho and mBART on the Vietnamese capitalization and punctuation restoration tasks and find that BARTpho is also more effective than mBART on these two tasks. We publicly release BARTpho to facilitate future research and applications of generative Vietnamese NLP tasks. Our BARTpho models are available at https://github.com/VinAIResearch/BARTpho
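The denoising pre-training scheme mentioned in the abstract can be illustrated with a small sketch. BART corrupts its input with text infilling (spans of tokens replaced by a single mask token, with span lengths drawn from a Poisson distribution) and sentence permutation, and trains the model to reconstruct the original text. The code below is only a toy illustration under stated assumptions: the function names, the `<mask>` string, the 30% mask ratio, the Poisson rate of 3, and the order in which the two noising steps are applied are choices made here for illustration, not details taken from this page or from the BARTpho code.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) sample using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def text_infilling(tokens, rng, mask_ratio=0.3, lam=3.0, mask_token="<mask>"):
    """Replace random spans of tokens with a single mask token.

    Spans are sampled until about mask_ratio of the tokens are hidden.
    A zero-length span inserts a mask token without removing anything.
    All default values are assumptions for this sketch.
    """
    tokens = list(tokens)  # work on a copy, leave the input untouched
    budget = round(len(tokens) * mask_ratio)  # tokens still to hide
    while budget > 0:
        span = min(sample_poisson(lam, rng), budget)
        # Only consider windows that do not overlap an existing mask.
        candidates = [
            s for s in range(len(tokens) - span + 1)
            if mask_token not in tokens[s:s + span]
        ]
        if not candidates:
            break
        start = rng.choice(candidates)
        tokens[start:start + span] = [mask_token]
        budget -= span
    return tokens


def permute_sentences(sentences, rng):
    """Return the sentences in a random order."""
    shuffled = list(sentences)
    rng.shuffle(shuffled)
    return shuffled


def add_noise(sentences, rng):
    """Build one (noisy_input, target) pair from a list of token lists."""
    target = [tok for sent in sentences for tok in sent]
    shuffled = permute_sentences(sentences, rng)
    flat = [tok for sent in shuffled for tok in sent]
    return text_infilling(flat, rng), target


if __name__ == "__main__":
    rng = random.Random(0)
    doc = [
        "BARTpho is a sequence-to-sequence model .".split(),
        "It is pre-trained for Vietnamese .".split(),
    ]
    noisy, target = add_noise(doc, rng)
    print("input :", " ".join(noisy))
    print("target:", " ".join(target))
```

A sequence-to-sequence model trained this way receives the corrupted sequence in the encoder and is asked to generate the original sequence in the decoder, which is why such models transfer naturally to generative tasks like summarization.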

Results

Task	Dataset	Metric	Value	Model
Text Summarization	vietnews	Rouge-1	61.14	BARTpho
Text Summarization	vietnews	Rouge-2	30.31	BARTpho
Text Summarization	vietnews	Rouge-L	40.15	BARTpho
Abstractive Text Summarization	vietnews	Rouge-1	61.14	BARTpho
Abstractive Text Summarization	vietnews	Rouge-2	30.31	BARTpho
Abstractive Text Summarization	vietnews	Rouge-L	40.15	BARTpho
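For readers unfamiliar with the metrics in the table, the following is a minimal sketch of how ROUGE-1, ROUGE-2, and ROUGE-L F1 scores can be computed for one candidate summary against one reference. It is not the scoring script behind the reported numbers: the whitespace tokenization, the sentence-level LCS variant of ROUGE-L, and all function names are assumptions made here for illustration. The functions return values in [0, 1]; the table reports scores on a 0 to 100 scale.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, cand_total, ref_total):
    """Harmonic mean of precision and recall, 0.0 when undefined."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n):
    """ROUGE-N F1 based on clipped n-gram overlap."""
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    overlap = sum((cand & ref).values())  # Counter & keeps the minimum counts
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1 based on the longest common subsequence."""
    return f1(lcs_length(candidate, reference), len(candidate), len(reference))


if __name__ == "__main__":
    cand = "the cat sat on the mat".split()
    ref = "the cat lay on the mat".split()
    for name, score in [
        ("ROUGE-1", rouge_n(cand, ref, 1)),
        ("ROUGE-2", rouge_n(cand, ref, 2)),
        ("ROUGE-L", rouge_l(cand, ref)),
    ]:
        print(f"{name}: {100 * score:.2f}")
```

Corpus-level scores such as those in the table are typically obtained by averaging per-document scores over the test set, and results depend on tokenization and on the exact ROUGE implementation used.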
