Tianyi Tang, Junyi Li, Zhipeng Chen, Yiwen Hu, Zhuohao Yu, Wenxun Dai, Zican Dong, Xiaoxue Cheng, Yuhao Wang, Wayne Xin Zhao, Jian-Yun Nie, Ji-Rong Wen
To facilitate research on text generation, this paper presents a comprehensive and unified library, TextBox 2.0, focusing on the use of pre-trained language models (PLMs). To be comprehensive, our library covers 13 common text generation tasks and their corresponding 83 datasets, and further incorporates 45 PLMs spanning general, translation, Chinese, dialogue, controllable, distilled, prompting, and lightweight PLMs. We also implement 4 efficient training strategies and provide 4 generation objectives for pre-training new PLMs from scratch. To be unified, we design interfaces that support the entire research pipeline (from data loading to training and evaluation), ensuring that each step can be carried out in a uniform way. Despite this rich functionality, the library remains easy to use, either through the friendly Python API or the command line. To validate the effectiveness of our library, we conduct extensive experiments and exemplify four types of research scenarios. The project is released at https://github.com/RUCAIBox/TextBox.
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Dialogue | Persona-Chat | BLEU-1 | 49.581 | BART (TextBox 2.0) |
| Dialogue | Persona-Chat | BLEU-2 | 39.24 | BART (TextBox 2.0) |
| Dialogue | Persona-Chat | Distinct-1 | 1.44 | BART (TextBox 2.0) |
| Dialogue | Persona-Chat | Distinct-2 | 8.89 | BART (TextBox 2.0) |
| Machine Translation | WMT2016 Romanian-English | BLEU-4 | 37.48 | BART (TextBox 2.0) |
| Machine Translation | WMT2016 English-Romanian | BLEU-4 | 37.2 | BART (TextBox 2.0) |
| Style Transfer | GYAFC | Accuracy | 94.37 | BART (TextBox 2.0) |
| Style Transfer | GYAFC | BLEU-4 | 76.93 | BART (TextBox 2.0) |
| Style Transfer | GYAFC | Harmonic mean | 84.74 | BART (TextBox 2.0) |
| Question Answering | SQuAD1.1 | Exact Match | 86.44 | BART (TextBox 2.0) |
| Question Answering | SQuAD1.1 | F1 | 93.04 | BART (TextBox 2.0) |
| Text Generation | ADGEN | BLEU-4 | 10.2 | BART (TextBox 2.0) |
| Text Generation | CSL | ROUGE-L | 64.34 | BART (TextBox 2.0) |
| Text Generation | LCSTS | ROUGE-L | 42.96 | BART (TextBox 2.0) |
| Text Generation | CommonGen | BLEU-4 | 28.18 | BART (TextBox 2.0) |
| Text Generation | CommonGen | CIDEr | 12.98 | BART (TextBox 2.0) |
| Text Generation | CommonGen | SPICE | 33 | BART (TextBox 2.0) |
| Text Simplification | Wiki-Auto + Turk | BLEU-4 | 90.81 | BART (TextBox 2.0) |
| Text Simplification | Wiki-Auto + Turk | METEOR | 57.58 | BART (TextBox 2.0) |
| Text Simplification | Wiki-Auto + Turk | ROUGE-2 | 83.36 | BART (TextBox 2.0) |
| Abstractive Text Summarization | CNN/Daily Mail | ROUGE-1 | 44.47 | BART (TextBox 2.0) |
| Abstractive Text Summarization | CNN/Daily Mail | ROUGE-2 | 21.5 | BART (TextBox 2.0) |
| Abstractive Text Summarization | CNN/Daily Mail | ROUGE-L | 41.35 | BART (TextBox 2.0) |
| Data-to-Text Generation | WebNLG | BLEU-4 | 67.33 | BART (TextBox 2.0) |
| Data-to-Text Generation | WebNLG | METEOR | 47.78 | BART (TextBox 2.0) |
| Data-to-Text Generation | WebNLG | ROUGE-L | 76.83 | BART (TextBox 2.0) |
| Question Generation | SQuAD1.1 | BLEU-4 | 25.08 | BART (TextBox 2.0) |
| Question Generation | SQuAD1.1 | METEOR | 26.73 | BART (TextBox 2.0) |
| Question Generation | SQuAD1.1 | ROUGE-L | 52.55 | BART (TextBox 2.0) |
| Task-Oriented Dialogue Systems | MULTIWOZ 2.0 | BLEU-4 | 20.17 | BART (TextBox 2.0) |
| Task-Oriented Dialogue Systems | MULTIWOZ 2.0 | Score | 100.07 | BART (TextBox 2.0) |
| Story Generation | WritingPrompts | BLEU-1 | 33.79 | BART (TextBox 2.0) |
| Story Generation | WritingPrompts | BLEU-2 | 15.78 | BART (TextBox 2.0) |
| Story Generation | WritingPrompts | Distinct-4 | 78.762 | BART (TextBox 2.0) |
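The GYAFC rows report a "Harmonic mean" alongside Accuracy and BLEU-4. Assuming this follows the common convention in style-transfer evaluation (the harmonic mean of style accuracy and BLEU-4, an assumption not stated in the table itself), the value can be approximately reproduced from the two rounded scores above:

```python
def harmonic_mean(a: float, b: float) -> float:
    """Harmonic mean of two scores: 2ab / (a + b)."""
    return 2 * a * b / (a + b)

# Rounded GYAFC scores from the table above (BART, TextBox 2.0).
accuracy = 94.37
bleu4 = 76.93

hm = harmonic_mean(accuracy, bleu4)
print(f"{hm:.2f}")  # ~84.76
```

The result (~84.76) differs slightly from the reported 84.74, which is expected since the table shows the component scores rounded to two decimals.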