Qwen2 Technical Report

An Yang, Baosong Yang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Zhou, Chengpeng Li, Chengyuan Li, Dayiheng Liu, Fei Huang, Guanting Dong, Haoran Wei, Huan Lin, Jialong Tang, Jialin Wang, Jian Yang, Jianhong Tu, Jianwei Zhang, Jianxin Ma, Jianxin Yang, Jin Xu, Jingren Zhou, Jinze Bai, Jinzheng He, Junyang Lin, Kai Dang, Keming Lu, Keqin Chen, Kexin Yang, Mei Li, Mingfeng Xue, Na Ni, Pei Zhang, Peng Wang, Ru Peng, Rui Men, Ruize Gao, Runji Lin, Shijie Wang, Shuai Bai, Sinan Tan, Tianhang Zhu, TianHao Li, Tianyu Liu, Wenbin Ge, Xiaodong Deng, Xiaohuan Zhou, Xingzhang Ren, Xinyu Zhang, Xipin Wei, Xuancheng Ren, Xuejing Liu, Yang Fan, Yang Yao, Yichang Zhang, Yu Wan, Yunfei Chu, Yuqiong Liu, Zeyu Cui, Zhenru Zhang, Zhifang Guo, Zhihao Fan

2024-07-15

Math Word Problem Solving, Quantization, GSM8K, Arithmetic Reasoning, MMLU, Language Modelling, HumanEval

Abstract

This report introduces the Qwen2 series, the latest addition to our large language models and large multimodal models. We release a comprehensive suite of foundational and instruction-tuned language models, encompassing a parameter range from 0.5 to 72 billion, featuring dense models and a Mixture-of-Experts model. Qwen2 surpasses most prior open-weight models, including its predecessor Qwen1.5, and exhibits competitive performance relative to proprietary models across diverse benchmarks on language understanding, generation, multilingual proficiency, coding, mathematics, and reasoning. The flagship model, Qwen2-72B, showcases remarkable performance: 84.2 on MMLU, 37.9 on GPQA, 64.6 on HumanEval, 89.5 on GSM8K, and 82.4 on BBH as a base language model. The instruction-tuned variant, Qwen2-72B-Instruct, attains 9.1 on MT-Bench, 48.1 on Arena-Hard, and 35.7 on LiveCodeBench. Moreover, Qwen2 demonstrates robust multilingual capabilities, proficient in approximately 30 languages, spanning English, Chinese, Spanish, French, German, Arabic, Russian, Korean, Japanese, Thai, Vietnamese, and more, underscoring its versatility and global reach. To foster community innovation and accessibility, we have made the Qwen2 model weights openly available on Hugging Face and ModelScope, and the supplementary materials including example code on GitHub. These platforms also include resources for quantization, fine-tuning, and deployment, facilitating a wide range of applications and research endeavors.

Results

Task	Dataset	Metric	Value	Model
Question Answering	MATH	Accuracy	84	Qwen2-Math-72B-Instruct (greedy)
Question Answering	MATH	Parameters (Billions)	72	Qwen2-Math-72B-Instruct (greedy)
Math Word Problem Solving	MATH	Accuracy	84	Qwen2-Math-72B-Instruct (greedy)
Math Word Problem Solving	MATH	Parameters (Billions)	72	Qwen2-Math-72B-Instruct (greedy)
Mathematical Question Answering	MATH	Accuracy	84	Qwen2-Math-72B-Instruct (greedy)
Mathematical Question Answering	MATH	Parameters (Billions)	72	Qwen2-Math-72B-Instruct (greedy)
Mathematical Reasoning	MATH	Accuracy	84	Qwen2-Math-72B-Instruct (greedy)
Mathematical Reasoning	MATH	Parameters (Billions)	72	Qwen2-Math-72B-Instruct (greedy)
Arithmetic Reasoning	GSM8K	Accuracy	96.7	Qwen2-Math-72B-Instruct (greedy)
Arithmetic Reasoning	GSM8K	Parameters (Billions)	72	Qwen2-Math-72B-Instruct (greedy)
