Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence

Daya Guo, Qihao Zhu, Dejian Yang, Zhenda Xie, Kai Dong, Wentao Zhang, Guanting Chen, Xiao Bi, Y. Wu, Y. K. Li, Fuli Luo, Yingfei Xiong, Wenfeng Liang

2024-01-25 · Large Language Model · Code Generation · Language Modelling
Paper · PDF · Code (official)

Abstract

The rapid development of large language models has revolutionized code intelligence in software development. However, the predominance of closed-source models has restricted extensive research and development. To address this, we introduce the DeepSeek-Coder series, a range of open-source code models with sizes from 1.3B to 33B, trained from scratch on 2 trillion tokens. These models are pre-trained on a high-quality project-level code corpus and employ a fill-in-the-blank task with a 16K window to enhance code generation and infilling. Our extensive evaluations demonstrate that DeepSeek-Coder not only achieves state-of-the-art performance among open-source code models across multiple benchmarks but also surpasses existing closed-source models like Codex and GPT-3.5. Furthermore, DeepSeek-Coder models are under a permissive license that allows for both research and unrestricted commercial use.
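The abstract mentions training with a fill-in-the-blank (fill-in-the-middle) objective, in which the model sees the code before and after a hole and must generate the missing middle. Below is a minimal sketch of how such a prompt is typically assembled; the sentinel token strings are illustrative placeholders, not necessarily the exact special tokens DeepSeek-Coder's tokenizer defines.

```python
# Sketch of fill-in-the-middle (FIM) prompt construction.
# Sentinel names are hypothetical placeholders for whatever
# special tokens the model's tokenizer actually reserves.
FIM_BEGIN = "<fim_begin>"
FIM_HOLE = "<fim_hole>"
FIM_END = "<fim_end>"

def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Wrap the code before and after the hole with sentinels;
    an FIM-trained model generates the missing middle after FIM_END."""
    return f"{FIM_BEGIN}{prefix}{FIM_HOLE}{suffix}{FIM_END}"

prompt = build_fim_prompt(
    "def add(a, b):\n    ",       # code before the hole
    "\n    return result\n",      # code after the hole
)
```

At inference time the completion returned by the model is spliced back into the hole position, which is what makes FIM-trained models useful for editor-style infilling rather than only left-to-right completion.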

Results

| Task | Dataset | Metric | Value | Model |
| --- | --- | --- | --- | --- |
| Code Generation | APPS | Competition Pass@1 | 11.09 | deepseek-ai/deepseek-coder-6.7b-instruct |
| Code Generation | APPS | Interview Pass@1 | 19.7 | deepseek-ai/deepseek-coder-6.7b-instruct |
| Code Generation | APPS | Introductory Pass@1 | 33.8 | deepseek-ai/deepseek-coder-6.7b-instruct |
| Code Generation | MBPP | Accuracy | 80 | GPT-4 (few-shot) |
| Code Generation | MBPP | Accuracy | 70.8 | GPT-3.5 Turbo (few-shot) |
| Code Generation | MBPP | Accuracy | 70 | DeepSeek-Coder-Instruct 33B (few-shot) |
| Code Generation | MBPP | Accuracy | 66 | DeepSeek-Coder-Base 33B (few-shot) |
| Code Generation | MBPP | Accuracy | 65.4 | DeepSeek-Coder-Instruct 6.7B (few-shot) |
| Code Generation | MBPP | Accuracy | 60.6 | DeepSeek-Coder-Base 6.7B (few-shot) |
| Code Generation | MBPP | Accuracy | 49.4 | DeepSeek-Coder-Instruct 1.3B (few-shot) |
| Code Generation | MBPP | Accuracy | 46.2 | DeepSeek-Coder-Base 1.3B (few-shot) |
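The Pass@1 figures above (reported on a 0-100 scale) come from the standard pass@k metric for code generation. The unbiased estimator introduced with HumanEval (Chen et al., 2021) samples n completions per problem, counts the c that pass the unit tests, and estimates the probability that at least one of k draws would pass:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator (Chen et al., 2021).

    n: total completions sampled per problem
    c: number of those completions that pass all unit tests
    k: budget of samples we imagine drawing

    Returns 1 - C(n - c, k) / C(n, k): the probability that a
    random size-k subset of the n samples contains >= 1 passing one.
    """
    if n - c < k:
        # Fewer failing samples than k: every size-k subset
        # must include at least one passing completion.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

For example, with n = 10 samples of which c = 5 pass, pass@1 is 0.5; a benchmark-style score is obtained by averaging this over all problems and multiplying by 100, which is how a value like 33.8 on APPS Introductory should be read.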

Related Papers

- Visual-Language Model Knowledge Distillation Method for Image Quality Assessment (2025-07-21)
- DENSE: Longitudinal Progress Note Generation with Temporal Modeling of Heterogeneous Clinical Notes Across Hospital Visits (2025-07-18)
- CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning (2025-07-18)
- GeoReg: Weight-Constrained Few-Shot Regression for Socio-Economic Estimation using LLM (2025-07-17)
- The Generative Energy Arena (GEA): Incorporating Energy Awareness in Large Language Model (LLM) Human Evaluations (2025-07-17)
- Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities (2025-07-17)
- Rethinking the Embodied Gap in Vision-and-Language Navigation: A Holistic Study of Physical and Visual Disparities (2025-07-17)
- Towards Formal Verification of LLM-Generated Code from Natural Language Prompts (2025-07-17)