CodeGen

Natural Language Processing · Introduced 2022 · 25 papers

Description

CodeGen is an autoregressive transformer trained with a next-token-prediction language-modeling objective on a natural-language corpus and on programming-language data curated from GitHub.
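The next-token-prediction objective and the autoregressive decoding loop it enables can be sketched with a toy stand-in model. The bigram logits table and mini-vocabulary below are purely illustrative assumptions, not CodeGen's actual transformer architecture; the point is only the shape of the training loss and the generate-one-token-feed-it-back loop:

```python
import numpy as np

# Hypothetical mini-vocabulary and a fixed "model": a table of bigram logits.
# (CodeGen itself conditions on the full prefix with a transformer; a bigram
# table is the smallest stand-in that still supports the same loop.)
vocab = ["def", "add", "(", "a", ",", "b", ")", ":", "return", "+", "<eos>"]
V = len(vocab)
rng = np.random.default_rng(0)
logits_table = rng.normal(size=(V, V))  # row i = next-token logits after token i

def next_token_logits(token_id):
    return logits_table[token_id]

def greedy_decode(start_id, max_len=5):
    """Autoregressive generation: predict a token, append it, repeat."""
    ids = [start_id]
    for _ in range(max_len):
        nxt = int(np.argmax(next_token_logits(ids[-1])))
        ids.append(nxt)
        if vocab[nxt] == "<eos>":
            break
    return [vocab[i] for i in ids]

def nll(sequence_ids):
    """Next-token-prediction loss: mean cross-entropy of each token
    given the token before it (the quantity training minimizes)."""
    total = 0.0
    for prev, tgt in zip(sequence_ids, sequence_ids[1:]):
        z = next_token_logits(prev)
        log_probs = z - np.log(np.sum(np.exp(z)))  # log-softmax
        total -= log_probs[tgt]
    return total / (len(sequence_ids) - 1)
```

Training lowers `nll` over the corpus; at inference time the same model is run through `greedy_decode` (or sampling) to emit code token by token.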

Papers Using This Method

How to Select Pre-Trained Code Models for Reuse? A Learning Perspective (2025-01-07)
Does Your Neural Code Completion Model Use My Code? A Membership Inference Approach (2024-04-22)
CodeFort: Robust Training for Code Generation Models (2024-04-11)
Bugs in Large Language Models Generated Code: An Empirical Study (2024-03-13)
LLaMoCo: Instruction Tuning of Large Language Models for Optimization Code Generation (2024-03-02)
HumanEval on Latest GPT Models -- 2024 (2024-02-20)
CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion (2023-10-17)
Functional Overlap Reranking for Neural Code Generation (2023-10-16)
Large Language Model-Aware In-Context Learning for Code Generation (2023-10-15)
BioCoder: A Benchmark for Bioinformatics Code Generation with Large Language Models (2023-08-31)
COMEX: A Tool for Generating Customized Source Code Representations (2023-07-10)
Exploring the Robustness of Large Language Models for Solving Programming Problems (2023-06-26)
Fine-Tuning Large Language Models for Answering Programming Questions with Code Snippets (2023-06-26)
How Effective Are Neural Networks for Fixing Security Vulnerabilities (2023-05-29)
Prompting with Pseudo-Code Instructions (2023-05-19)
Using Large Language Models to Generate JUnit Tests: An Empirical Study (2023-04-30)
Learning Performance-Improving Code Edits (2023-02-15)
Large Language Models for Code: Security Hardening and Adversarial Testing (2023-02-10)
Execution-Based Evaluation for Open-Domain Code Generation (2022-12-20)
ReCode: Robustness Evaluation of Code Generation Models (2022-12-20)