CodeInstruct
InstructCoder, CodeInstruct
Texts
Introduced 2023-05-23
InstructCoder is the first dataset designed to adapt LLMs for general code editing. It consists of over 100k instruction-input-output triplets, generated by ChatGPT, that cover multiple distinct code editing scenarios. LLaMA-33B fine-tuned on InstructCoder performs on par with ChatGPT on a real-world test set derived from GitHub commits.
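To make the instruction-input-output triplet concrete, here is a minimal sketch of how one such record might be represented and turned into a fine-tuning prompt. The class name `EditTriplet`, the field names, the prompt layout, and the sample record are all assumptions made for illustration; they are not taken from the dataset's actual schema or contents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EditTriplet:
    """Hypothetical container for one code-editing example."""
    instruction: str  # natural-language description of the requested edit
    input: str        # code before the edit
    output: str       # code after the edit (the training target)


def to_prompt(triplet: EditTriplet) -> str:
    """Build a prompt from the instruction and input; the output is withheld
    so it can serve as the completion target. The layout is an assumption."""
    return (
        f"Instruction:\n{triplet.instruction}\n\n"
        f"Input:\n{triplet.input}\n\n"
        "Output:\n"
    )


# Made-up record for illustration only; not drawn from the dataset.
example = EditTriplet(
    instruction="Rename the function `add` to `sum_two`.",
    input="def add(a, b):\n    return a + b\n",
    output="def sum_two(a, b):\n    return a + b\n",
)

prompt = to_prompt(example)
```

In this sketch the model would be trained to produce `example.output` as the continuation of `prompt`; other prompt templates would work equally well.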