RES-Q
RES-Q: Evaluating Code-Editing Large Language Model Systems at the Repository Scale
Texts | MIT | Introduced 2024-06-24
RES-Q is a natural language instruction-based benchmark for evaluating Repository Editing Systems. It consists of 100 handcrafted repository editing tasks derived from real GitHub commits. Given an edit instruction and a code repository, RES-Q evaluates an LLM system’s ability to interpret the instruction, gather information, and construct appropriate edits to the repository.
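To make the task shape concrete, here is a minimal Python sketch of how a benchmark of this kind could be scored. It is an illustration only, not the RES-Q harness: the `EditTask` fields, the in-memory `Repo` mapping, the per-task `check` callable, and the `evaluate` function are all hypothetical names, and the pass/fail criterion is an assumption rather than something stated above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical representation: a repository as a mapping of path -> file contents.
Repo = Dict[str, str]


@dataclass
class EditTask:
    """One repository editing task (field names are illustrative assumptions)."""
    instruction: str               # natural language edit instruction
    repo: Repo                     # repository state before the edit
    check: Callable[[Repo], bool]  # assumed pass/fail check on the edited repo


def evaluate(system: Callable[[str, Repo], Repo], tasks: List[EditTask]) -> float:
    """Return the fraction of tasks whose edited repository passes its check."""
    if not tasks:
        return 0.0
    passed = 0
    for task in tasks:
        # Hand the system a copy so the task's original repository is untouched.
        edited = system(task.instruction, dict(task.repo))
        if task.check(edited):
            passed += 1
    return passed / len(tasks)


# Toy usage: a single made-up task and a trivial "system" that renames a function.
tasks = [
    EditTask(
        instruction="Rename the function greet to hello in app.py",
        repo={"app.py": "def greet():\n    return 'hi'\n"},
        check=lambda repo: "def hello" in repo["app.py"],
    )
]


def rename_system(instruction: str, repo: Repo) -> Repo:
    # Stand-in for an LLM system; it ignores the instruction and applies a fixed edit.
    return {path: text.replace("greet", "hello") for path, text in repo.items()}


score = evaluate(rename_system, tasks)
```

In this sketch the system under test is any callable that takes an instruction and a repository and returns an edited repository, which keeps the scoring loop independent of how the system gathers information or builds its edits.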