SAFIM
Syntax-Aware Fill-in-the-Middle
Texts · CC-BY-4.0 · Introduced 2024-03-07
Syntax-Aware Fill-in-the-Middle (SAFIM) is a benchmark for evaluating Large Language Models (LLMs) on the code Fill-in-the-Middle (FIM) task. SAFIM has three subtasks: Algorithmic Block Completion, Control-Flow Expression Completion, and API Function Call Completion. Its examples are drawn from code submitted between April 2022 and January 2023, a window chosen to minimize the impact of data contamination on evaluation results.
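To make the FIM task concrete, here is a minimal sketch of how a single example could be built and checked. It is an illustration only, not SAFIM's actual pipeline: the sentinel strings `<PRE>`, `<SUF>`, `<MID>` and the helper names `mask_span`, `build_psm_prompt`, and `reassemble` are hypothetical, and the masked span is located with a plain string search rather than a syntax-aware selection step.

```python
from dataclasses import dataclass

# Hypothetical sentinel strings; each real model defines its own FIM tokens.
PREFIX_TOKEN = "<PRE>"
SUFFIX_TOKEN = "<SUF>"
MIDDLE_TOKEN = "<MID>"


@dataclass
class FimExample:
    prefix: str  # code before the masked span
    middle: str  # masked span (the reference completion)
    suffix: str  # code after the masked span


def mask_span(code: str, start: int, end: int) -> FimExample:
    """Split code into prefix / middle / suffix around the span [start, end)."""
    if not 0 <= start <= end <= len(code):
        raise ValueError("span out of range")
    return FimExample(code[:start], code[start:end], code[end:])


def build_psm_prompt(ex: FimExample) -> str:
    """Prefix-suffix-middle layout: the model writes the middle after MIDDLE_TOKEN."""
    return f"{PREFIX_TOKEN}{ex.prefix}{SUFFIX_TOKEN}{ex.suffix}{MIDDLE_TOKEN}"


def reassemble(ex: FimExample, completion: str) -> str:
    """Place a model completion back between the prefix and the suffix."""
    return ex.prefix + completion + ex.suffix


# Mask the condition of an `if` statement, in the spirit of the
# Control-Flow Expression Completion subtask.
code = (
    "def is_even(n):\n"
    "    if n % 2 == 0:\n"
    "        return True\n"
    "    return False\n"
)
expr = "n % 2 == 0"
start = code.index(expr)
example = mask_span(code, start, start + len(expr))
prompt = build_psm_prompt(example)
```

Passing `prompt` to a FIM-capable model and feeding its output to `reassemble` yields a complete program that can then be checked, for instance by running it against test inputs.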
- Authors: Linyuan Gong, Sida Wang, Mostafa Elhoushi, Alvin Cheung
- Paper: https://arxiv.org/abs/2403.04814
- Hugging Face Dataset: https://huggingface.co/datasets/gonglinyuan/safim
- Leaderboard: https://safimbenchmark.com
- Code & Submission Instructions: https://github.com/gonglinyuan/safim
The SAFIM benchmark is partially derived from problem descriptions and code solutions on https://codeforces.com. Under the Codeforces license, the texts of Codeforces problems may be published in any open source, provided that a direct link to the site is preserved.