Break-It-Fix-It: Unsupervised Learning for Program Repair

Michihiro Yasunaga, Percy Liang

Date: 2021-06-11
Tasks: Style Transfer, Program Repair, Unsupervised Machine Translation, Data Augmentation, Code Repair, Domain Adaptation

Abstract

We consider repair tasks: given a critic (e.g., compiler) that assesses the quality of an input, the goal is to train a fixer that converts a bad example (e.g., code with syntax errors) into a good one (e.g., code with no syntax errors). Existing works create training data consisting of (bad, good) pairs by corrupting good examples using heuristics (e.g., dropping tokens). However, fixers trained on this synthetically-generated data do not extrapolate well to the real distribution of bad inputs. To bridge this gap, we propose a new training approach, Break-It-Fix-It (BIFI), which has two key ideas: (i) we use the critic to check a fixer's output on real bad inputs and add good (fixed) outputs to the training data, and (ii) we train a breaker to generate realistic bad code from good code. Based on these ideas, we iteratively update the breaker and the fixer while using them in conjunction to generate more paired data. We evaluate BIFI on two code repair datasets: GitHub-Python, a new dataset we introduce where the goal is to repair Python code with AST parse errors; and DeepFix, where the goal is to repair C code with compiler errors. BIFI outperforms existing methods, obtaining 90.5% repair accuracy on GitHub-Python (+28.5%) and 71.7% on DeepFix (+5.6%). Notably, BIFI does not require any labeled data; we hope it will be a strong starting point for unsupervised learning of various repair tasks.
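
The data flow of one BIFI round, as described in the abstract, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `critic`, `bifi_round`, `toy_fixer`, and `toy_breaker` are hypothetical, the fixer and breaker are stand-in string functions rather than trained models, and filtering the breaker's outputs with the critic is an assumption of this sketch. The critic uses Python's `ast.parse`, mirroring the AST-parse criterion mentioned for GitHub-Python.

```python
import ast
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (bad code, good code)


def critic(code: str) -> bool:
    """Return True if the code parses, i.e. it counts as 'good'."""
    try:
        ast.parse(code)
        return True
    except (SyntaxError, ValueError):
        return False


def bifi_round(
    fixer: Callable[[str], str],
    breaker: Callable[[str], str],
    real_bad: List[str],
    real_good: List[str],
) -> List[Pair]:
    """Collect (bad, good) pairs for one round; retraining is out of scope here."""
    pairs: List[Pair] = []
    # Idea (i): run the fixer on real bad inputs and keep only outputs
    # that the critic accepts as good.
    for bad in real_bad:
        candidate = fixer(bad)
        if critic(candidate):
            pairs.append((bad, candidate))
    # Idea (ii): run the breaker on good code to produce bad code.
    # Assumption of this sketch: outputs that still parse are discarded.
    for good in real_good:
        candidate = breaker(good)
        if not critic(candidate):
            pairs.append((candidate, good))
    return pairs


# Hypothetical stand-ins for the learned models.
def toy_fixer(code: str) -> str:
    return code + ")"  # only repairs a single missing closing parenthesis


def toy_breaker(code: str) -> str:
    return code[:-1]  # drops the final character


pairs = bifi_round(
    toy_fixer,
    toy_breaker,
    real_bad=["print(1", "x = = 1"],
    real_good=["print(2)", "y = 10"],
)
for bad, good in pairs:
    print(repr(bad), "->", repr(good))
```

In a full system, the collected pairs would be used to retrain both the breaker and the fixer before the next round, which is the iterative update the abstract refers to. In this toy run, the unfixable input `x = = 1` and the still-valid breaker output `y = 1` are both filtered out by the critic.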

Results

Task               Dataset        Metric                Value  Model
Program Synthesis  GitHub-Python  Accuracy (%)          90.5   Transformer + BIFI
Program Synthesis  GitHub-Python  Accuracy (%)          62     Transformer
Program Synthesis  DeepFix        Average Success Rate  71.7   DrRepair + BIFI
Program Repair     GitHub-Python  Accuracy (%)          90.5   Transformer + BIFI
Program Repair     GitHub-Python  Accuracy (%)          62     Transformer
Program Repair     DeepFix        Average Success Rate  71.7   DrRepair + BIFI
