Vulnerability Java Dataset

Texts · Introduced 2024-03-01

The dataset consists of two versions: $X_1$ with $P_3$ and $X_1$ without $P_3$, where $P_3$ represents a set of random unchanged functions from vulnerability fixing commits. This dataset is designed for finetuning large language models to detect vulnerabilities in code. It can be used for training and evaluating models in automated vulnerability detection tasks.
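As a rough illustration of how the two versions relate, the sketch below builds both from a single pool of functions. The record layout (the `code` and `part` fields), the tag values, and the `build_versions` helper are all hypothetical and chosen only for this example; they are not the dataset's documented schema.

```python
from typing import Dict, List

# Hypothetical record layout: the field names "code" and "part" are
# assumptions for illustration, not the dataset's documented schema.
Record = Dict[str, str]


def build_versions(records: List[Record]) -> Dict[str, List[Record]]:
    """Return both dataset versions from one pool of tagged functions."""
    # Functions that belong to the core set X_1.
    x1 = [r for r in records if r["part"] != "P3"]
    # Random unchanged functions from vulnerability fixing commits (P_3).
    p3 = [r for r in records if r["part"] == "P3"]
    return {"X1_without_P3": x1, "X1_with_P3": x1 + p3}


# Toy Java snippets standing in for real dataset entries.
sample = [
    {"code": "void a() {}", "part": "X1"},
    {"code": "void b() {}", "part": "X1"},
    {"code": "void c() {}", "part": "P3"},
]
versions = build_versions(sample)
print(len(versions["X1_without_P3"]), len(versions["X1_with_P3"]))
```

Under these assumptions, the version with $P_3$ is a superset of the version without it, so the two can be compared directly when finetuning or evaluating a model.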

Source: Finetuning Large Language Models for Vulnerability Detection