Individualized Deepfake Detection Dataset
The deepfake face detection task takes a facial image of unknown authenticity as its test input. While most deepfake detection methods take only the image as input, our paper shows that conditioning the deepfake detector on identity, i.e., knowing whose face the image might be a deepfake of, can improve detection performance. Existing deepfake detection datasets, such as FaceForensics++ and DFDC, do not include identity information for authentic and deepfake faces.

This dataset contains facial images of 45 specific individuals, divided into train and test sets, with a total of 23k authentic and 22k deepfake images. Because each individual's images appear in both the train and test sets, detection performance can be assessed for that individual. The train and test sets come from two independent sources: the train images are curated from the CelebDFv2 dataset, and the test images are curated from the CACD dataset.

Deepfake faces are generated using FaceswapGAN, with a portion of the training images used to train the reconstruction model. The training deepfake images are reconstructed images of the individual, whereas the test deepfake images are face-swapped with another identity not included in our celebrity list. The deepfake detection method proposed in our paper requires reconstructing both the training and test images, so the reconstructed train and test images are also available in this dataset. Note that reconstructing the training deepfake images produces doubly reconstructed images.
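
Because every image is tied to a known individual, detector output can be scored per person rather than only in aggregate. The following is a minimal sketch of such a per-identity evaluation. It does not reflect the dataset's actual file layout or the method in our paper: the `(identity, is_fake, score)` record format, the function name `per_identity_accuracy`, the 0.5 threshold, and the toy values are all assumptions made for illustration.

```python
from collections import defaultdict


def per_identity_accuracy(records, threshold=0.5):
    """Compute detection accuracy separately for each individual.

    records: iterable of (identity, is_fake, score) tuples, where
    is_fake is 1 for a deepfake and 0 for an authentic image, and
    score is a detector's estimated probability that the image is fake.
    This record format is hypothetical, chosen only for this sketch.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for identity, is_fake, score in records:
        predicted_fake = score >= threshold
        correct[identity] += int(predicted_fake == bool(is_fake))
        total[identity] += 1
    return {name: correct[name] / total[name] for name in total}


# Toy, made-up records; not drawn from the dataset.
records = [
    ("person_a", 0, 0.10),
    ("person_a", 1, 0.80),
    ("person_a", 1, 0.40),
    ("person_b", 0, 0.20),
    ("person_b", 1, 0.90),
]

for name, acc in per_identity_accuracy(records).items():
    print(f"{name}: {acc:.2f}")
```

Reporting one number per individual makes it visible when a detector performs well on average but poorly for particular people, which an aggregate score would hide.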