
This dataset was created to test whether it's possible to build a general-purpose detector that can tell real images apart from fake ones generated by convolutional neural networks (CNNs), no matter which model or dataset was used to create the fake images.

To do this, the authors collected fake images generated by 11 different CNN-based image generation models. These models represent a wide range of current image synthesis techniques and include:

ProGAN

StyleGAN

BigGAN

CycleGAN

StarGAN

GauGAN

DeepFakes

Cascaded Refinement Networks (CRN)

Implicit Maximum Likelihood Estimation (IMLE)

Second-order Attention Network (SAN) for super-resolution

Seeing in the Dark (SITD)

The dataset includes fake images from each of these models and a set of real images, allowing for binary classification (real vs. fake).
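
As a rough illustration of the real-vs-fake labeling, the sketch below walks a directory tree and assigns label 0 to real images and 1 to fake ones. The layout `<root>/<generator>/0_real/` and `<root>/<generator>/1_fake/`, along with the name `collect_samples`, are assumptions made for this example and may not match how a given copy of the dataset is organized.

```python
from pathlib import Path

# Assumed (hypothetical) layout, not confirmed by the dataset description:
#   <root>/<generator>/0_real/<image files>
#   <root>/<generator>/1_fake/<image files>
LABELS = {"0_real": 0, "1_fake": 1}


def collect_samples(root):
    """Return a list of (generator, file_path, label) tuples.

    label is 0 for real images and 1 for fake (generated) images.
    """
    samples = []
    for class_dir, label in LABELS.items():
        # Match every file that sits two levels below root in a class folder.
        for path in sorted(Path(root).glob(f"*/{class_dir}/*")):
            if path.is_file():
                generator = path.parent.parent.name
                samples.append((generator, str(path), label))
    return samples
```

Keeping the generator name alongside each sample makes it straightforward to train on one generator and evaluate on the others separately.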

The study found that a standard image classifier (a ResNet-50 in the original paper) trained on fake images from just one generator (ProGAN), with careful pre- and post-processing and data augmentation, generalized surprisingly well to fake images from other generators with different architectures, training datasets, and training methods. This suggests that many CNN-generated images, even from different architectures, share common systematic flaws that can be learned and detected.
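
The cross-generator evaluation described above can be sketched as scoring each test image with a single detector and then summarizing the results separately for each generator. The example below is a minimal sketch that uses non-interpolated average precision as the summary metric. The function names and the `(generator, label, score)` record format are assumptions for illustration, the scores in the usage lines are made up, and tied scores are not treated specially.

```python
from collections import defaultdict


def average_precision(labels, scores):
    """Non-interpolated average precision.

    labels: 1 for fake, 0 for real. scores: higher means "more likely fake".
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one fake (positive) example")
    # Rank examples from most to least confidently fake.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / n_pos


def per_generator_ap(records):
    """Group (generator, label, score) records and compute AP per generator."""
    grouped = defaultdict(lambda: ([], []))
    for generator, label, score in records:
        grouped[generator][0].append(label)
        grouped[generator][1].append(score)
    return {g: average_precision(l, s) for g, (l, s) in grouped.items()}


# Illustrative, made-up detector scores for two unseen generators.
records = [
    ("stylegan", 1, 0.9), ("stylegan", 0, 0.2),
    ("biggan", 1, 0.6), ("biggan", 0, 0.7),
]
results = per_generator_ap(records)
```

Reporting one number per generator, rather than a single pooled score, shows which unseen generators the detector handles well and which it does not.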

The dataset is useful for research in detecting synthetic media, improving image forensics, and understanding the weaknesses in current generative models.

Code and pre-trained models were made available by the authors (https://github.com/chuangchuangtan/GANGen-Detection).