ImageNet-D
Introduced 2024-03-27
ImageNet-D contains 4,835 test images featuring diverse backgrounds (3,764), textures (498), and materials (573). Because its images are generated by diffusion models, ImageNet-D achieves higher image fidelity and collection efficiency than prior studies. Evaluation results show that ImageNet-D causes a significant accuracy drop across a range of vision models, from the standard ResNet visual classifier to the latest foundation models such as CLIP and MiniGPT-4, reducing their accuracy by up to 60%.
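As a quick sanity check on the composition stated above, the per-category counts can be summed and converted to shares of the test set. This is only a minimal sketch: the counts come from the description, while the names `CATEGORY_COUNTS` and `category_shares` are illustrative and not part of any official ImageNet-D tooling.

```python
# Per-category image counts as stated in the dataset description.
# The dictionary and function names are hypothetical, for illustration only.
CATEGORY_COUNTS = {
    "background": 3764,
    "texture": 498,
    "material": 573,
}


def category_shares(counts):
    """Return each category's fraction of the total image count."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


total_images = sum(CATEGORY_COUNTS.values())
shares = category_shares(CATEGORY_COUNTS)

print(f"total images: {total_images}")  # 3764 + 498 + 573 = 4835
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

The sum matches the stated total of 4,835, and the shares show that background variations make up most of the test set, which is worth keeping in mind when reading a single aggregate accuracy figure.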