IllusionMNIST_test

Dataset Characteristics

IllusionMNIST_test is a synthetic dataset derived from MNIST. It introduces pareidolia, a phenomenon in which patterns (often faces) are perceived in random or abstract stimuli. The dataset has 11 classes: the 10 original MNIST digits plus an additional "No Illusion" class. It contains 1,219 samples, all synthetically generated rather than real-world images.

Motivations and Content Summary

The images were generated with ControlNet, and the captions were produced by four large language models (LLMs). The goal is to introduce visual illusions and pareidolia into an otherwise familiar domain, so that models are challenged to reason about visual illusions. The dataset extends MNIST by combining the simplicity of handwritten digits with the complexity of human perception.

Potential Use Cases

Illusory VQA: Questioning models about the illusions present in the images.
Multimodal Model Evaluation: Benchmarking multimodal models' ability to interpret and reason about abstract and illusory patterns.
Perceptual Studies: Understanding how AI models perceive and classify pareidolia, a phenomenon deeply tied to human visual perception.
Synthetic Data Research: Exploring the use of generated datasets to introduce unconventional challenges to machine learning models.
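
As a rough illustration of the evaluation use case, the sketch below scores free-form model answers against the 11 classes and reports per-class accuracy. The label strings, the `normalize` helper, and the accepted spellings of the "No Illusion" answer are assumptions made for this example; they are not taken from the dataset's actual schema or from any official evaluation script.

```python
from collections import Counter

# Assumed label strings: the ten digits plus the "No Illusion" class.
LABELS = [str(d) for d in range(10)] + ["No Illusion"]

def normalize(answer):
    """Map a free-form model answer to one of the 11 labels, or None.

    Hypothetical helper: the accepted spellings are illustrative only.
    """
    text = answer.strip().lower()
    if text in {"no illusion", "none", "no"}:
        return "No Illusion"
    if text in LABELS:  # single-digit answers such as "3"
        return text
    return None  # unparseable answers count as wrong

def per_class_accuracy(gold, predicted):
    """Return {label: accuracy} for every label present in `gold`."""
    totals = Counter(gold)
    correct = Counter(
        g for g, p in zip(gold, predicted) if normalize(p) == g
    )
    return {
        label: correct[label] / totals[label]
        for label in LABELS
        if totals[label]
    }

def overall_accuracy(gold, predicted):
    """Fraction of samples whose normalized prediction matches the gold label."""
    hits = sum(normalize(p) == g for g, p in zip(gold, predicted))
    return hits / len(gold)
```

Reporting accuracy per class rather than a single number makes it easier to see whether a model systematically misses the "No Illusion" class or hallucinates digits where none were embedded.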