XYSquares

Images · MIT · Introduced 2022-01-01

Synthetic dataset intended for benchmarking disentanglement frameworks.

XYSquares is adversarial in nature; the distance between any two observations in the dataset is constant when measured using a pixel-wise distance function. It is usually impossible for VAE frameworks that use pixel-wise losses to disentangle this dataset.

The dataset is constructed from 3 non-overlapping red, green and blue squares that are each $8\times8$ pixels in size. Each square can move along the $x$ and $y$ coordinates of an $8\times8$ grid. The resulting images are $64\times64$ pixels in size. With this construction the dataset has 6 ground-truth factors (an $x$ and a $y$ position for each of the 3 squares), each taking 8 values, for a total of $8^6 = 262144$ observations.
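The construction can be illustrated with a minimal pure-Python sketch. This is not the dataset's reference implementation: the function names `render` and `l1_distance`, the factor ordering, the binary pixel values, and the choice to draw each square in its own colour channel are all assumptions made for illustration. The sketch counts the observations and shows that moving a single square by any non-zero number of grid steps gives the same pixel-wise distance, so the distance carries no information about how far the square moved.

```python
GRID = 8               # grid positions per axis (from the description)
SQUARE = 8             # side length of each square in pixels (from the description)
SIZE = GRID * SQUARE   # image side length: 64 pixels
NUM_SQUARES = 3        # red, green and blue


def render(factors):
    """Render (rx, ry, gx, gy, bx, by) grid positions as a SIZE x SIZE x 3 image.

    Assumptions (hypothetical, for illustration only): pixel values are 0 or 1,
    and each square is drawn in its own colour channel.
    """
    assert len(factors) == 2 * NUM_SQUARES
    assert all(0 <= f < GRID for f in factors)
    img = [[[0, 0, 0] for _ in range(SIZE)] for _ in range(SIZE)]
    for channel in range(NUM_SQUARES):
        gx, gy = factors[2 * channel], factors[2 * channel + 1]
        for row in range(gy * SQUARE, (gy + 1) * SQUARE):
            for col in range(gx * SQUARE, (gx + 1) * SQUARE):
                img[row][col][channel] = 1
    return img


def l1_distance(a, b):
    """Pixel-wise L1 distance: the sum of absolute differences over all values."""
    return sum(
        abs(va - vb)
        for row_a, row_b in zip(a, b)
        for px_a, px_b in zip(row_a, row_b)
        for va, vb in zip(px_a, px_b)
    )


# Six factors with eight values each.
num_observations = GRID ** (2 * NUM_SQUARES)

# Shift only the red square along x and collect the distinct distances to the base image.
base = render((0, 0, 0, 0, 0, 0))
red_x_distances = {
    l1_distance(base, render((x, 0, 0, 0, 0, 0))) for x in range(1, GRID)
}

print(num_observations)
print(red_x_distances)
```

Because the grid spacing equals the square size, a shifted square never overlaps its previous position, which is why the set of distances collapses to a single value in this sketch.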