GENESIS-V2: Inferring Unordered Object Representations without Iterative Refinement

Martin Engelcke, Oiwi Parker Jones, Ingmar Posner

2021-04-20 | NeurIPS 2021 12
Tasks: Unsupervised Image Segmentation, Representation Learning, Scene Generation, Clustering, Image Generation, Unsupervised Object Segmentation, Image Segmentation


Abstract

Advances in unsupervised learning of object representations have culminated in the development of a broad range of methods for unsupervised object segmentation and interpretable object-centric scene generation. These methods, however, are limited to simulated and real-world datasets with limited visual complexity. Moreover, object representations are often inferred using either RNNs, which do not scale well to large images, or iterative refinement, which avoids imposing an unnatural ordering on objects in an image but requires the a priori initialisation of a fixed number of object representations. In contrast to established paradigms, this work proposes an embedding-based approach in which embeddings of pixels are clustered in a differentiable fashion using a stochastic stick-breaking process. Similar to iterative refinement, this clustering procedure also leads to randomly ordered object representations, but without the need to initialise a fixed number of clusters a priori. This approach is used to develop a new model, GENESIS-v2, which can infer a variable number of object representations without using RNNs or iterative refinement. We show that GENESIS-v2 performs strongly in comparison to recent baselines in terms of unsupervised image segmentation and object-centric scene generation on established synthetic datasets as well as more complex real-world datasets.
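To make the clustering idea in the abstract more concrete, the following is a minimal sketch of a stochastic stick-breaking procedure over pixel embeddings. It is an illustration under stated assumptions, not the authors' implementation: the function name `stick_breaking_masks`, the Gaussian similarity kernel, the `sigma` bandwidth, and the `max_steps` cap are all hypothetical choices made here, and the image is assumed to be flattened into a list of per-pixel embedding vectors. Each step picks a random seed pixel among those not yet explained, assigns a soft mask based on similarity to that seed, and removes the claimed portion from the remaining "scope", so the masks sum to one at every pixel and their order is random.

```python
import math
import random


def stick_breaking_masks(embeddings, max_steps=4, sigma=1.0, seed=None):
    """Soft-cluster per-pixel embeddings with a stochastic stick-breaking process.

    embeddings: list of equal-length float vectors, one per pixel (flattened image).
    max_steps:  upper bound on the number of masks (an assumption of this sketch).
    Returns a list of masks; each mask is a list with one weight per pixel.
    """
    rng = random.Random(seed)
    n = len(embeddings)
    scope = [1.0] * n  # fraction of each pixel that is still unexplained
    masks = []
    for _ in range(max_steps - 1):
        # Sample a seed pixel: uniform noise weighted by the remaining scope,
        # so already-explained pixels are unlikely to be picked again.
        scores = [rng.random() * s for s in scope]
        idx = max(range(n), key=scores.__getitem__)
        seed_vec = embeddings[idx]
        # Similarity of every pixel to the seed (Gaussian kernel, assumed here).
        alpha = [
            math.exp(-sum((a - b) ** 2 for a, b in zip(e, seed_vec)) / (2 * sigma ** 2))
            for e in embeddings
        ]
        # Break off a piece of the "stick" for this cluster.
        masks.append([s * a for s, a in zip(scope, alpha)])
        scope = [s * (1.0 - a) for s, a in zip(scope, alpha)]
    masks.append(scope)  # whatever remains goes to the final mask
    return masks


if __name__ == "__main__":
    # Two well-separated groups of pixel embeddings.
    pixels = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
    for mask in stick_breaking_masks(pixels, max_steps=3, sigma=0.5, seed=0):
        print([round(v, 3) for v in mask])
```

Because every operation above is a smooth function of the embeddings apart from the choice of seed pixel, the same construction can be written in an automatic-differentiation framework. In this sketch the number of masks is capped by `max_steps` for simplicity, whereas the abstract describes inferring a variable number of object representations.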

Results

Task	Dataset	Metric	Value	Model
Image Generation	ShapeStacks	FID	112.7	GENESIS-V2
Image Generation	ShapeStacks	FID	186.8	GENESIS
Image Generation	ShapeStacks	FID	197.8	MONET-G
Image Generation	ObjectsRoom	FID	52.6	GENESIS-V2
Image Generation	ObjectsRoom	FID	62.8	GENESIS
Image Generation	ObjectsRoom	FID	205.7	MONET-G
Instance Segmentation	Shelf&Tote Training Dataset	ARI	0.55	GENESIS-V2
Instance Segmentation	Shelf&Tote Training Dataset	ARI	0.11	MONET-G
Instance Segmentation	Shelf&Tote Training Dataset	ARI	0.04	GENESIS
Instance Segmentation	Shelf&Tote Training Dataset	ARI	0.03	SlotAttention
Instance Segmentation	ShapeStacks	ARI-FG	0.81	GENESIS-V2
Instance Segmentation	ShapeStacks	ARI-FG	0.76	SlotAttention
Instance Segmentation	ShapeStacks	ARI-FG	0.7	GENESIS
Instance Segmentation	ShapeStacks	ARI-FG	0.7	MONET-G
Instance Segmentation	ObjectsRoom	ARI-FG	0.84	GENESIS-V2
Instance Segmentation	ObjectsRoom	ARI-FG	0.79	SlotAttention
Instance Segmentation	ObjectsRoom	ARI-FG	0.63	GENESIS
Instance Segmentation	ObjectsRoom	ARI-FG	0.54	MONET-G
Unsupervised Object Segmentation	Shelf&Tote Training Dataset	ARI	0.55	GENESIS-V2
Unsupervised Object Segmentation	Shelf&Tote Training Dataset	ARI	0.11	MONET-G
Unsupervised Object Segmentation	Shelf&Tote Training Dataset	ARI	0.04	GENESIS
Unsupervised Object Segmentation	Shelf&Tote Training Dataset	ARI	0.03	SlotAttention
Unsupervised Object Segmentation	ShapeStacks	ARI-FG	0.81	GENESIS-V2
Unsupervised Object Segmentation	ShapeStacks	ARI-FG	0.76	SlotAttention
Unsupervised Object Segmentation	ShapeStacks	ARI-FG	0.7	GENESIS
Unsupervised Object Segmentation	ShapeStacks	ARI-FG	0.7	MONET-G
Unsupervised Object Segmentation	ObjectsRoom	ARI-FG	0.84	GENESIS-V2
Unsupervised Object Segmentation	ObjectsRoom	ARI-FG	0.79	SlotAttention
Unsupervised Object Segmentation	ObjectsRoom	ARI-FG	0.63	GENESIS
Unsupervised Object Segmentation	ObjectsRoom	ARI-FG	0.54	MONET-G
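The segmentation rows above report the Adjusted Rand Index (ARI), which scores agreement between a predicted and a reference pixel grouping independently of how the segments are labelled or ordered. The sketch below computes it from the standard pair-counting formula. The `ari_fg` helper assumes that ARI-FG means ARI restricted to pixels whose reference label is not background; that reading, the function names, and the `background=0` convention are assumptions of this example rather than details stated on this page.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(true_labels, pred_labels):
    """Adjusted Rand Index between two flat label sequences of equal length."""
    n = len(true_labels)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: treat as trivially perfect agreement
    # Pairs grouped together in both labelings, in the reference, and in the prediction.
    together_both = sum(comb(c, 2) for c in Counter(zip(true_labels, pred_labels)).values())
    together_true = sum(comb(c, 2) for c in Counter(true_labels).values())
    together_pred = sum(comb(c, 2) for c in Counter(pred_labels).values())
    expected = together_true * together_pred / total_pairs
    maximum = 0.5 * (together_true + together_pred)
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both labelings put everything in one segment
    return (together_both - expected) / (maximum - expected)


def ari_fg(true_labels, pred_labels, background=0):
    """ARI over foreground pixels only (assumed meaning of ARI-FG)."""
    keep = [i for i, t in enumerate(true_labels) if t != background]
    return adjusted_rand_index(
        [true_labels[i] for i in keep],
        [pred_labels[i] for i in keep],
    )


if __name__ == "__main__":
    reference = [0, 0, 1, 1, 2, 2]   # 0 = background
    predicted = [5, 6, 3, 3, 4, 4]   # label ids differ, grouping of objects matches
    print(adjusted_rand_index(reference, predicted))
    print(ari_fg(reference, predicted))
```

A score of 1 indicates identical groupings and a score near 0 indicates agreement no better than chance, which is why the metric suits models whose object representations come out in random order.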
