Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Object-Centric Learning with Slot Attention

Francesco Locatello, Dirk Weissenborn, Thomas Unterthiner, Aravindh Mahendran, Georg Heigold, Jakob Uszkoreit, Alexey Dosovitskiy, Thomas Kipf

2020-06-26 | NeurIPS 2020 12 | Object Discovery

Abstract

Learning object-centric representations of complex scenes is a promising step towards enabling efficient abstract reasoning from low-level perceptual features. Yet, most deep learning approaches learn distributed representations that do not capture the compositional properties of natural scenes. In this paper, we present the Slot Attention module, an architectural component that interfaces with perceptual representations such as the output of a convolutional neural network and produces a set of task-dependent abstract representations which we call slots. These slots are exchangeable and can bind to any object in the input by specializing through a competitive procedure over multiple rounds of attention. We empirically demonstrate that Slot Attention can extract object-centric representations that enable generalization to unseen compositions when trained on unsupervised object discovery and supervised property prediction tasks.
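
The competitive, multi-round attention described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give these details, so the following are assumptions made for illustration: the softmax is taken over the slot axis (so slots compete for each input feature), each slot is then set to a weighted mean of the value vectors, and the projections `w_q`, `w_k`, `w_v` are fixed random matrices standing in for learned weights. The function name `slot_attention_sketch` and its parameters are hypothetical. The published module uses a learned recurrent update for the slots; this sketch swaps that for a direct overwrite to stay short, so it should not be read as the authors' implementation.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def slot_attention_sketch(inputs, num_slots=4, num_iters=3, seed=0, eps=1e-8):
    """Simplified, untrained sketch of iterative slot competition.

    inputs: array of shape (n_inputs, dim), e.g. flattened CNN features.
    Returns (slots, attn) with shapes (num_slots, dim) and (n_inputs, num_slots).
    """
    rng = np.random.default_rng(seed)
    n, d = inputs.shape

    # Assumption: random projections stand in for learned linear maps.
    w_q, w_k, w_v = (rng.normal(scale=d ** -0.5, size=(d, d)) for _ in range(3))

    # Slots start from random noise, so they are interchangeable at the outset.
    slots = rng.normal(size=(num_slots, d))

    k = inputs @ w_k  # keys, shape (n, d)
    v = inputs @ w_v  # values, shape (n, d)

    for _ in range(num_iters):
        q = slots @ w_q                    # queries, shape (num_slots, d)
        logits = k @ q.T / np.sqrt(d)      # shape (n, num_slots)
        # Softmax over the slot axis: each input distributes one unit of
        # attention across slots, which makes the slots compete.
        attn = softmax(logits, axis=1)
        # Normalize over inputs so each slot takes a weighted mean of values.
        weights = attn / (attn.sum(axis=0, keepdims=True) + eps)
        # Simplification: overwrite slots directly instead of a learned update.
        slots = weights.T @ v              # shape (num_slots, d)

    return slots, attn
```

As a usage illustration, calling `slot_attention_sketch(features, num_slots=3)` on a `(16, 8)` feature array returns three slot vectors of dimension 8 plus a `(16, 3)` attention map whose rows each sum to 1. Because the weights are random and untrained, the resulting slots carry no object-level meaning; the sketch only shows the data flow.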

Related Papers

When Does Pruning Benefit Vision Representations? (2025-07-02)
FORLA: Federated Object-centric Representation Learning with Slot Attention (2025-06-03)
Binding threshold units with artificial oscillatory neurons (2025-05-06)
Hierarchical Compact Clustering Attention (COCA) for Unsupervised Object-Centric Learning (2025-05-04)
Are We Done with Object-Centric Learning? (2025-04-09)
CTRL-O: Language-Controllable Object-Centric Visual Representation Learning (2025-03-27)
xMOD: Cross-Modal Distillation for 2D/3D Multi-Object Discovery from 2D motion (2025-03-19)
OV-SCAN: Semantically Consistent Alignment for Novel Object Discovery in Open-Vocabulary 3D Object Detection (2025-03-09)