Localizing Objects with Self-Supervised Transformers and no Labels

Oriane Siméoni, Gilles Puy, Huy V. Vo, Simon Roburin, Spyros Gidaris, Andrei Bursuc, Patrick Pérez, Renaud Marlet, Jean Ponce

2021-09-29 | Object Discovery | Single-object discovery | Weakly-Supervised Object Localization


Abstract

Localizing objects in image collections without supervision can help to avoid expensive annotation campaigns. We propose a simple approach to this problem that leverages the activation features of a vision transformer pre-trained in a self-supervised manner. Our method, LOST, does not require any external object proposal nor any exploration of the image collection; it operates on a single image. Yet, we outperform state-of-the-art object discovery methods by up to 8 CorLoc points on PASCAL VOC 2012. We also show that training a class-agnostic detector on the discovered objects boosts results by another 7 points. Moreover, we show promising results on the unsupervised object detection task. The code to reproduce our results can be found at https://github.com/valeoai/LOST.

Results

Task	Dataset	Metric	Value	Model
Object Localization	CUB-200-2011	Top-1 Localization Accuracy	71.3	LOST
Single-object discovery	COCO_20k	CorLoc	57.5	LOST + CAD
Single-object discovery	COCO_20k	CorLoc	50.7	LOST
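
The CorLoc values above are percentages of images that count as correctly localized. As a rough illustration of how such a score is commonly computed, the sketch below treats an image as correct when its single predicted box reaches an intersection-over-union (IoU) of at least 0.5 with any ground-truth box. The function names, the `(x1, y1, x2, y2)` box format, and the non-strict `>=` comparison are assumptions made for this example; they are not taken from the LOST repository.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def corloc(predictions, ground_truths, threshold=0.5):
    """Percentage of images whose predicted box matches a ground-truth box.

    predictions:   one box per image (hypothetical input format).
    ground_truths: a list of boxes per image, aligned with predictions.
    threshold:     assumed IoU threshold; 0.5 is the usual convention.
    """
    if not predictions:
        return 0.0
    hits = sum(
        any(iou(pred, gt) >= threshold for gt in gts)
        for pred, gts in zip(predictions, ground_truths)
    )
    return 100.0 * hits / len(predictions)


# Toy data: the first prediction matches exactly, the second misses entirely.
toy_predictions = [(0, 0, 10, 10), (0, 0, 10, 10)]
toy_ground_truths = [[(0, 0, 10, 10)], [(20, 20, 30, 30)]]
print(corloc(toy_predictions, toy_ground_truths))  # prints 50.0
```

Because only one box per image is scored, this measure says nothing about additional objects in the same image, which is why it is reported for the single-object discovery setting.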
