
Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Toward unsupervised, multi-object discovery in large-scale image collections

Huy V. Vo, Patrick Pérez, Jean Ponce

Published: 2020-07-06
Venue: ECCV 2020
Tasks: Multi-object discovery, Region Proposal, Multi-object colocalization, Object Discovery, Single-object colocalization, Single-object discovery

Abstract

This paper addresses the problem of discovering the objects present in a collection of images without any supervision. We build on the optimization approach of Vo et al. (CVPR'19) with several key novelties: (1) We propose a novel saliency-based region proposal algorithm that achieves significantly higher overlap with ground-truth objects than other competitive methods. This procedure leverages off-the-shelf CNN features trained on classification tasks without any bounding box information, but is otherwise unsupervised. (2) We exploit the inherent hierarchical structure of proposals as an effective regularizer for the approach to object discovery of Vo et al., boosting its performance to significantly improve over the state of the art on several standard benchmarks. (3) We adopt a two-stage strategy to select promising proposals using small random sets of images before using the whole image collection to discover the objects it depicts, allowing us to tackle, for the first time (to the best of our knowledge), the discovery of multiple objects in each one of the pictures making up datasets with up to 20,000 images, an over five-fold increase compared to existing methods, and a first step toward true large-scale unsupervised image interpretation.

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Multi-object colocalization | VOC_all | Detection Rate | 49.4 | rOSD |
| Multi-object colocalization | VOC12 | Detection Rate | 51.5 | rOSD |
| Single-object discovery | Object Discovery | CorLoc | 89.2 | rOSD |
| Single-object discovery | VOC_all | CorLoc | 49.4 | Large-scale rOSD |
| Single-object discovery | VOC_all | CorLoc | 49.3 | rOSD |
| Single-object discovery | VOC12 | CorLoc | 51.9 | Large-scale rOSD |
| Single-object discovery | VOC12 | CorLoc | 51.2 | rOSD |
| Single-object discovery | COCO_20k | CorLoc | 53 | rOSD + CAD |
| Single-object discovery | COCO_20k | CorLoc | 48.5 | rOSD |
| Single-object discovery | VOC_6x2 | CorLoc | 72.5 | rOSD |
| Multi-object discovery | VOC12 | Detection Rate | 41.2 | Large-scale rOSD |
| Multi-object discovery | VOC12 | Detection Rate | 40.4 | rOSD |
| Multi-object discovery | COCO_20k | Detection Rate | 12 | Large-scale rOSD |
| Multi-object discovery | VOC_all | Detection Rate | 38.3 | Large-scale rOSD |
| Multi-object discovery | VOC_all | Detection Rate | 37.6 | rOSD |
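
This page does not define the CorLoc and Detection Rate metrics reported in the results. The sketch below assumes their usual meaning in the object discovery literature: CorLoc is the percentage of images in which at least one predicted box overlaps a ground-truth box with IoU of at least 0.5, and detection rate is the percentage of ground-truth boxes matched by some predicted box at the same threshold. The function names, the box format `(x_min, y_min, x_max, y_max)`, and the 0.5 threshold are illustrative assumptions, not taken from the authors' code.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in continuous coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def corloc(preds: List[List[Box]], gts: List[List[Box]], thr: float = 0.5) -> float:
    """Percentage of images with at least one predicted box matching a ground-truth box."""
    hits = sum(
        any(iou(p, g) >= thr for p in ps for g in gs)
        for ps, gs in zip(preds, gts)
    )
    return 100.0 * hits / len(gts)


def detection_rate(preds: List[List[Box]], gts: List[List[Box]], thr: float = 0.5) -> float:
    """Percentage of ground-truth boxes matched by some predicted box in the same image."""
    total = sum(len(gs) for gs in gts)
    found = sum(
        any(iou(p, g) >= thr for p in ps)
        for ps, gs in zip(preds, gts)
        for g in gs
    )
    return 100.0 * found / total


# Toy data (hypothetical): two images, one predicted box each.
preds = [[(0, 0, 10, 10)], [(0, 0, 5, 5)]]
gts = [[(0, 0, 10, 10), (20, 20, 30, 30)], [(10, 10, 20, 20)]]
print(corloc(preds, gts))          # one of two images is localized
print(detection_rate(preds, gts))  # one of three ground-truth boxes is found
```

Under these assumptions, CorLoc rewards finding any one object per image, which suits the single-object setting, while detection rate counts every ground-truth object and so is the more demanding measure when images contain several objects.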
