Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Learning Deep Features for Discriminative Localization

Bolei Zhou, Aditya Khosla, Agata Lapedriza, Aude Oliva, Antonio Torralba

2015-12-14 | CVPR 2016 6 | Object Localization | Weakly-Supervised Object Localization

Abstract

In this work, we revisit the global average pooling layer proposed in [13], and shed light on how it explicitly enables the convolutional neural network to have remarkable localization ability despite being trained on image-level labels. While this technique was previously proposed as a means for regularizing training, we find that it actually builds a generic localizable deep representation that can be applied to a variety of tasks. Despite the apparent simplicity of global average pooling, we are able to achieve 37.1% top-5 error for object localization on ILSVRC 2014, which is remarkably close to the 34.2% top-5 error achieved by a fully supervised CNN approach. We demonstrate that our network is able to localize the discriminative image regions on a variety of tasks despite not being trained for them.
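The idea in the abstract can be illustrated with a small NumPy sketch. It assumes the common setup in which the last convolutional layer produces K feature maps, global average pooling reduces each map to one number, and a single linear layer (no bias here) maps those K numbers to class scores. Under that assumption, reusing one class's weights to sum the feature maps before pooling gives a spatial map for that class. The function names, shapes, and random inputs below are hypothetical and chosen only for illustration; this is not the authors' code.

```python
import numpy as np


def global_average_pool(features):
    """Average each feature map over its spatial dimensions: (K, H, W) -> (K,)."""
    return features.mean(axis=(1, 2))


def class_scores(features, weights):
    """Linear classifier on the pooled features; weights has shape (C, K)."""
    return weights @ global_average_pool(features)


def class_activation_map(features, weights, class_idx):
    """Weighted sum of the K feature maps using one class's weights: (H, W)."""
    return np.tensordot(weights[class_idx], features, axes=([0], [0]))


# Toy example with random values (hypothetical shapes, not the paper's networks).
rng = np.random.default_rng(0)
features = rng.random((4, 7, 7))        # K=4 feature maps of size 7x7
weights = rng.standard_normal((3, 4))   # C=3 classes

scores = class_scores(features, weights)
top_class = int(np.argmax(scores))
cam = class_activation_map(features, weights, top_class)

# Pooling and the classifier are both linear, so the spatial mean of the map
# equals the class score (this sketch has no bias term).
assert np.isclose(cam.mean(), scores[top_class])

# The peak of the map is a crude estimate of the most discriminative location.
peak_y, peak_x = np.unravel_index(np.argmax(cam), cam.shape)
```

In practice the map would typically be upsampled to the input image size and thresholded to obtain a bounding box; those steps are omitted here because the abstract does not specify them.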

Results

Task                 Dataset         Metric                        Value   Model
Object Localization  ILSVRC 2015     Top-1 Error Rate              67.19   AlexNet-GAP
Object Localization  ILSVRC 2016     Top-5 Error                   45.14   VGGnet-GAP
Object Localization  ILSVRC 2016     Top-5 Error                   52.16   AlexNet-GAP
Object Localization  Tiny ImageNet   Top-1 Localization Accuracy   40.55   CAM

Related Papers

Mask-aware Text-to-Image Retrieval: Referring Expression Segmentation Meets Cross-modal Retrieval (2025-06-28)
VoteSplat: Hough Voting Gaussian Splatting for 3D Scene Understanding (2025-06-28)
RAG-6DPose: Retrieval-Augmented 6D Pose Estimation via Leveraging CAD as Knowledge Base (2025-06-23)
CDP: Towards Robust Autoregressive Visuomotor Policy Learning via Causal Diffusion (2025-06-17)
UAV Object Detection and Positioning in a Mining Industrial Metaverse with Custom Geo-Referenced Data (2025-06-16)
WoMAP: World Models For Embodied Open-Vocabulary Object Localization (2025-06-02)
Multispectral Detection Transformer with Infrared-Centric Sensor Fusion (2025-05-21)
Ground-V: Teaching VLMs to Ground Complex Instructions in Pixels (2025-05-20)