Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Hide-and-Seek: Forcing a Network to be Meticulous for Weakly-supervised Object and Action Localization

Krishna Kumar Singh, Yong Jae Lee

Published: 2017-04-13
Venue: ICCV 2017 10
Tasks: Weakly Supervised Action Localization, Action Localization, Object Localization, Weakly-Supervised Object Localization

Abstract

We propose "Hide-and-Seek", a weakly-supervised framework that aims to improve object localization in images and action localization in videos. Most existing weakly-supervised methods localize only the most discriminative parts of an object rather than all relevant parts, which leads to suboptimal performance. Our key idea is to hide patches in a training image randomly, forcing the network to seek other relevant parts when the most discriminative part is hidden. Our approach only needs to modify the input image and can work with any network designed for object localization. During testing, we do not need to hide any patches. Our Hide-and-Seek approach obtains superior performance compared to previous methods for weakly-supervised object localization on the ILSVRC dataset. We also demonstrate that our framework can be easily extended to weakly-supervised action localization.
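
The patch-hiding idea described in the abstract can be sketched as an input-side augmentation. This is a minimal illustration and not the authors' implementation: the function name `hide_patches`, the regular grid layout, the default `grid_size` and `hide_prob`, and the constant `fill_value` are all assumptions that the abstract does not specify.

```python
import numpy as np


def hide_patches(image, grid_size=4, hide_prob=0.5, fill_value=0.0, rng=None):
    """Return a copy of `image` with randomly chosen grid cells overwritten.

    All parameters are illustrative assumptions, not values from the abstract:
    grid_size  -- the image is split into grid_size x grid_size cells
    hide_prob  -- probability of hiding each cell independently
    fill_value -- value written into hidden cells
    """
    rng = np.random.default_rng() if rng is None else rng
    out = image.copy()
    height, width = image.shape[:2]

    # Cell boundaries; linspace also covers sizes not divisible by grid_size.
    rows = np.linspace(0, height, grid_size + 1).astype(int)
    cols = np.linspace(0, width, grid_size + 1).astype(int)

    # One independent random draw per cell decides whether it is hidden.
    hidden = rng.random((grid_size, grid_size)) < hide_prob

    for i in range(grid_size):
        for j in range(grid_size):
            if hidden[i, j]:
                out[rows[i]:rows[i + 1], cols[j]:cols[j + 1]] = fill_value
    return out, hidden


# Training-time usage: a fresh random mask for every call.
train_image = np.ones((224, 224, 3), dtype=np.float32)
augmented, mask = hide_patches(train_image, rng=np.random.default_rng(0))
```

In line with the abstract, such a function would be applied only to training inputs; at test time the image would be passed to the network unchanged. Because only the input is modified, the sketch is independent of the network architecture.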

Results

| Task | Dataset | Metric | Value | Model |
| Video | THUMOS 2014 | mAP@0.5 | 6.8 | Hide-and-seek |
| Temporal Action Localization | THUMOS 2014 | mAP@0.5 | 6.8 | Hide-and-seek |
| Zero-Shot Learning | THUMOS 2014 | mAP@0.5 | 6.8 | Hide-and-seek |
| Action Localization | THUMOS 2014 | mAP@0.5 | 6.8 | Hide-and-seek |
| Weakly Supervised Action Localization | THUMOS 2014 | mAP@0.5 | 6.8 | Hide-and-seek |

Related Papers

Mask-aware Text-to-Image Retrieval: Referring Expression Segmentation Meets Cross-modal Retrieval (2025-06-28)
VoteSplat: Hough Voting Gaussian Splatting for 3D Scene Understanding (2025-06-28)
RAG-6DPose: Retrieval-Augmented 6D Pose Estimation via Leveraging CAD as Knowledge Base (2025-06-23)
CDP: Towards Robust Autoregressive Visuomotor Policy Learning via Causal Diffusion (2025-06-17)
UAV Object Detection and Positioning in a Mining Industrial Metaverse with Custom Geo-Referenced Data (2025-06-16)
Zero-Shot Temporal Interaction Localization for Egocentric Videos (2025-06-04)
WoMAP: World Models For Embodied Open-Vocabulary Object Localization (2025-06-02)
LLM-powered Query Expansion for Enhancing Boundary Prediction in Language-driven Action Localization (2025-05-30)