GOOD: Exploring Geometric Cues for Detecting Objects in an Open World

Haiwen Huang, Andreas Geiger, Dan Zhang

2022-12-22

Open World Object Detection, Class-agnostic Object Detection, Object Detection

Abstract

We address the task of open-world class-agnostic object detection, i.e., detecting every object in an image by learning from a limited number of base object classes. State-of-the-art RGB-based models overfit to the training classes and often fail to detect novel-looking objects. This is because RGB-based models rely primarily on appearance similarity to detect novel objects and are prone to overfitting to shortcut cues such as textures and discriminative parts. To address these shortcomings of RGB-based object detectors, we propose incorporating geometric cues such as depth and normals, predicted by general-purpose monocular estimators. Specifically, we use the geometric cues to train an object proposal network for pseudo-labeling unannotated novel objects in the training set. Our resulting Geometry-guided Open-world Object Detector (GOOD) significantly improves detection recall for novel object categories and already performs well with only a few training classes. Using a single "person" class for training on the COCO dataset, GOOD surpasses SOTA methods by 5.0% AR@100, a relative improvement of 24%.
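
The pseudo-labeling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a proposal network trained on geometric cues has already produced scored boxes, and it keeps the highest-scoring ones that do not overlap annotated base-class boxes. The function name `select_pseudo_labels` and the parameters `top_k` and `max_iou` (with their default values) are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def select_pseudo_labels(
    proposals: List[Tuple[Box, float]],  # (box, objectness score) pairs
    annotated: List[Box],                # ground-truth boxes of base classes
    top_k: int = 3,                      # hypothetical value, not from the paper
    max_iou: float = 0.3,                # hypothetical value, not from the paper
) -> List[Box]:
    """Keep the top_k highest-scoring proposals that do not overlap annotated boxes."""
    kept: List[Box] = []
    for box, _score in sorted(proposals, key=lambda p: p[1], reverse=True):
        if len(kept) == top_k:
            break
        # A proposal that overlaps an annotated object is already labeled; skip it.
        if all(iou(box, gt) <= max_iou for gt in annotated):
            kept.append(box)
    return kept
```

The selected boxes would then be added to the training set as extra class-agnostic "object" annotations. The sketch assumes the proposals have already been de-duplicated (for example by non-maximum suppression), since it does not check overlap among the proposals themselves.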

Results

Task	Dataset	Metric	Value	Model
Object Detection	COCO VOC to non-VOC	AR100	39.7	GOOD
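
For readers unfamiliar with the metric, the following is a simplified sketch of how an average recall at 100 detections could be computed, assuming the AR100 value above follows the COCO convention of averaging recall over IoU thresholds from 0.50 to 0.95 in steps of 0.05. It is a simplification: the official COCO evaluation matches each detection to at most one ground-truth box, whereas this sketch only looks at the best-overlapping proposal for each ground-truth box. The function name `average_recall` is hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_recall(
    proposals_per_image: List[List[Tuple[Box, float]]],  # (box, score) pairs
    gt_per_image: List[List[Box]],                       # ground-truth boxes
    max_dets: int = 100,
) -> float:
    """Recall averaged over IoU thresholds 0.50:0.05:0.95 (simplified matching)."""
    thresholds = [0.5 + 0.05 * i for i in range(10)]
    best_ious: List[float] = []
    for proposals, gts in zip(proposals_per_image, gt_per_image):
        # Only the max_dets highest-scoring proposals per image are considered.
        ranked = sorted(proposals, key=lambda p: p[1], reverse=True)[:max_dets]
        for gt in gts:
            best_ious.append(max((box_iou(gt, box) for box, _ in ranked), default=0.0))
    if not best_ious:
        return 0.0
    recalls = [sum(v >= t for v in best_ious) / len(best_ious) for t in thresholds]
    return sum(recalls) / len(recalls)
```

In the open-world setting described in the abstract, the ground-truth boxes passed to such a function would be those of the novel (here, non-VOC) classes only, so the score reflects how many unseen-category objects the detector recovers.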
