
Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

MonoGRNet: A Geometric Reasoning Network for Monocular 3D Object Localization

Zengyi Qin, Jinglu Wang, Yan Lu

Date: 2018-11-26
Tasks: Monocular 3D Object Detection, Scene Understanding, Object Localization, Depth Estimation, 2D Object Detection, object-detection, Monocular 3D Object Localization, 3D Object Detection, Object Detection

Abstract

Detecting and localizing objects in real 3D space plays a crucial role in scene understanding, but it is particularly challenging given only a single RGB image, because geometric information is lost during image projection. We propose MonoGRNet for amodal 3D object detection from a monocular RGB image via geometric reasoning in both the observed 2D projection and the unobserved depth dimension. MonoGRNet is a single, unified network composed of four task-specific subnetworks, responsible for 2D object detection, instance depth estimation (IDE), 3D localization, and local corner regression. Unlike pixel-level depth estimation, which needs per-pixel annotations, our novel IDE method directly predicts the depth of the target 3D bounding box's center using sparse supervision. 3D localization is then achieved by estimating the position in the horizontal and vertical dimensions. Finally, MonoGRNet is jointly learned by optimizing the locations and poses of the 3D bounding boxes in the global context. We demonstrate that MonoGRNet achieves state-of-the-art performance on challenging datasets.
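The geometric step described in the abstract (a predicted center depth combined with a horizontal and vertical position estimate) can be illustrated with a standard pinhole back-projection. The sketch below is not the authors' implementation. The function names `back_project` and `project`, the intrinsics `fx`, `fy`, `cx`, `cy`, and the numeric values are all assumptions chosen for illustration. It only shows how an image-plane point plus a depth determines a 3D position in camera coordinates.

```python
from typing import Tuple


def back_project(u: float, v: float, depth: float,
                 fx: float, fy: float, cx: float, cy: float
                 ) -> Tuple[float, float, float]:
    """Lift pixel (u, v) with a known depth to camera coordinates.

    Assumes an ideal pinhole camera (no lens distortion) with focal
    lengths fx, fy and principal point (cx, cy), all in pixels.
    """
    if depth <= 0:
        raise ValueError("depth must be positive")
    x = (u - cx) * depth / fx  # horizontal position
    y = (v - cy) * depth / fy  # vertical position
    return (x, y, depth)


def project(x: float, y: float, z: float,
            fx: float, fy: float, cx: float, cy: float
            ) -> Tuple[float, float]:
    """Inverse of back_project: camera coordinates to pixel coordinates."""
    if z <= 0:
        raise ValueError("point must be in front of the camera")
    return (fx * x / z + cx, fy * y / z + cy)


if __name__ == "__main__":
    # Hypothetical intrinsics and prediction, for illustration only.
    fx, fy, cx, cy = 700.0, 700.0, 600.0, 180.0
    center_3d = back_project(740.0, 215.0, 20.0, fx, fy, cx, cy)
    print(center_3d)  # (4.0, 1.0, 20.0)
    print(project(*center_3d, fx, fy, cx, cy))
```

Under these assumptions, a single depth value per object, rather than a dense depth map, is enough to place the box center in 3D once its image-plane projection is known. This is why sparse, instance-level supervision can suffice for this step.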

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Object Detection | KITTI Cars Moderate | AP Medium | 5.74 | MonoGRNet |
| 3D | KITTI Cars Moderate | AP Medium | 5.74 | MonoGRNet |
| 3D Object Detection | KITTI Cars Moderate | AP Medium | 5.74 | MonoGRNet |
| 2D Classification | KITTI Cars Moderate | AP Medium | 5.74 | MonoGRNet |
| 2D Object Detection | KITTI Cars Moderate | AP Medium | 5.74 | MonoGRNet |
| 16k | KITTI Cars Moderate | AP Medium | 5.74 | MonoGRNet |

Related Papers

Advancing Complex Wide-Area Scene Understanding with Hierarchical Coresets Selection (2025-07-17)
Argus: Leveraging Multiview Images for Improved 3-D Scene Understanding With Large Language Models (2025-07-17)
City-VLM: Towards Multidomain Perception Scene Understanding via Multimodal Incomplete Learning (2025-07-17)
$S^2M^2$: Scalable Stereo Matching Model for Reliable Depth Estimation (2025-07-17)
$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning (2025-07-17)
A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
RS-TinyNet: Stage-wise Feature Fusion Network for Detecting Tiny Objects in Remote Sensing Images (2025-07-17)
Decoupled PROB: Decoupled Query Initialization Tasks and Objectness-Class Learning for Open World Object Detection (2025-07-17)