Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

iDisc: Internal Discretization for Monocular Depth Estimation

Luigi Piccinelli, Christos Sakaridis, Fisher Yu

Published: 2023-04-13 (CVPR 2023)
Tasks: Surface Normals Estimation, Surface Normal Estimation, Scene Understanding, Autonomous Driving, Depth Estimation, Monocular Depth Estimation

Abstract

Monocular depth estimation is fundamental for 3D scene understanding and downstream applications. However, even under the supervised setup, it is still challenging and ill-posed due to the lack of full geometric constraints. Although a scene can consist of millions of pixels, there are fewer high-level patterns. We propose iDisc to learn those patterns with internal discretized representations. The method implicitly partitions the scene into a set of high-level patterns. In particular, our new module, Internal Discretization (ID), implements a continuous-discrete-continuous bottleneck to learn those concepts without supervision. In contrast to state-of-the-art methods, the proposed model does not enforce any explicit constraints or priors on the depth output. The whole network with the ID module can be trained end-to-end, thanks to the bottleneck module based on attention. Our method sets the new state of the art with significant improvements on NYU-Depth v2 and KITTI, outperforming all published methods on the official KITTI benchmark. iDisc can also achieve state-of-the-art results on surface normal estimation. Further, we explore the model generalization capability via zero-shot testing. We observe the compelling need to promote diversification in the outdoor scenario. Hence, we introduce splits of two autonomous driving datasets, DDAD and Argoverse. Code is available at http://vis.xyz/pub/idisc .
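
The abstract describes the ID module only at a high level: an attention-based continuous-discrete-continuous bottleneck. The sketch below is not the authors' implementation. It is a generic illustration of that idea using plain scaled dot-product attention, under the assumption that a small set of vectors (called `slots` here) first summarizes the per-pixel features and that each pixel then reads back from those summaries. The names `attend`, `bottleneck`, and `slots` are hypothetical, and the learned projections, normalization, and iterative refinement that a real module would likely use are omitted.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend(queries, keys, values):
    """Scaled dot-product attention: each query returns a weighted mix of values."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    dim = len(values[0])
    out = []
    for q in queries:
        weights = softmax([scale * dot(q, k) for k in keys])
        out.append([sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)])
    return out

def bottleneck(pixel_feats, slots):
    """Hypothetical continuous-discrete-continuous bottleneck (illustration only)."""
    # Continuous -> discrete: K slots summarize N pixel features (K much smaller than N).
    summaries = attend(slots, pixel_feats, pixel_feats)
    # Discrete -> continuous: every pixel reads back from the K summaries.
    return attend(pixel_feats, summaries, summaries)

# Toy input: 3 "pixels" with 2-dimensional features, squeezed through 2 slots.
pixel_feats = [[0.0, 1.0], [1.0, 0.0], [2.0, 2.0]]
slots = [[1.0, 0.0], [0.0, 1.0]]
refined = bottleneck(pixel_feats, slots)
```

Because both steps are softmax-weighted averages, every operation is differentiable, which is consistent with the abstract's statement that the network can be trained end-to-end.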

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Depth Estimation | NYU-Depth V2 | Delta < 1.25^2 | 0.993 | iDisc |
| Depth Estimation | NYU-Depth V2 | Delta < 1.25^3 | 0.999 | iDisc |
| Depth Estimation | NYU-Depth V2 | absolute relative error | 0.086 | iDisc |
| Depth Estimation | KITTI Eigen split | Delta < 1.25 | 0.977 | iDisc |
| Depth Estimation | KITTI Eigen split | Delta < 1.25^2 | 0.997 | iDisc |
| Depth Estimation | KITTI Eigen split | Delta < 1.25^3 | 0.999 | iDisc |
| Depth Estimation | KITTI Eigen split | RMSE | 2.067 | iDisc |
| Depth Estimation | KITTI Eigen split | RMSE log | 0.077 | iDisc |
| Depth Estimation | KITTI Eigen split | Sq Rel | 0.145 | iDisc |
| Depth Estimation | KITTI Eigen split | absolute relative error | 0.05 | iDisc |
| 3D | NYU-Depth V2 | Delta < 1.25^2 | 0.993 | iDisc |
| 3D | NYU-Depth V2 | Delta < 1.25^3 | 0.999 | iDisc |
| 3D | NYU-Depth V2 | absolute relative error | 0.086 | iDisc |
| 3D | KITTI Eigen split | Delta < 1.25 | 0.977 | iDisc |
| 3D | KITTI Eigen split | Delta < 1.25^2 | 0.997 | iDisc |
| 3D | KITTI Eigen split | Delta < 1.25^3 | 0.999 | iDisc |
| 3D | KITTI Eigen split | RMSE | 2.067 | iDisc |
| 3D | KITTI Eigen split | RMSE log | 0.077 | iDisc |
| 3D | KITTI Eigen split | Sq Rel | 0.145 | iDisc |
| 3D | KITTI Eigen split | absolute relative error | 0.05 | iDisc |
| Surface Normals Estimation | NYU Depth v2 | % < 11.25 | 63.8 | iDisc |
| Surface Normals Estimation | NYU Depth v2 | % < 22.5 | 79.8 | iDisc |
| Surface Normals Estimation | NYU Depth v2 | % < 30 | 85.6 | iDisc |
| Surface Normals Estimation | NYU Depth v2 | Mean Angle Error | 14.6 | iDisc |
| Surface Normals Estimation | NYU Depth v2 | RMSE | 22.8 | iDisc |
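
This page does not define the metrics listed in the results. The sketch below uses the definitions commonly applied in monocular depth and surface normal evaluation, on the assumption that the reported numbers follow them. The function names `depth_metrics` and `normal_metrics` are hypothetical, predictions are assumed to be strictly positive depths, and benchmark-specific details such as depth caps, crops, and valid-pixel masks are not modeled.

```python
import math

def depth_metrics(pred, gt):
    """Common depth metrics over paired depths; entries with gt <= 0 are skipped."""
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    n = len(pairs)
    # Threshold accuracy uses the symmetric ratio max(pred/gt, gt/pred).
    ratios = [max(p / g, g / p) for p, g in pairs]
    return {
        "delta1": sum(r < 1.25 for r in ratios) / n,
        "delta2": sum(r < 1.25 ** 2 for r in ratios) / n,
        "delta3": sum(r < 1.25 ** 3 for r in ratios) / n,
        "abs_rel": sum(abs(p - g) / g for p, g in pairs) / n,
        "sq_rel": sum((p - g) ** 2 / g for p, g in pairs) / n,
        "rmse": math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n),
        "rmse_log": math.sqrt(
            sum((math.log(p) - math.log(g)) ** 2 for p, g in pairs) / n
        ),
    }

def normal_metrics(pred, gt):
    """Angular-error metrics (in degrees) between paired 3D normal vectors."""
    angles = []
    for a, b in zip(pred, gt):
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(x * x for x in b))
        cos = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
        cos = max(-1.0, min(1.0, cos))  # guard against rounding outside [-1, 1]
        angles.append(math.degrees(math.acos(cos)))
    n = len(angles)
    return {
        "mean_angle_error": sum(angles) / n,
        "rmse": math.sqrt(sum(t * t for t in angles) / n),
        "pct_below_11.25": 100.0 * sum(t < 11.25 for t in angles) / n,
        "pct_below_22.5": 100.0 * sum(t < 22.5 for t in angles) / n,
        "pct_below_30": 100.0 * sum(t < 30.0 for t in angles) / n,
    }

# Toy usage with made-up values (not taken from any benchmark).
d = depth_metrics([1.0, 2.0, 4.0], [1.0, 2.0, 2.0])
s = normal_metrics([(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
                   [(1.0, 0.0, 0.0), (1.0, 0.0, 0.0)])
```

Under these definitions the Delta accuracies and the "% <" entries are higher-is-better, while the error metrics (absolute relative error, Sq Rel, RMSE, RMSE log, Mean Angle Error) are lower-is-better.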

Related Papers

GEMINUS: Dual-aware Global and Scene-Adaptive Mixture-of-Experts for End-to-End Autonomous Driving (2025-07-19)
AGENTS-LLM: Augmentative GENeration of Challenging Traffic Scenarios with an Agentic LLM Framework (2025-07-18)
Advancing Complex Wide-Area Scene Understanding with Hierarchical Coresets Selection (2025-07-17)
Argus: Leveraging Multiview Images for Improved 3-D Scene Understanding With Large Language Models (2025-07-17)
City-VLM: Towards Multidomain Perception Scene Understanding via Multimodal Incomplete Learning (2025-07-17)
World Model-Based End-to-End Scene Generation for Accident Anticipation in Autonomous Driving (2025-07-17)
Orbis: Overcoming Challenges of Long-Horizon Prediction in Driving World Models (2025-07-17)
Channel-wise Motion Features for Efficient Motion Segmentation (2025-07-17)