Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


UniDepth: Universal Monocular Metric Depth Estimation

Luigi Piccinelli, Yung-Hsu Yang, Christos Sakaridis, Mattia Segu, Siyuan Li, Luc van Gool, Fisher Yu

2024-03-27 · CVPR 2024 · Tasks: Depth Estimation, Monocular Depth Estimation

Abstract

Accurate monocular metric depth estimation (MMDE) is crucial to solving downstream tasks in 3D perception and modeling. However, the remarkable accuracy of recent MMDE methods is confined to their training domains. These methods fail to generalize to unseen domains even in the presence of moderate domain gaps, which hinders their practical applicability. We propose a new model, UniDepth, capable of reconstructing metric 3D scenes from solely single images across domains. Departing from the existing MMDE methods, UniDepth directly predicts metric 3D points from the input image at inference time without any additional information, striving for a universal and flexible MMDE solution. In particular, UniDepth implements a self-promptable camera module predicting dense camera representation to condition depth features. Our model exploits a pseudo-spherical output representation, which disentangles camera and depth representations. In addition, we propose a geometric invariance loss that promotes the invariance of camera-prompted depth features. Thorough evaluations on ten datasets in a zero-shot regime consistently demonstrate the superior performance of UniDepth, even when compared with methods directly trained on the testing domains. Code and models are available at: https://github.com/lpiccinelli-eth/unidepth
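UniDepth predicts a dense metric depth map together with a dense camera representation, from which metric 3D points follow. The link between depth, intrinsics, and 3D points can be illustrated with standard pinhole backprojection, a minimal sketch in NumPy; the function name, array shapes, and intrinsics layout here are illustrative assumptions, not the paper's exact interface or its pseudo-spherical parametrization:

```python
import numpy as np

def backproject(depth: np.ndarray, K: np.ndarray) -> np.ndarray:
    """Lift a metric depth map (H, W) to 3D points (H, W, 3) under a
    pinhole camera with intrinsics K = [[fx, 0, cx], [0, fy, cy], [0, 0, 1]].
    (Illustrative helper, not part of the UniDepth codebase.)"""
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))  # pixel grid, (H, W) each
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    # Pinhole model: X = (u - cx) / fx * Z, Y = (v - cy) / fy * Z, Z = depth
    X = (u - cx) / fx * depth
    Y = (v - cy) / fy * depth
    return np.stack([X, Y, depth], axis=-1)

# Toy example: a flat scene 2 m away with simple intrinsics
depth = np.full((4, 4), 2.0)
K = np.array([[2.0, 0.0, 2.0],
              [0.0, 2.0, 2.0],
              [0.0, 0.0, 1.0]])
pts = backproject(depth, K)
```

Disentangling these two quantities is the point of the paper's pseudo-spherical output: errors in the predicted camera representation do not have to be absorbed by the depth head.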

Results

Task             | Dataset           | Metric                  | Value | Model
-----------------|-------------------|-------------------------|-------|---------------------
Depth Estimation | NYU-Depth V2      | Delta < 1.25            | 0.984 | UniDepth (Zero-shot)
Depth Estimation | NYU-Depth V2      | Delta < 1.25^2          | 0.997 | UniDepth (Zero-shot)
Depth Estimation | NYU-Depth V2      | Delta < 1.25^3          | 0.999 | UniDepth (Zero-shot)
Depth Estimation | NYU-Depth V2      | RMSE                    | 0.201 | UniDepth (Zero-shot)
Depth Estimation | NYU-Depth V2      | absolute relative error | 0.058 | UniDepth (Zero-shot)
Depth Estimation | NYU-Depth V2      | log10                   | 0.024 | UniDepth (Zero-shot)
Depth Estimation | KITTI Eigen split | Delta < 1.25            | 0.986 | UniDepth (Zero-shot)
Depth Estimation | KITTI Eigen split | Delta < 1.25^2          | 0.998 | UniDepth (Zero-shot)
Depth Estimation | KITTI Eigen split | Delta < 1.25^3          | 0.999 | UniDepth (Zero-shot)
Depth Estimation | KITTI Eigen split | RMSE                    | 1.75  | UniDepth (Zero-shot)
Depth Estimation | KITTI Eigen split | RMSE log                | 0.064 | UniDepth (Zero-shot)
Depth Estimation | KITTI Eigen split | absolute relative error | 0.042 | UniDepth (Zero-shot)
3D               | NYU-Depth V2      | Delta < 1.25            | 0.984 | UniDepth (Zero-shot)
3D               | NYU-Depth V2      | Delta < 1.25^2          | 0.997 | UniDepth (Zero-shot)
3D               | NYU-Depth V2      | Delta < 1.25^3          | 0.999 | UniDepth (Zero-shot)
3D               | NYU-Depth V2      | RMSE                    | 0.201 | UniDepth (Zero-shot)
3D               | NYU-Depth V2      | absolute relative error | 0.058 | UniDepth (Zero-shot)
3D               | NYU-Depth V2      | log10                   | 0.024 | UniDepth (Zero-shot)
3D               | KITTI Eigen split | Delta < 1.25            | 0.986 | UniDepth (Zero-shot)
3D               | KITTI Eigen split | Delta < 1.25^2          | 0.998 | UniDepth (Zero-shot)
3D               | KITTI Eigen split | Delta < 1.25^3          | 0.999 | UniDepth (Zero-shot)
3D               | KITTI Eigen split | RMSE                    | 1.75  | UniDepth (Zero-shot)
3D               | KITTI Eigen split | RMSE log                | 0.064 | UniDepth (Zero-shot)
3D               | KITTI Eigen split | absolute relative error | 0.042 | UniDepth (Zero-shot)
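The metrics above are the standard monocular-depth evaluation measures: threshold accuracy Delta < 1.25^k (fraction of pixels whose prediction/ground-truth ratio is within 1.25^k), RMSE, absolute relative error, and log10 error. A minimal sketch of how they are computed from a predicted and a ground-truth depth map (NumPy; function and variable names are illustrative, not taken from the UniDepth evaluation code):

```python
import numpy as np

def depth_metrics(pred: np.ndarray, gt: np.ndarray) -> dict:
    """Standard monocular depth metrics over valid (gt > 0) pixels."""
    mask = gt > 0
    pred, gt = pred[mask], gt[mask]
    # Symmetric ratio used by the Delta threshold metrics
    ratio = np.maximum(pred / gt, gt / pred)
    return {
        "delta1": float(np.mean(ratio < 1.25)),       # Delta < 1.25
        "delta2": float(np.mean(ratio < 1.25 ** 2)),  # Delta < 1.25^2
        "delta3": float(np.mean(ratio < 1.25 ** 3)),  # Delta < 1.25^3
        "rmse": float(np.sqrt(np.mean((pred - gt) ** 2))),
        "abs_rel": float(np.mean(np.abs(pred - gt) / gt)),
        "log10": float(np.mean(np.abs(np.log10(pred) - np.log10(gt)))),
    }

# Toy example: a perfect prediction scores delta1 = 1 and zero error
gt = np.array([[1.0, 2.0], [3.0, 4.0]])
m = depth_metrics(gt.copy(), gt)
```

Higher is better for the Delta metrics (1.0 is perfect); lower is better for RMSE, absolute relative error, and the log errors.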

Related Papers

- $S^2M^2$: Scalable Stereo Matching Model for Reliable Depth Estimation (2025-07-17)
- $π^3$: Scalable Permutation-Equivariant Visual Geometry Learning (2025-07-17)
- Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation (2025-07-16)
- Vision-based Perception for Autonomous Vehicles in Obstacle Avoidance Scenarios (2025-07-16)
- MonoMVSNet: Monocular Priors Guided Multi-View Stereo Network (2025-07-15)
- Towards Depth Foundation Model: Recent Trends in Vision-Based Depth Estimation (2025-07-15)
- Cameras as Relative Positional Encoding (2025-07-14)
- ByDeWay: Boost Your multimodal LLM with DEpth prompting in a Training-Free Way (2025-07-11)