Papers With Code
A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


DepthMaster: Taming Diffusion Models for Monocular Depth Estimation

Ziyang Song, Zerong Wang, Bo Li, Hao Zhang, Ruijie Zhu, Li Liu, Peng-Tao Jiang, Tianzhu Zhang

2025-01-05 · Denoising · Depth Estimation · Monocular Depth Estimation

Paper · PDF · Code (official)

Abstract

Monocular depth estimation within the diffusion-denoising paradigm demonstrates impressive generalization ability but suffers from low inference speed. Recent methods adopt a single-step deterministic paradigm to improve inference efficiency while maintaining comparable performance. However, they overlook the gap between generative and discriminative features, leading to suboptimal results. In this work, we propose DepthMaster, a single-step diffusion model designed to adapt generative features for the discriminative depth estimation task. First, to mitigate overfitting to texture details introduced by generative features, we propose a Feature Alignment module, which incorporates high-quality semantic features to enhance the denoising network's representation capability. Second, to address the lack of fine-grained details in the single-step deterministic framework, we propose a Fourier Enhancement module to adaptively balance low-frequency structure and high-frequency details. We adopt a two-stage training strategy to fully leverage the potential of the two modules. In the first stage, we focus on learning the global scene structure with the Feature Alignment module, while in the second stage, we exploit the Fourier Enhancement module to improve the visual quality. Through these efforts, our model achieves state-of-the-art performance in terms of generalization and detail preservation, outperforming other diffusion-based methods across various datasets. Our project page can be found at https://indu1ge.github.io/DepthMaster_page.
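The abstract describes a Fourier Enhancement module that adaptively balances low-frequency scene structure against high-frequency detail. The paper's actual architecture is not reproduced here, but the underlying idea of re-weighting a feature map in the frequency domain can be sketched in NumPy. This is purely illustrative: the function `fourier_blend` and its `alpha` and `radius` parameters are hypothetical and do not come from the paper.

```python
import numpy as np

def fourier_blend(feat, alpha=0.5, radius=0.1):
    """Illustrative low/high-frequency re-weighting of a 2D feature map.

    Hypothetical sketch, NOT the paper's module: transform to the
    frequency domain, weight frequencies inside a normalized radius
    (low frequencies / global structure) by `alpha` and the rest
    (high frequencies / fine detail) by `1 - alpha`, then invert.
    """
    H, W = feat.shape
    F = np.fft.fftshift(np.fft.fft2(feat))  # DC component moved to center

    # Normalized distance of each frequency bin from the DC component.
    yy, xx = np.ogrid[:H, :W]
    dist = np.sqrt((yy - H // 2) ** 2 + (xx - W // 2) ** 2)
    dist = dist / np.sqrt((H / 2) ** 2 + (W / 2) ** 2)

    low_mask = dist <= radius                    # low-frequency band
    weights = alpha * low_mask + (1 - alpha) * ~low_mask
    out = np.fft.ifft2(np.fft.ifftshift(F * weights)).real
    return out
```

A learned version of such a weighting (rather than the fixed `alpha` here) is one plausible way to trade off global structure against detail, which is the balance the abstract attributes to the module.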

Results

| Task | Dataset | Metric | Value | Model |
|------|---------|--------|-------|-------|
| Depth Estimation | NYU-Depth V2 | δ < 1.25 | 0.972 | DepthMaster |
| Depth Estimation | NYU-Depth V2 | absolute relative error | 0.050 | DepthMaster |
| Depth Estimation | ETH3D | δ < 1.25 | 0.974 | DepthMaster |
| Depth Estimation | ETH3D | absolute relative error | 0.053 | DepthMaster |
| Depth Estimation | KITTI Eigen split | δ < 1.25 | 0.937 | DepthMaster |
| Depth Estimation | KITTI Eigen split | absolute relative error | 0.082 | DepthMaster |

The same results are also listed under the 3D task with identical values.
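The two metrics in the table are standard in monocular depth evaluation: absolute relative error (AbsRel, lower is better) averages |prediction − ground truth| / ground truth over valid pixels, and δ < 1.25 (higher is better) is the fraction of pixels whose prediction-to-ground-truth ratio (in whichever direction is larger) stays below 1.25. A minimal NumPy sketch of the standard definitions (not code from the DepthMaster repository):

```python
import numpy as np

def depth_metrics(pred, gt):
    """Standard depth metrics over valid (depth > 0) pixels.

    Returns (abs_rel, delta1):
      abs_rel -- mean of |pred - gt| / gt          (lower is better)
      delta1  -- fraction with max(pred/gt, gt/pred) < 1.25  (higher is better)
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    abs_rel = np.mean(np.abs(pred - gt) / gt)
    ratio = np.maximum(pred / gt, gt / pred)
    delta1 = np.mean(ratio < 1.25)
    return abs_rel, delta1
```

Note that affine-invariant methods such as this one typically align predictions to the ground truth in scale (and sometimes shift) before computing these metrics.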

Related Papers

fastWDM3D: Fast and Accurate 3D Healthy Tissue Inpainting (2025-07-17)
Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models (2025-07-17)
$S^2M^2$: Scalable Stereo Matching Model for Reliable Depth Estimation (2025-07-17)
$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning (2025-07-17)
Similarity-Guided Diffusion for Contrastive Sequential Recommendation (2025-07-16)
Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation (2025-07-16)
Vision-based Perception for Autonomous Vehicles in Obstacle Avoidance Scenarios (2025-07-16)
HUG-VAS: A Hierarchical NURBS-Based Generative Model for Aortic Geometry Synthesis and Controllable Editing (2025-07-15)