Papers With Code
A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


DepthMaster: Taming Diffusion Models for Monocular Depth Estimation

Ziyang Song, Zerong Wang, Bo Li, Hao Zhang, Ruijie Zhu, Li Liu, Peng-Tao Jiang, Tianzhu Zhang

2025-01-05 · Denoising · Depth Estimation · Monocular Depth Estimation

Paper · PDF · Code (official)

Abstract

Monocular depth estimation within the diffusion-denoising paradigm demonstrates impressive generalization ability but suffers from low inference speed. Recent methods adopt a single-step deterministic paradigm to improve inference efficiency while maintaining comparable performance. However, they overlook the gap between generative and discriminative features, leading to suboptimal results. In this work, we propose DepthMaster, a single-step diffusion model designed to adapt generative features for the discriminative depth estimation task. First, to mitigate overfitting to texture details introduced by generative features, we propose a Feature Alignment module, which incorporates high-quality semantic features to enhance the denoising network's representation capability. Second, to address the lack of fine-grained details in the single-step deterministic framework, we propose a Fourier Enhancement module to adaptively balance low-frequency structure and high-frequency details. We adopt a two-stage training strategy to fully leverage the potential of the two modules. In the first stage, we focus on learning the global scene structure with the Feature Alignment module, while in the second stage, we exploit the Fourier Enhancement module to improve the visual quality. Through these efforts, our model achieves state-of-the-art performance in terms of generalization and detail preservation, outperforming other diffusion-based methods across various datasets. Our project page can be found at https://indu1ge.github.io/DepthMaster_page.
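The abstract describes a Fourier Enhancement module that adaptively balances low-frequency scene structure against high-frequency detail. The paper's actual architecture is not reproduced here, but the underlying idea of re-weighting a feature map in the frequency domain can be sketched in NumPy. This is purely illustrative: the function `fourier_blend` and its `alpha` and `radius` parameters are hypothetical and do not come from the paper.

```python
import numpy as np

def fourier_blend(feat, alpha=0.5, radius=0.1):
    """Illustrative low/high-frequency re-weighting of a 2D feature map.

    Hypothetical sketch, NOT the paper's module: transform to the
    frequency domain, weight frequencies inside a normalized radius
    (low frequencies / global structure) by `alpha` and the rest
    (high frequencies / fine detail) by `1 - alpha`, then invert.
    """
    H, W = feat.shape
    F = np.fft.fftshift(np.fft.fft2(feat))  # DC component moved to center

    # Normalized distance of each frequency bin from the DC component.
    yy, xx = np.ogrid[:H, :W]
    dist = np.sqrt((yy - H // 2) ** 2 + (xx - W // 2) ** 2)
    dist = dist / np.sqrt((H / 2) ** 2 + (W / 2) ** 2)

    low_mask = dist <= radius                    # low-frequency band
    weights = alpha * low_mask + (1 - alpha) * ~low_mask
    out = np.fft.ifft2(np.fft.ifftshift(F * weights)).real
    return out
```

A learned version of such a weighting (rather than the fixed `alpha` here) is one plausible way to trade off global structure against detail, which is the balance the abstract attributes to the module.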

Results

| Task | Dataset | Metric | Value | Model |
|------|---------|--------|-------|-------|
| Depth Estimation | NYU-Depth V2 | δ < 1.25 | 0.972 | DepthMaster |
| Depth Estimation | NYU-Depth V2 | absolute relative error | 0.050 | DepthMaster |
| Depth Estimation | ETH3D | δ < 1.25 | 0.974 | DepthMaster |
| Depth Estimation | ETH3D | absolute relative error | 0.053 | DepthMaster |
| Depth Estimation | KITTI Eigen split | δ < 1.25 | 0.937 | DepthMaster |
| Depth Estimation | KITTI Eigen split | absolute relative error | 0.082 | DepthMaster |

The same results are also listed under the 3D task with identical values.
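The two metrics in the table are standard in monocular depth evaluation: absolute relative error (AbsRel, lower is better) averages |prediction − ground truth| / ground truth over valid pixels, and δ < 1.25 (higher is better) is the fraction of pixels whose prediction-to-ground-truth ratio (in whichever direction is larger) stays below 1.25. A minimal NumPy sketch of the standard definitions (not code from the DepthMaster repository):

```python
import numpy as np

def depth_metrics(pred, gt):
    """Standard depth metrics over valid (depth > 0) pixels.

    Returns (abs_rel, delta1):
      abs_rel -- mean of |pred - gt| / gt          (lower is better)
      delta1  -- fraction with max(pred/gt, gt/pred) < 1.25  (higher is better)
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    abs_rel = np.mean(np.abs(pred - gt) / gt)
    ratio = np.maximum(pred / gt, gt / pred)
    delta1 = np.mean(ratio < 1.25)
    return abs_rel, delta1
```

Note that affine-invariant methods such as this one typically align predictions to the ground truth in scale (and sometimes shift) before computing these metrics.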

Related Papers

fastWDM3D: Fast and Accurate 3D Healthy Tissue Inpainting (2025-07-17)
Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models (2025-07-17)
$S^2M^2$: Scalable Stereo Matching Model for Reliable Depth Estimation (2025-07-17)
$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning (2025-07-17)
Similarity-Guided Diffusion for Contrastive Sequential Recommendation (2025-07-16)
Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation (2025-07-16)
Vision-based Perception for Autonomous Vehicles in Obstacle Avoidance Scenarios (2025-07-16)
HUG-VAS: A Hierarchical NURBS-Based Generative Model for Aortic Geometry Synthesis and Controllable Editing (2025-07-15)