Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation

Bingxin Ke, Anton Obukhov, Shengyu Huang, Nando Metzger, Rodrigo Caye Daudt, Konrad Schindler

2023-12-04 | CVPR 2024 | Zero-shot Generalization, Scene Understanding, Depth Estimation, Monocular Depth Estimation

Abstract

Monocular depth estimation is a fundamental computer vision task. Recovering 3D depth from a single image is geometrically ill-posed and requires scene understanding, so it is not surprising that the rise of deep learning has led to a breakthrough. The impressive progress of monocular depth estimators has mirrored the growth in model capacity, from relatively modest CNNs to large Transformer architectures. Still, monocular depth estimators tend to struggle when presented with images with unfamiliar content and layout, since their knowledge of the visual world is restricted by the data seen during training, and challenged by zero-shot generalization to new domains. This motivates us to explore whether the extensive priors captured in recent generative diffusion models can enable better, more generalizable depth estimation. We introduce Marigold, a method for affine-invariant monocular depth estimation that is derived from Stable Diffusion and retains its rich prior knowledge. The estimator can be fine-tuned in a couple of days on a single GPU using only synthetic training data. It delivers state-of-the-art performance across a wide range of datasets, including over 20% performance gains in specific cases. Project page: https://marigoldmonodepth.github.io.
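To make "affine-invariant" concrete: such an estimator predicts depth only up to an unknown global scale and shift, so a prediction is commonly aligned to reference depth with a least-squares fit before the two are compared. The following is a minimal sketch of that alignment in plain Python. The function name `align_scale_shift` and the toy values are illustrative assumptions, not code from the Marigold release.

```python
def align_scale_shift(pred, target):
    """Fit scale s and shift t that minimise sum((s * p + t - d)^2), in closed form."""
    n = len(pred)
    if n != len(target) or n < 2:
        raise ValueError("need two equally sized sequences with at least 2 values")
    mean_p = sum(pred) / n
    mean_d = sum(target) / n
    # Variance and covariance terms of the 1-D linear regression target ~ s * pred + t.
    var_p = sum((p - mean_p) ** 2 for p in pred)
    if var_p == 0:
        raise ValueError("prediction is constant, so the scale is undetermined")
    cov = sum((p - mean_p) * (d - mean_d) for p, d in zip(pred, target))
    s = cov / var_p
    t = mean_d - s * mean_p
    return s, t


# Toy example: the prediction equals the reference depth under an unknown affine map.
reference = [1.0, 2.0, 4.0, 8.0]
prediction = [(d - 0.5) / 2.0 for d in reference]

s, t = align_scale_shift(prediction, reference)
aligned = [s * p + t for p in prediction]
```

After alignment, `aligned` matches `reference` up to floating-point error, which is why two predictions that differ only by scale and shift count as equally good under an affine-invariant protocol.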

Results

Task	Dataset	Metric	Value	Model
Depth Estimation	NYU-Depth V2	Delta < 1.25	0.964	Marigold
Depth Estimation	NYU-Depth V2	Delta < 1.25^2	0.991	Marigold
Depth Estimation	NYU-Depth V2	Delta < 1.25^3	0.998	Marigold
Depth Estimation	NYU-Depth V2	RMSE	0.224	Marigold
Depth Estimation	NYU-Depth V2	absolute relative error	0.055	Marigold
Depth Estimation	NYU-Depth V2	log 10	0.024	Marigold
Depth Estimation	ETH3D	Delta < 1.25	0.960	Marigold
Depth Estimation	ETH3D	absolute relative error	0.065	Marigold
Depth Estimation	KITTI Eigen split	Delta < 1.25	0.916	Marigold
Depth Estimation	KITTI Eigen split	Delta < 1.25^2	0.987	Marigold
Depth Estimation	KITTI Eigen split	Delta < 1.25^3	0.996	Marigold
Depth Estimation	KITTI Eigen split	RMSE	3.304	Marigold
Depth Estimation	KITTI Eigen split	RMSE log	0.138	Marigold
Depth Estimation	KITTI Eigen split	absolute relative error	0.099	Marigold
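The page does not define the metric names in the table. Assuming they follow the usual depth-estimation conventions, "Delta < 1.25^k" is the fraction of pixels whose ratio max(pred/gt, gt/pred) falls below 1.25^k (higher is better), while the remaining metrics are error measures (lower is better). The sketch below computes them in plain Python. The function name `depth_metrics`, the dictionary keys, and the toy values are illustrative, and "RMSE log" is taken here to use the natural logarithm.

```python
import math


def depth_metrics(pred, gt):
    """Compute common depth metrics for two equally sized lists of positive depths."""
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and of equal length")
    n = len(pred)
    pairs = list(zip(pred, gt))
    # Per-pixel ratio used by the threshold accuracies (always >= 1).
    ratios = [max(p / g, g / p) for p, g in pairs]
    return {
        "delta1": sum(r < 1.25 for r in ratios) / n,
        "delta2": sum(r < 1.25 ** 2 for r in ratios) / n,
        "delta3": sum(r < 1.25 ** 3 for r in ratios) / n,
        "abs_rel": sum(abs(p - g) / g for p, g in pairs) / n,
        "rmse": math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n),
        "rmse_log": math.sqrt(sum((math.log(p) - math.log(g)) ** 2 for p, g in pairs) / n),
        "log10": sum(abs(math.log10(p) - math.log10(g)) for p, g in pairs) / n,
    }


# Toy example with four pixels; one of them is off by a factor of 1.5.
gt = [1.0, 2.0, 4.0, 5.0]
pred = [1.0, 2.2, 6.0, 5.0]
metrics = depth_metrics(pred, gt)
```

In this toy case three of the four ratios are below 1.25, so `metrics["delta1"]` is 0.75, while `metrics["delta2"]` is 1.0 because the worst ratio (1.5) is below 1.25^2. For an affine-invariant estimator, predictions are usually aligned to the ground truth in scale and shift before such metrics are computed.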
