Papers With Code 2


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Monocular Depth Estimation using Diffusion Models

Saurabh Saxena, Abhishek Kar, Mohammad Norouzi, David J. Fleet

Published: 2023-02-28
Tasks: Denoising · Imputation · Depth Estimation · Text to 3D · Image Generation · Monocular Depth Estimation · Image-to-Image Translation

Abstract

We formulate monocular depth estimation using denoising diffusion models, inspired by their recent successes in high fidelity image generation. To that end, we introduce innovations to address problems arising due to noisy, incomplete depth maps in training data, including step-unrolled denoising diffusion, an $L_1$ loss, and depth infilling during training. To cope with the limited availability of data for supervised training, we leverage pre-training on self-supervised image-to-image translation tasks. Despite the simplicity of the approach, with a generic loss and architecture, our DepthGen model achieves SOTA performance on the indoor NYU dataset, and near SOTA results on the outdoor KITTI dataset. Further, with a multimodal posterior, DepthGen naturally represents depth ambiguity (e.g., from transparent surfaces), and its zero-shot performance combined with depth imputation, enable a simple but effective text-to-3D pipeline. Project page: https://depth-gen.github.io
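The abstract's three training innovations (step-unrolled denoising, an L1 loss, and depth infilling) can be illustrated with a minimal numpy sketch. Everything here is an assumption for illustration: `denoiser`, `infill_depth`, and `train_step` are hypothetical names, the mean-value infilling is a stand-in for whatever interpolation the paper uses, and the noise schedule is simplified — this is not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def infill_depth(depth, valid_mask, fill_value=None):
    """Replace invalid (missing) depth pixels so the diffusion target is dense.
    Mean-of-valid-pixels infilling is a simplification of the paper's scheme."""
    if fill_value is None:
        fill_value = depth[valid_mask].mean()
    filled = depth.copy()
    filled[~valid_mask] = fill_value
    return filled

def train_step(denoiser, depth, valid_mask, t, unroll=True):
    """One (step-unrolled) denoising training step with an L1 loss.
    `denoiser(noisy, t)` is a hypothetical model that predicts clean depth.
    With unroll=True, the noisy input is rebuilt from the model's own
    prediction, mimicking step-unrolled training (reduces the train/test
    mismatch between ground-truth and self-generated inputs)."""
    target = infill_depth(depth, valid_mask)
    noise = rng.standard_normal(target.shape)
    noisy = np.sqrt(1 - t) * target + np.sqrt(t) * noise
    if unroll:
        # Re-noise the model's own estimate instead of the ground truth.
        est = denoiser(noisy, t)
        noisy = np.sqrt(1 - t) * est + np.sqrt(t) * noise
    pred = denoiser(noisy, t)
    # L1 loss, computed only on originally-valid pixels.
    return np.abs(pred - target)[valid_mask].mean()
```

The L1 loss (rather than the usual L2) and the valid-pixel masking are what make the objective robust to the noisy, incomplete depth maps the abstract mentions.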

Results

Task             | Dataset      | Metric                  | Value | Model
Depth Estimation | NYU-Depth V2 | Delta < 1.25            | 0.946 | DepthGen
Depth Estimation | NYU-Depth V2 | Delta < 1.25^2          | 0.987 | DepthGen
Depth Estimation | NYU-Depth V2 | Delta < 1.25^3          | 0.996 | DepthGen
Depth Estimation | NYU-Depth V2 | RMSE                    | 0.314 | DepthGen
Depth Estimation | NYU-Depth V2 | absolute relative error | 0.074 | DepthGen
Depth Estimation | NYU-Depth V2 | log10                   | 0.032 | DepthGen
3D               | NYU-Depth V2 | Delta < 1.25            | 0.946 | DepthGen
3D               | NYU-Depth V2 | Delta < 1.25^2          | 0.987 | DepthGen
3D               | NYU-Depth V2 | Delta < 1.25^3          | 0.996 | DepthGen
3D               | NYU-Depth V2 | RMSE                    | 0.314 | DepthGen
3D               | NYU-Depth V2 | absolute relative error | 0.074 | DepthGen
3D               | NYU-Depth V2 | log10                   | 0.032 | DepthGen
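The metrics in the table are the standard monocular-depth evaluation quantities: the Delta thresholds count the fraction of pixels whose prediction/ground-truth ratio stays within 1.25, 1.25², or 1.25³, while RMSE, absolute relative error, and log10 measure average error magnitude. A minimal sketch of how they are typically computed (function and key names here are illustrative, not from the paper):

```python
import numpy as np

def depth_metrics(pred, gt):
    """Standard monocular depth metrics over positive depth arrays
    (e.g. metres). Higher is better for the Delta accuracies; lower is
    better for the error metrics."""
    ratio = np.maximum(pred / gt, gt / pred)
    return {
        "delta_1": (ratio < 1.25).mean(),        # Delta < 1.25
        "delta_2": (ratio < 1.25 ** 2).mean(),   # Delta < 1.25^2
        "delta_3": (ratio < 1.25 ** 3).mean(),   # Delta < 1.25^3
        "rmse": np.sqrt(((pred - gt) ** 2).mean()),
        "abs_rel": (np.abs(pred - gt) / gt).mean(),
        "log10": np.abs(np.log10(pred) - np.log10(gt)).mean(),
    }
```

A perfect prediction gives Delta accuracies of 1.0 and zero for all error metrics, which is why DepthGen's 0.946 / 0.987 / 0.996 Delta values are close to the ceiling.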

Related Papers

Missing value imputation with adversarial random forests -- MissARF (2025-07-21)
fastWDM3D: Fast and Accurate 3D Healthy Tissue Inpainting (2025-07-17)
Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models (2025-07-17)
MoTM: Towards a Foundation Model for Time Series Imputation based on Continuous Modeling (2025-07-17)
$S^2M^2$: Scalable Stereo Matching Model for Reliable Depth Estimation (2025-07-17)
$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning (2025-07-17)
Synthesizing Reality: Leveraging the Generative AI-Powered Platform Midjourney for Construction Worker Detection (2025-07-17)
FashionPose: Text to Pose to Relight Image Generation for Personalized Fashion Visualization (2025-07-17)