A Two-Streamed Network for Estimating Fine-Scaled Depth Maps from Single RGB Images

Jun Li, Reinhard Klein, Angela Yao

2016-07-04 | ICCV 2017 | Monocular Depth Estimation

Abstract

Estimating depth from a single RGB image is an ill-posed and inherently ambiguous problem. State-of-the-art deep learning methods can now estimate accurate 2D depth maps, but when the maps are projected into 3D, they lack local detail and are often highly distorted. We propose a fast-to-train two-streamed CNN that predicts depth and depth gradients, which are then fused into an accurate and detailed depth map. We also define a novel set loss over multiple images; by regularizing the estimation across a common set of images, the network is less prone to over-fitting and achieves better accuracy than competing methods. Experiments on the NYU Depth v2 dataset show that our depth predictions are competitive with the state of the art and lead to faithful 3D projections.
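To make the idea of fusing a depth estimate with a gradient estimate concrete, the following is a minimal 1D sketch. It is not the authors' fusion method; it swaps in a generic least-squares objective, minimized by plain gradient descent, that keeps the fused signal close to the coarse depth while pulling its finite differences toward the predicted gradients. The function name `fuse_depth_1d` and the parameters `weight`, `steps`, and `lr` are hypothetical.

```python
def fuse_depth_1d(depth, grad, weight=1.0, steps=500, lr=0.1):
    """Fuse a 1D depth estimate with a 1D gradient estimate.

    Illustrative assumption, not the paper's method: minimize
        sum_i (d[i] - depth[i])^2 + weight * sum_i ((d[i+1] - d[i]) - grad[i])^2
    by gradient descent. `grad` has one entry fewer than `depth`.
    With weight=1.0, lr must stay below 0.2 for the iteration to converge.
    """
    if len(grad) != len(depth) - 1:
        raise ValueError("grad must have len(depth) - 1 entries")
    n = len(depth)
    d = list(depth)  # start from the coarse depth estimate
    for _ in range(steps):
        # Gradient of the data term: stay close to the depth estimate.
        g = [2.0 * (d[i] - depth[i]) for i in range(n)]
        # Gradient of the gradient-consistency term.
        for i in range(n - 1):
            residual = (d[i + 1] - d[i]) - grad[i]
            g[i + 1] += 2.0 * weight * residual
            g[i] -= 2.0 * weight * residual
        d = [d[i] - lr * g[i] for i in range(n)]
    return d
```

For example, `fuse_depth_1d([1.0, 2.5, 3.0], [1.0, 1.0])` converges to approximately `[1.125, 2.25, 3.125]`: the step sizes move from 1.5 and 0.5 toward the predicted gradient of 1 while the values remain near the original depth estimate.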

Results

Task	Dataset	Metric	Value	Model
Depth Estimation	NYU-Depth V2	RMSE	0.635	Li et al.
3D	NYU-Depth V2	RMSE	0.635	Li et al.
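The RMSE metric in the table is the root mean squared error between predicted and ground-truth depth. A minimal sketch of how such a value could be computed is given below; the function name `rmse` is hypothetical, and treating ground-truth values of zero or less as missing pixels is an assumption of this sketch, not something stated in the source.

```python
import math


def rmse(pred, gt):
    """Root mean squared error between two depth maps given as nested lists.

    Assumption for this sketch: ground-truth pixels <= 0 mark missing
    measurements and are excluded from the average.
    """
    sq_sum = 0.0
    count = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            if g > 0:
                sq_sum += (p - g) ** 2
                count += 1
    if count == 0:
        raise ValueError("no valid ground-truth pixels")
    return math.sqrt(sq_sum / count)
```

For example, `rmse([[1.0, 2.0], [3.0, 4.0]], [[1.0, 2.0], [3.0, 6.0]])` returns `1.0`: one of the four pixels is off by 2, so the mean squared error is 4 / 4 = 1.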
