Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Unsupervised Monocular Depth Estimation with Left-Right Consistency

Clément Godard, Oisin Mac Aodha, Gabriel J. Brostow

2016-09-13 · CVPR 2017
Tasks: Image Reconstruction, Unsupervised Monocular Depth Estimation, Depth Prediction, Depth Estimation, Monocular Depth Estimation
Links: Paper · PDF · Code (official and community implementations)

Abstract

Learning based methods have shown very promising results for the task of depth estimation in single images. However, most existing approaches treat depth prediction as a supervised regression problem and as a result, require vast quantities of corresponding ground truth depth data for training. Just recording quality depth data in a range of environments is a challenging problem. In this paper, we innovate beyond existing approaches, replacing the use of explicit depth data during training with easier-to-obtain binocular stereo footage. We propose a novel training objective that enables our convolutional neural network to learn to perform single image depth estimation, despite the absence of ground truth depth data. Exploiting epipolar geometry constraints, we generate disparity images by training our network with an image reconstruction loss. We show that solving for image reconstruction alone results in poor quality depth images. To overcome this problem, we propose a novel training loss that enforces consistency between the disparities produced relative to both the left and right images, leading to improved performance and robustness compared to existing approaches. Our method produces state of the art results for monocular depth estimation on the KITTI driving dataset, even outperforming supervised methods that have been trained with ground truth depth.
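The abstract's key contribution — a training loss that enforces consistency between the disparity maps predicted for the left and right views — can be sketched in a few lines. The snippet below is a minimal NumPy illustration under assumed conventions: it uses nearest-neighbour sampling where the paper uses a bilinear sampler, takes a `sign` parameter because the projection direction depends on the stereo rig convention, and the function names (`sample_with_disparity`, `lr_consistency_loss`) are hypothetical, not the authors' code.

```python
import numpy as np

def sample_with_disparity(disp_src, disp_to_sample, sign=-1):
    """Sample `disp_to_sample` at x-coordinates shifted by `disp_src`.

    Nearest-neighbour sampling for simplicity; the paper uses a
    differentiable bilinear sampler. `sign` encodes the (assumed)
    stereo convention for which way disparities shift pixels.
    """
    h, w = disp_src.shape
    xs = np.arange(w)[None, :] + sign * disp_src   # shifted x-coordinates
    xs = np.clip(np.round(xs).astype(int), 0, w - 1)
    rows = np.arange(h)[:, None]
    return disp_to_sample[rows, xs]

def lr_consistency_loss(disp_left, disp_right):
    """L1 penalty between the left disparity map and the right
    disparity map projected into the left view."""
    projected = sample_with_disparity(disp_left, disp_right, sign=-1)
    return float(np.mean(np.abs(disp_left - projected)))
```

In the full training objective this term is combined with an image reconstruction loss and a disparity smoothness loss, computed for both views; the sketch above covers only the left-view consistency term.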

Results

Task             | Dataset                        | Metric                  | Value  | Model
Depth Estimation | Mid-Air Dataset                | Abs Rel                 | 0.3136 | Monodepth
Depth Estimation | Mid-Air Dataset                | RMSE                    | 13.595 | Monodepth
Depth Estimation | Mid-Air Dataset                | RMSE log                | 0.438  | Monodepth
Depth Estimation | Mid-Air Dataset                | Sq Rel                  | 8.7127 | Monodepth
Depth Estimation | KITTI Eigen split unsupervised | absolute relative error | 0.133  | Monodepth S
3D               | Mid-Air Dataset                | Abs Rel                 | 0.3136 | Monodepth
3D               | Mid-Air Dataset                | RMSE                    | 13.595 | Monodepth
3D               | Mid-Air Dataset                | RMSE log                | 0.438  | Monodepth
3D               | Mid-Air Dataset                | Sq Rel                  | 8.7127 | Monodepth
3D               | KITTI Eigen split unsupervised | absolute relative error | 0.133  | Monodepth S

Related Papers

$S^2M^2$: Scalable Stereo Matching Model for Reliable Depth Estimation (2025-07-17)
$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning (2025-07-17)
Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation (2025-07-16)
Vision-based Perception for Autonomous Vehicles in Obstacle Avoidance Scenarios (2025-07-16)
The model is the message: Lightweight convolutional autoencoders applied to noisy imaging data for planetary science and astrobiology (2025-07-15)
3D Magnetic Inverse Routine for Single-Segment Magnetic Field Images (2025-07-15)
MonoMVSNet: Monocular Priors Guided Multi-View Stereo Network (2025-07-15)
Towards Depth Foundation Model: Recent Trends in Vision-Based Depth Estimation (2025-07-15)