
Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


ProDepth: Boosting Self-Supervised Multi-Frame Monocular Depth with Probabilistic Fusion

Sungmin Woo, Wonjoon Lee, Woo Jin Kim, Dogyoon Lee, Sangyoun Lee

2024-07-12
Tasks: Unsupervised Monocular Depth Estimation, Depth Prediction, Depth Estimation, Monocular Depth Estimation

Abstract

Self-supervised multi-frame monocular depth estimation relies on the geometric consistency between successive frames under the assumption of a static scene. However, the presence of moving objects in dynamic scenes introduces inevitable inconsistencies, causing misaligned multi-frame feature matching and misleading self-supervision during training. In this paper, we propose a novel framework called ProDepth, which effectively addresses the mismatch problem caused by dynamic objects using a probabilistic approach. We initially deduce the uncertainty associated with static scene assumption by adopting an auxiliary decoder. This decoder analyzes inconsistencies embedded in the cost volume, inferring the probability of areas being dynamic. We then directly rectify the erroneous cost volume for dynamic areas through a Probabilistic Cost Volume Modulation (PCVM) module. Specifically, we derive probability distributions of depth candidates from both single-frame and multi-frame cues, modulating the cost volume by adaptively fusing those distributions based on the inferred uncertainty. Additionally, we present a self-supervision loss reweighting strategy that not only masks out incorrect supervision with high uncertainty but also mitigates the risks in remaining possible dynamic areas in accordance with the probability. Our proposed method excels over state-of-the-art approaches in all metrics on both Cityscapes and KITTI datasets, and demonstrates superior generalization ability on the Waymo Open dataset.
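The abstract describes two uncertainty-driven steps: fusing single-frame and multi-frame distributions over depth candidates, and reweighting the self-supervision loss. The following is a minimal per-pixel sketch of that idea, not the authors' implementation. It assumes the fusion is a convex combination weighted by the inferred dynamic probability and that the loss weight is `1 - dynamic_prob` with a hard cutoff; the function names, the softmax normalisation, and the `mask_threshold` value are all hypothetical choices made for illustration.

```python
import math


def softmax(scores):
    """Turn raw matching scores over depth candidates into a distribution."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_depth_distributions(multi_frame_scores, single_frame_scores, dynamic_prob):
    """Blend two distributions over the same depth candidates for one pixel.

    Assumption: a convex combination. A pixel judged static (dynamic_prob
    near 0) trusts the multi-frame cue; a pixel judged dynamic (near 1)
    falls back to the single-frame cue.
    """
    if not 0.0 <= dynamic_prob <= 1.0:
        raise ValueError("dynamic_prob must lie in [0, 1]")
    if len(multi_frame_scores) != len(single_frame_scores):
        raise ValueError("both cues must score the same depth candidates")
    p_multi = softmax(multi_frame_scores)
    p_single = softmax(single_frame_scores)
    return [
        (1.0 - dynamic_prob) * pm + dynamic_prob * ps
        for pm, ps in zip(p_multi, p_single)
    ]


def reweight_loss(per_pixel_loss, dynamic_prob, mask_threshold=0.5):
    """Mask out highly uncertain pixels and down-weight the remaining ones.

    Assumption: the threshold and the linear weight are illustrative only.
    """
    if dynamic_prob >= mask_threshold:
        return 0.0
    return (1.0 - dynamic_prob) * per_pixel_loss


if __name__ == "__main__":
    fused = fuse_depth_distributions([2.0, 0.0, 0.0], [0.0, 0.0, 2.0], 0.8)
    print([round(p, 3) for p in fused])
    print(reweight_loss(2.0, 0.25), reweight_loss(2.0, 0.9))
```

Because both inputs are normalised and the weights sum to one, the fused result is itself a valid probability distribution, which is what makes a convex combination a convenient stand-in here.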

Results

| Task | Dataset | Metric | Value | Model |
| --- | --- | --- | --- | --- |
| Depth Estimation | KITTI Eigen split unsupervised | Delta < 1.25 | 0.918 | ProDepth |
| Depth Estimation | KITTI Eigen split unsupervised | Delta < 1.25^2 | 0.969 | ProDepth |
| Depth Estimation | KITTI Eigen split unsupervised | Delta < 1.25^3 | 0.984 | ProDepth |
| Depth Estimation | KITTI Eigen split unsupervised | RMSE | 4.139 | ProDepth |
| Depth Estimation | KITTI Eigen split unsupervised | RMSE log | 0.166 | ProDepth |
| Depth Estimation | KITTI Eigen split unsupervised | Sq Rel | 0.629 | ProDepth |
| Depth Estimation | KITTI Eigen split unsupervised | absolute relative error | 0.086 | ProDepth |
| Depth Estimation | KITTI Eigen split unsupervised | Delta < 1.25 | 0.902 | ProDepth(M+640x192) |
| Depth Estimation | KITTI Eigen split unsupervised | Delta < 1.25^2 | 0.967 | ProDepth(M+640x192) |
| Depth Estimation | KITTI Eigen split unsupervised | Delta < 1.25^3 | 0.985 | ProDepth(M+640x192) |
| Depth Estimation | KITTI Eigen split unsupervised | RMSE | 4.345 | ProDepth(M+640x192) |
| Depth Estimation | KITTI Eigen split unsupervised | RMSE log | 0.172 | ProDepth(M+640x192) |
| Depth Estimation | KITTI Eigen split unsupervised | Sq Rel | 0.693 | ProDepth(M+640x192) |
| Depth Estimation | KITTI Eigen split unsupervised | absolute relative error | 0.095 | ProDepth(M+640x192) |
| 3D | KITTI Eigen split unsupervised | Delta < 1.25 | 0.918 | ProDepth |
| 3D | KITTI Eigen split unsupervised | Delta < 1.25^2 | 0.969 | ProDepth |
| 3D | KITTI Eigen split unsupervised | Delta < 1.25^3 | 0.984 | ProDepth |
| 3D | KITTI Eigen split unsupervised | RMSE | 4.139 | ProDepth |
| 3D | KITTI Eigen split unsupervised | RMSE log | 0.166 | ProDepth |
| 3D | KITTI Eigen split unsupervised | Sq Rel | 0.629 | ProDepth |
| 3D | KITTI Eigen split unsupervised | absolute relative error | 0.086 | ProDepth |
| 3D | KITTI Eigen split unsupervised | Delta < 1.25 | 0.902 | ProDepth(M+640x192) |
| 3D | KITTI Eigen split unsupervised | Delta < 1.25^2 | 0.967 | ProDepth(M+640x192) |
| 3D | KITTI Eigen split unsupervised | Delta < 1.25^3 | 0.985 | ProDepth(M+640x192) |
| 3D | KITTI Eigen split unsupervised | RMSE | 4.345 | ProDepth(M+640x192) |
| 3D | KITTI Eigen split unsupervised | RMSE log | 0.172 | ProDepth(M+640x192) |
| 3D | KITTI Eigen split unsupervised | Sq Rel | 0.693 | ProDepth(M+640x192) |
| 3D | KITTI Eigen split unsupervised | absolute relative error | 0.095 | ProDepth(M+640x192) |
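
For readers unfamiliar with the metric names in the results above, here is a small sketch of how they are conventionally computed in the monocular depth literature. It assumes this page uses those conventional definitions, which the page itself does not state. The function name `depth_metrics` is hypothetical, inputs are flat lists of positive depths in metres, and evaluation details such as depth capping, cropping, or median scaling are deliberately left out.

```python
import math


def depth_metrics(pred, gt):
    """Compute common depth-estimation error and accuracy metrics.

    pred, gt: equal-length sequences of positive depths for valid pixels.
    Lower is better for the error metrics; higher is better for the
    delta accuracies.
    """
    if len(pred) != len(gt) or len(gt) == 0:
        raise ValueError("pred and gt must be non-empty and the same length")
    n = len(gt)
    pairs = list(zip(pred, gt))

    # Error metrics.
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / n
    sq_rel = sum((p - g) ** 2 / g for p, g in pairs) / n
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)
    rmse_log = math.sqrt(
        sum((math.log(p) - math.log(g)) ** 2 for p, g in pairs) / n
    )

    # Accuracy metrics: fraction of pixels whose ratio max(p/g, g/p)
    # falls below 1.25, 1.25^2 and 1.25^3.
    ratios = [max(p / g, g / p) for p, g in pairs]
    deltas = {
        f"delta_{k}": sum(1 for r in ratios if r < 1.25 ** k) / n
        for k in (1, 2, 3)
    }

    return {
        "abs_rel": abs_rel,
        "sq_rel": sq_rel,
        "rmse": rmse,
        "rmse_log": rmse_log,
        **deltas,
    }


if __name__ == "__main__":
    for name, value in depth_metrics([1.0, 2.0, 4.0], [1.0, 2.0, 2.0]).items():
        print(f"{name}: {value:.3f}")
```

Under these definitions, "Delta < 1.25" near 1.0 means almost every pixel is within 25% of the ground truth, while "absolute relative error" and "Sq Rel" penalise errors in proportion to the true depth.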

Related Papers

$S^2M^2$: Scalable Stereo Matching Model for Reliable Depth Estimation (2025-07-17)
$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning (2025-07-17)
Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation (2025-07-16)
Vision-based Perception for Autonomous Vehicles in Obstacle Avoidance Scenarios (2025-07-16)
MonoMVSNet: Monocular Priors Guided Multi-View Stereo Network (2025-07-15)
Towards Depth Foundation Model: Recent Trends in Vision-Based Depth Estimation (2025-07-15)
Cameras as Relative Positional Encoding (2025-07-14)
ByDeWay: Boost Your multimodal LLM with DEpth prompting in a Training-Free Way (2025-07-11)