Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Unsupervised Scale-consistent Depth Learning from Video

Jia-Wang Bian, Huangying Zhan, Naiyan Wang, Zhichao Li, Le Zhang, Chunhua Shen, Ming-Ming Cheng, Ian Reid

Date: 2021-05-25
Tasks: Monocular Visual Odometry, Depth Estimation, Simultaneous Localization and Mapping, Monocular Depth Estimation

Abstract

We propose a monocular depth estimator, SC-Depth, which requires only unlabelled videos for training and enables scale-consistent prediction at inference time. Our contributions include: (i) we propose a geometry consistency loss, which penalizes the inconsistency of predicted depths between adjacent views; (ii) we propose a self-discovered mask to automatically localize moving objects that violate the underlying static scene assumption and cause noisy signals during training; (iii) we demonstrate the efficacy of each component with a detailed ablation study and show high-quality depth estimation results on both the KITTI and NYUv2 datasets. Moreover, thanks to the capability of scale-consistent prediction, we show that our monocular-trained deep networks are readily integrated into the ORB-SLAM2 system for more robust and accurate tracking. The proposed hybrid Pseudo-RGBD SLAM shows compelling results on KITTI, and it generalizes well to the KAIST dataset without additional training. Finally, we provide several demos for qualitative evaluation.
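
As a rough illustration of contributions (i) and (ii), the sketch below computes a per-pixel depth inconsistency between two views and derives both a loss and a soft mask from it. It assumes the depth map of one view has already been warped into the other view, and that both maps are given as flat lists of positive depths. The normalized-difference form, the function names, and the mask definition (one minus the inconsistency) are assumptions made for illustration; the abstract does not spell out these details.

```python
def depth_inconsistency(warped_depth, interp_depth):
    """Per-pixel normalized depth difference, in [0, 1) for positive depths.

    warped_depth: depths of view A projected into view B (assumed given).
    interp_depth: depths predicted for view B, sampled at the same pixels.
    """
    if len(warped_depth) != len(interp_depth) or not warped_depth:
        raise ValueError("inputs must be non-empty and of equal length")
    return [abs(a - b) / (a + b) for a, b in zip(warped_depth, interp_depth)]


def geometry_consistency_loss(warped_depth, interp_depth):
    """Mean inconsistency over all pixels; 0 when the two views agree."""
    diff = depth_inconsistency(warped_depth, interp_depth)
    return sum(diff) / len(diff)


def self_discovered_mask(warped_depth, interp_depth):
    """Soft weights: near 1 for consistent pixels, lower where views disagree
    (for example on moving objects), so those pixels contribute less."""
    return [1.0 - d for d in depth_inconsistency(warped_depth, interp_depth)]


# Hypothetical toy input: the third pixel disagrees strongly between views.
warped = [2.0, 4.0, 10.0]
interp = [2.0, 4.0, 30.0]
print(self_discovered_mask(warped, interp))  # -> [1.0, 1.0, 0.5]
```

Normalizing by the sum of the two depths keeps the measure symmetric and bounded, so near and far pixels are treated on a comparable footing regardless of absolute scale.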

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Depth Estimation | KITTI Eigen split | Delta < 1.25 | 0.873 | SC-Depth (ResNet 50) |
| Depth Estimation | KITTI Eigen split | Delta < 1.25^2 | 0.96 | SC-Depth (ResNet 50) |
| Depth Estimation | KITTI Eigen split | Delta < 1.25^3 | 0.982 | SC-Depth (ResNet 50) |
| Depth Estimation | KITTI Eigen split | RMSE | 4.706 | SC-Depth (ResNet 50) |
| Depth Estimation | KITTI Eigen split | RMSE log | 0.191 | SC-Depth (ResNet 50) |
| Depth Estimation | KITTI Eigen split | absolute relative error | 0.114 | SC-Depth (ResNet 50) |
| Depth Estimation | KITTI Eigen split | Delta < 1.25 | 0.863 | SC-Depth (ResNet18) |
| Depth Estimation | KITTI Eigen split | Delta < 1.25^2 | 0.957 | SC-Depth (ResNet18) |
| Depth Estimation | KITTI Eigen split | Delta < 1.25^3 | 0.981 | SC-Depth (ResNet18) |
| Depth Estimation | KITTI Eigen split | RMSE | 4.95 | SC-Depth (ResNet18) |
| Depth Estimation | KITTI Eigen split | RMSE log | 0.197 | SC-Depth (ResNet18) |
| Depth Estimation | KITTI Eigen split | absolute relative error | 0.119 | SC-Depth (ResNet18) |
| Depth Estimation | NYU-Depth V2 self-supervised | Absolute relative error (AbsRel) | 0.157 | Bian et al |
| Depth Estimation | NYU-Depth V2 self-supervised | Root mean square error (RMSE) | 0.593 | Bian et al |
| Depth Estimation | NYU-Depth V2 self-supervised | delta_1 | 78 | Bian et al |
| Depth Estimation | NYU-Depth V2 self-supervised | delta_2 | 94 | Bian et al |
| Depth Estimation | NYU-Depth V2 self-supervised | delta_3 | 98.4 | Bian et al |

The same 17 results are also listed under the 3D task, with identical datasets, metrics, values, and models.
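
For readers unfamiliar with the metric names used in these results, the following is a minimal sketch of how they are conventionally computed for depth estimation. It assumes predicted and ground-truth depths are flat lists of positive values over valid pixels; the function name and dictionary keys are hypothetical. It omits preprocessing that benchmark evaluations typically apply (such as depth capping, cropping, or scale alignment), so it should not be read as the exact protocol behind the listed numbers. The threshold accuracies are returned as fractions; the NYU-Depth V2 delta values listed here appear to be the same quantity expressed in percent.

```python
import math


def depth_metrics(pred, gt):
    """Standard depth-estimation error and accuracy metrics.

    pred, gt: equal-length lists of positive depths (valid pixels only).
    """
    if len(pred) != len(gt) or not gt:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(gt)
    # Threshold accuracy uses the larger of the two depth ratios per pixel.
    ratios = [max(p / g, g / p) for p, g in zip(pred, gt)]
    return {
        # Absolute relative error: mean of |pred - gt| / gt.
        "abs_rel": sum(abs(p - g) / g for p, g in zip(pred, gt)) / n,
        # Root mean square error in depth units.
        "rmse": math.sqrt(sum((p - g) ** 2 for p, g in zip(pred, gt)) / n),
        # Root mean square error of log depths.
        "rmse_log": math.sqrt(
            sum((math.log(p) - math.log(g)) ** 2 for p, g in zip(pred, gt)) / n
        ),
        # Fraction of pixels whose ratio is below 1.25, 1.25^2, 1.25^3.
        "delta_1": sum(r < 1.25 for r in ratios) / n,
        "delta_2": sum(r < 1.25 ** 2 for r in ratios) / n,
        "delta_3": sum(r < 1.25 ** 3 for r in ratios) / n,
    }


# Hypothetical toy input: two exact pixels and one that is 50% too deep.
metrics = depth_metrics(pred=[1.0, 2.0, 3.0], gt=[1.0, 2.0, 2.0])
print(metrics["delta_2"])  # -> 1.0
```

Lower is better for the error metrics (absolute relative error, RMSE, RMSE log), while higher is better for the delta accuracies.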

Related Papers

DINO-VO: A Feature-based Visual Odometry Leveraging a Visual Foundation Model (2025-07-17)
$S^2M^2$: Scalable Stereo Matching Model for Reliable Depth Estimation (2025-07-17)
$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning (2025-07-17)
Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation (2025-07-16)
Vision-based Perception for Autonomous Vehicles in Obstacle Avoidance Scenarios (2025-07-16)
MonoMVSNet: Monocular Priors Guided Multi-View Stereo Network (2025-07-15)
Towards Depth Foundation Model: Recent Trends in Vision-Based Depth Estimation (2025-07-15)
Cameras as Relative Positional Encoding (2025-07-14)