Unsupervised Scale-consistent Depth Learning from Video

Jia-Wang Bian, Huangying Zhan, Naiyan Wang, Zhichao Li, Le Zhang, Chunhua Shen, Ming-Ming Cheng, Ian Reid

2021-05-25

Monocular Visual Odometry, Depth Estimation, Simultaneous Localization and Mapping, Monocular Depth Estimation

Abstract

We propose a monocular depth estimator, SC-Depth, which requires only unlabelled videos for training and enables scale-consistent prediction at inference time. Our contributions include: (i) we propose a geometry consistency loss, which penalizes the inconsistency of predicted depths between adjacent views; (ii) we propose a self-discovered mask to automatically localize moving objects that violate the underlying static scene assumption and cause noisy signals during training; (iii) we demonstrate the efficacy of each component with a detailed ablation study and show high-quality depth estimation results on both the KITTI and NYUv2 datasets. Moreover, thanks to the capability of scale-consistent prediction, we show that our monocular-trained deep networks are readily integrated into the ORB-SLAM2 system for more robust and accurate tracking. The proposed hybrid Pseudo-RGBD SLAM shows compelling results on KITTI, and it generalizes well to the KAIST dataset without additional training. Finally, we provide several demos for qualitative evaluation.
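
Contributions (i) and (ii) can be illustrated with a minimal sketch. The abstract does not give the formula, so the code below assumes a normalized depth-difference form: the depth of one view, warped into the adjacent view, is compared with the depth predicted for that adjacent view, and the mask weight is one minus that difference. The function names and flat-list inputs are hypothetical, and the warping step (which needs camera pose and intrinsics) is taken as already done.

```python
def depth_inconsistency(warped_depth, predicted_depth):
    """Per-pixel normalized depth difference (assumed form, not taken from the abstract).

    Both inputs are flat lists of positive depths for the same pixels:
    `warped_depth` is the depth of view a projected into view b,
    `predicted_depth` is the depth predicted directly for view b.
    """
    return [abs(w - p) / (w + p) for w, p in zip(warped_depth, predicted_depth)]


def geometry_consistency_loss(warped_depth, predicted_depth):
    """Mean inconsistency over all pixels; zero when both views agree."""
    diff = depth_inconsistency(warped_depth, predicted_depth)
    return sum(diff) / len(diff)


def self_discovered_mask(warped_depth, predicted_depth):
    """Weights near 1 for consistent pixels, lower for inconsistent ones
    (for example, pixels on moving objects)."""
    return [1.0 - d for d in depth_inconsistency(warped_depth, predicted_depth)]


# Toy example: the third pixel disagrees strongly between the two views.
warped = [2.0, 4.0, 10.0]
predicted = [2.0, 4.0, 30.0]
loss = geometry_consistency_loss(warped, predicted)
mask = self_discovered_mask(warped, predicted)
```

Because the difference is divided by the sum of the two depths, multiplying both depth maps by the same factor leaves the value unchanged, so the term only penalizes disagreement between views rather than any particular absolute scale.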

Results

Task	Dataset	Metric	Value	Model
Depth Estimation	KITTI Eigen split	Delta < 1.25	0.873	SC-Depth (ResNet 50)
Depth Estimation	KITTI Eigen split	Delta < 1.25^2	0.96	SC-Depth (ResNet 50)
Depth Estimation	KITTI Eigen split	Delta < 1.25^3	0.982	SC-Depth (ResNet 50)
Depth Estimation	KITTI Eigen split	RMSE	4.706	SC-Depth (ResNet 50)
Depth Estimation	KITTI Eigen split	RMSE log	0.191	SC-Depth (ResNet 50)
Depth Estimation	KITTI Eigen split	absolute relative error	0.114	SC-Depth (ResNet 50)
Depth Estimation	KITTI Eigen split	Delta < 1.25	0.863	SC-Depth (ResNet18)
Depth Estimation	KITTI Eigen split	Delta < 1.25^2	0.957	SC-Depth (ResNet18)
Depth Estimation	KITTI Eigen split	Delta < 1.25^3	0.981	SC-Depth (ResNet18)
Depth Estimation	KITTI Eigen split	RMSE	4.95	SC-Depth (ResNet18)
Depth Estimation	KITTI Eigen split	RMSE log	0.197	SC-Depth (ResNet18)
Depth Estimation	KITTI Eigen split	absolute relative error	0.119	SC-Depth (ResNet18)
Depth Estimation	NYU-Depth V2 self-supervised	Absolute relative error (AbsRel)	0.157	Bian et al.
Depth Estimation	NYU-Depth V2 self-supervised	Root mean square error (RMSE)	0.593	Bian et al.
Depth Estimation	NYU-Depth V2 self-supervised	delta_1 (%)	78	Bian et al.
Depth Estimation	NYU-Depth V2 self-supervised	delta_2 (%)	94	Bian et al.
Depth Estimation	NYU-Depth V2 self-supervised	delta_3 (%)	98.4	Bian et al.
3D	KITTI Eigen split	Delta < 1.25	0.873	SC-Depth (ResNet 50)
3D	KITTI Eigen split	Delta < 1.25^2	0.96	SC-Depth (ResNet 50)
3D	KITTI Eigen split	Delta < 1.25^3	0.982	SC-Depth (ResNet 50)
3D	KITTI Eigen split	RMSE	4.706	SC-Depth (ResNet 50)
3D	KITTI Eigen split	RMSE log	0.191	SC-Depth (ResNet 50)
3D	KITTI Eigen split	absolute relative error	0.114	SC-Depth (ResNet 50)
3D	KITTI Eigen split	Delta < 1.25	0.863	SC-Depth (ResNet18)
3D	KITTI Eigen split	Delta < 1.25^2	0.957	SC-Depth (ResNet18)
3D	KITTI Eigen split	Delta < 1.25^3	0.981	SC-Depth (ResNet18)
3D	KITTI Eigen split	RMSE	4.95	SC-Depth (ResNet18)
3D	KITTI Eigen split	RMSE log	0.197	SC-Depth (ResNet18)
3D	KITTI Eigen split	absolute relative error	0.119	SC-Depth (ResNet18)
3D	NYU-Depth V2 self-supervised	Absolute relative error (AbsRel)	0.157	Bian et al.
3D	NYU-Depth V2 self-supervised	Root mean square error (RMSE)	0.593	Bian et al.
3D	NYU-Depth V2 self-supervised	delta_1 (%)	78	Bian et al.
3D	NYU-Depth V2 self-supervised	delta_2 (%)	94	Bian et al.
3D	NYU-Depth V2 self-supervised	delta_3 (%)	98.4	Bian et al.
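
For readers unfamiliar with the metrics in the table, the following is a minimal sketch of how they are conventionally computed from predicted and ground-truth depths. It uses the standard definitions and is not code from the paper; `median_scale` and `depth_metrics` are hypothetical names. Median scaling is included because monocular predictions are usually only defined up to scale and are commonly aligned to the ground truth this way before evaluation. The delta values are returned as fractions, as in the KITTI rows; the NYU-Depth V2 rows appear to report the same thresholds as percentages.

```python
import math
from statistics import median


def median_scale(pred, gt):
    """Align an up-to-scale prediction to the ground truth by the ratio of medians."""
    s = median(gt) / median(pred)
    return [s * p for p in pred]


def depth_metrics(pred, gt):
    """Standard depth metrics over flat lists of positive depths."""
    n = len(gt)
    pairs = list(zip(pred, gt))
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / n
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)
    rmse_log = math.sqrt(sum((math.log(p) - math.log(g)) ** 2 for p, g in pairs) / n)
    # Accuracy under threshold: fraction of pixels with max(pred/gt, gt/pred) < 1.25^k.
    ratios = [max(p / g, g / p) for p, g in pairs]
    deltas = {f"delta_{k}": sum(r < 1.25 ** k for r in ratios) / n for k in (1, 2, 3)}
    return {"abs_rel": abs_rel, "rmse": rmse, "rmse_log": rmse_log, **deltas}


# Toy example: the prediction is half the true scale and wrong on the third pixel.
gt = [10.0, 20.0, 40.0]
pred = [5.0, 10.0, 30.0]
scaled = median_scale(pred, gt)
metrics = depth_metrics(scaled, gt)
```

Lower is better for the three error metrics, and higher is better for the delta accuracies.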
