Ziyue Feng, Liang Yang, Longlong Jing, HaiYan Wang, YingLi Tian, Bing Li
Conventional self-supervised monocular depth prediction methods assume a static environment, so their accuracy degrades in dynamic scenes because of the mismatch and occlusion problems introduced by object motion. Existing methods that focus on dynamic objects only partially solve the mismatch problem, and only at the training loss level. In this paper, we propose a novel multi-frame monocular depth prediction method that addresses these problems at both the prediction level and the supervision loss level. Our method, called DynamicDepth, is a new framework trained via a self-supervised cycle-consistent learning scheme. A Dynamic Object Motion Disentanglement (DOMD) module is proposed to disentangle object motion and thereby solve the mismatch problem. Moreover, a novel occlusion-aware Cost Volume and Re-projection Loss are designed to alleviate the occlusion effects of object motion. Extensive analyses and experiments on the Cityscapes and KITTI datasets show that our method significantly outperforms state-of-the-art monocular depth prediction methods, especially in regions containing dynamic objects. Code is available at https://github.com/AutoAILab/DynamicDepth
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Depth Estimation | KITTI Eigen split unsupervised | Delta < 1.25 | 0.897 | DynamicDepth (M+640x192) |
| Depth Estimation | KITTI Eigen split unsupervised | Delta < 1.25^2 | 0.964 | DynamicDepth (M+640x192) |
| Depth Estimation | KITTI Eigen split unsupervised | Delta < 1.25^3 | 0.984 | DynamicDepth (M+640x192) |
| Depth Estimation | KITTI Eigen split unsupervised | RMSE | 4.458 | DynamicDepth (M+640x192) |
| Depth Estimation | KITTI Eigen split unsupervised | RMSE log | 0.175 | DynamicDepth (M+640x192) |
| Depth Estimation | KITTI Eigen split unsupervised | Sq Rel | 0.72 | DynamicDepth (M+640x192) |
| Depth Estimation | KITTI Eigen split unsupervised | Abs Rel | 0.096 | DynamicDepth (M+640x192) |
| 3D | KITTI Eigen split unsupervised | Delta < 1.25 | 0.897 | DynamicDepth (M+640x192) |
| 3D | KITTI Eigen split unsupervised | Delta < 1.25^2 | 0.964 | DynamicDepth (M+640x192) |
| 3D | KITTI Eigen split unsupervised | Delta < 1.25^3 | 0.984 | DynamicDepth (M+640x192) |
| 3D | KITTI Eigen split unsupervised | RMSE | 4.458 | DynamicDepth (M+640x192) |
| 3D | KITTI Eigen split unsupervised | RMSE log | 0.175 | DynamicDepth (M+640x192) |
| 3D | KITTI Eigen split unsupervised | Sq Rel | 0.72 | DynamicDepth (M+640x192) |
| 3D | KITTI Eigen split unsupervised | Abs Rel | 0.096 | DynamicDepth (M+640x192) |
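
For readers unfamiliar with the metrics in the table, the following is a minimal sketch of how they are commonly defined in the monocular depth literature: absolute relative error, squared relative error (Sq Rel), RMSE, RMSE log, and the three threshold accuracies (Delta < 1.25, 1.25^2, 1.25^3). It assumes these standard definitions apply here; the exact evaluation protocol behind the reported numbers (depth cap, cropping, median scaling) is not stated in this text and is not reproduced. The function name `depth_metrics` and the flat-list inputs are hypothetical, chosen only for illustration.

```python
import math


def depth_metrics(gt, pred):
    """Standard depth-error metrics over paired, strictly positive depths.

    gt, pred: equal-length sequences of ground-truth and predicted depths
    for the valid pixels only (hypothetical interface for illustration).
    """
    if len(gt) != len(pred) or len(gt) == 0:
        raise ValueError("gt and pred must be non-empty and of equal length")
    n = len(gt)

    # Relative errors are normalized by the ground-truth depth.
    abs_rel = sum(abs(g - p) / g for g, p in zip(gt, pred)) / n
    sq_rel = sum((g - p) ** 2 / g for g, p in zip(gt, pred)) / n

    # Root-mean-square error in linear and in log depth.
    rmse = math.sqrt(sum((g - p) ** 2 for g, p in zip(gt, pred)) / n)
    rmse_log = math.sqrt(
        sum((math.log(g) - math.log(p)) ** 2 for g, p in zip(gt, pred)) / n
    )

    # Threshold accuracy: fraction of pixels whose ratio max(g/p, p/g)
    # falls below 1.25, 1.25^2, and 1.25^3.
    ratios = [max(g / p, p / g) for g, p in zip(gt, pred)]
    deltas = [sum(r < 1.25 ** k for r in ratios) / n for k in (1, 2, 3)]

    return {
        "abs_rel": abs_rel,
        "sq_rel": sq_rel,
        "rmse": rmse,
        "rmse_log": rmse_log,
        "delta_1": deltas[0],
        "delta_2": deltas[1],
        "delta_3": deltas[2],
    }
```

Lower is better for the four error metrics, and higher is better for the three threshold accuracies. In practice the same formulas are usually applied with array operations over masked depth maps rather than Python lists.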