Wei Yin, Jianming Zhang, Oliver Wang, Simon Niklaus, Long Mai, Simon Chen, Chunhua Shen
Despite significant progress in monocular depth estimation in the wild, recent state-of-the-art methods cannot be used to recover accurate 3D scene shape. There are two causes: an unknown depth shift induced by the shift-invariant reconstruction losses used in mixed-data depth prediction training, and a possibly unknown camera focal length. We investigate this problem in detail and propose a two-stage framework that first predicts depth up to an unknown scale and shift from a single monocular image, and then uses 3D point cloud encoders to predict the missing depth shift and focal length, which allows us to recover a realistic 3D scene shape. In addition, we propose an image-level normalized regression loss and a normal-based geometry loss to enhance depth prediction models trained on mixed datasets. We test our depth model on nine unseen datasets and achieve state-of-the-art performance on zero-shot dataset generalization. Code is available at: https://git.io/Depth
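
To illustrate why the depth shift and focal length both matter for 3D shape, the following is a minimal sketch of pinhole back-projection. It is not the authors' implementation. The function name `unproject_depth` is hypothetical, the shift and focal length are taken as given inputs rather than predicted by point cloud encoders, and the principal point is assumed to lie at the image center with square pixels.

```python
import numpy as np


def unproject_depth(depth, focal_length, shift=0.0):
    """Back-project a depth map into an (H*W, 3) point cloud.

    Assumptions (not from the paper): pinhole camera, square pixels,
    principal point at the image center, focal length in pixels.
    """
    h, w = depth.shape
    # Pixel coordinate grids, both of shape (h, w).
    u, v = np.meshgrid(np.arange(w, dtype=np.float64),
                       np.arange(h, dtype=np.float64))
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0

    # A wrong shift changes z non-uniformly relative to x and y,
    # which distorts the recovered shape rather than just rescaling it.
    z = depth + shift
    x = (u - cx) * z / focal_length
    y = (v - cy) * z / focal_length
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)


# Usage: a 3x3 depth map that is off by a shift of 0.5.
points = unproject_depth(np.full((3, 3), 1.5), focal_length=2.0, shift=0.5)
```

A global scale error only resizes the resulting point cloud, whereas an error in `shift` or `focal_length` changes the ratios between `x`, `y`, and `z`, which is why those two quantities have to be recovered before the shape looks realistic.
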
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Depth Estimation | ScanNetV2 | absolute relative error | 0.095 | LeReS |
| Depth Estimation | DIODE | Delta < 1.25 | 0.234 | LeReS |
| Depth Estimation | NYU-Depth V2 | Delta < 1.25 | 0.916 | LeReS |
| Depth Estimation | NYU-Depth V2 | absolute relative error | 0.09 | LeReS |
| Depth Estimation | ETH3D | Delta < 1.25 | 0.0777 | LeReS |
| Depth Estimation | ETH3D | absolute relative error | 0.0171 | LeReS |
| Depth Estimation | KITTI Eigen split | Delta < 1.25 | 0.784 | LeReS |
| Depth Estimation | KITTI Eigen split | absolute relative error | 0.149 | LeReS |
| Depth Estimation | DIODE | Delta < 1.25^3 | 0.9 | LeReS |
| 3D | ScanNetV2 | absolute relative error | 0.095 | LeReS |
| 3D | DIODE | Delta < 1.25 | 0.234 | LeReS |
| 3D | NYU-Depth V2 | Delta < 1.25 | 0.916 | LeReS |
| 3D | NYU-Depth V2 | absolute relative error | 0.09 | LeReS |
| 3D | ETH3D | Delta < 1.25 | 0.0777 | LeReS |
| 3D | ETH3D | absolute relative error | 0.0171 | LeReS |
| 3D | KITTI Eigen split | Delta < 1.25 | 0.784 | LeReS |
| 3D | KITTI Eigen split | absolute relative error | 0.149 | LeReS |
| 3D | DIODE | Delta < 1.25^3 | 0.9 | LeReS |
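
The two metrics in the table follow the standard definitions: absolute relative error is the mean of |pred − gt| / gt, and Delta < 1.25^k is the fraction of pixels whose ratio max(pred/gt, gt/pred) falls below 1.25^k. The sketch below computes both. The least-squares scale and shift alignment step is an assumption about how predictions that are only defined up to scale and shift are typically compared with ground truth, and the function names are hypothetical rather than taken from the released code.

```python
import numpy as np


def align_scale_shift(pred, gt):
    """Least-squares fit of scale and shift so that scale * pred + shift ~ gt.

    Assumed evaluation step for depth predicted up to scale and shift.
    """
    A = np.stack([pred.ravel(), np.ones(pred.size)], axis=1)
    (scale, shift), *_ = np.linalg.lstsq(A, gt.ravel(), rcond=None)
    return scale * pred + shift


def abs_rel(pred, gt):
    """Absolute relative error: mean(|pred - gt| / gt). Lower is better."""
    return float(np.mean(np.abs(pred - gt) / gt))


def delta_accuracy(pred, gt, power=1):
    """Fraction of pixels with max(pred/gt, gt/pred) < 1.25**power. Higher is better."""
    ratio = np.maximum(pred / gt, gt / pred)
    return float(np.mean(ratio < 1.25 ** power))


# Usage: a prediction that differs from ground truth only by scale and shift.
gt = np.array([1.0, 2.0, 4.0, 8.0])
pred = (gt - 0.5) / 2.0
aligned = align_scale_shift(pred, gt)
error = abs_rel(aligned, gt)          # close to zero after alignment
accuracy = delta_accuracy(aligned, gt)
```

Both functions expect strictly positive depth values with invalid pixels already masked out, since the ratios are undefined at zero.
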