S. Mahdi H. Miangoleh, Sebastian Dille, Long Mai, Sylvain Paris, Yağız Aksoy
Neural networks have shown great ability in estimating depth from a single image. However, the inferred depth maps are well below one-megapixel resolution and often lack fine-grained details, which limits their practicality. Our method builds on our analysis of how the input resolution and the scene structure affect depth estimation performance. We demonstrate that there is a trade-off between a consistent scene structure and high-frequency details, and we merge low- and high-resolution estimations to take advantage of this duality using a simple depth merging network. We present a double estimation method that improves the whole-image depth estimation and a patch selection method that adds local details to the final result. We demonstrate that by merging estimations at different resolutions with changing context, we can generate multi-megapixel depth maps with a high level of detail using a pre-trained model.
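The trade-off described in the abstract can be illustrated with a small sketch. It is not the learned merging network from the paper: as a stand-in, it assumes a fixed frequency split that keeps the low frequencies (scene structure) of the low-resolution estimate and the high frequencies (detail) of the high-resolution estimate. The helper names `box_blur`, `upsample_nearest`, and `merge_depths`, and the `radius` parameter, are hypothetical and chosen only for this illustration.

```python
import numpy as np


def box_blur(img, radius):
    """Separable box blur with edge padding; a crude low-pass filter."""
    k = 2 * radius + 1
    kernel = np.ones(k) / k
    padded = np.pad(img, radius, mode="edge")
    # Blur along rows first, then along columns.
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded
    )
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows
    )


def upsample_nearest(img, factor):
    """Nearest-neighbour upsampling by an integer factor."""
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)


def merge_depths(low_res, high_res, radius=4):
    """Combine structure from low_res with detail from high_res.

    Assumption: a fixed low-pass/high-pass split replaces the learned
    merging network, and high_res is an integer multiple of low_res in size.
    """
    factor = high_res.shape[0] // low_res.shape[0]
    low_up = upsample_nearest(low_res, factor)
    structure = box_blur(low_up, radius)              # consistent scene layout
    detail = high_res - box_blur(high_res, radius)    # high-frequency residual
    return structure + detail
```

When both inputs agree, for example `merge_depths(low, upsample_nearest(low, 4))`, the result reproduces the high-resolution input, since the blurred structure and the residual sum back to the original map.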
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Depth Estimation | IBims-1 | D3R | 0.3222 | Miangoleh et al. (SGR) |
| Depth Estimation | IBims-1 | ORD | 0.3938 | Miangoleh et al. (SGR) |
| Depth Estimation | IBims-1 | RMSE | 0.1598 | Miangoleh et al. (SGR) |
| Depth Estimation | IBims-1 | δ1.25 | 0.639 | Miangoleh et al. (SGR) |
| Depth Estimation | IBims-1 | D3R | 0.4671 | Miangoleh et al. (MiDaS) |
| Depth Estimation | IBims-1 | ORD | 0.5538 | Miangoleh et al. (MiDaS) |
| Depth Estimation | IBims-1 | RMSE | 0.1965 | Miangoleh et al. (MiDaS) |
| Depth Estimation | IBims-1 | δ1.25 | 0.746 | Miangoleh et al. (MiDaS) |
| Depth Estimation | Middlebury 2014 | D3R | 0.1578 | Miangoleh et al. (MiDaS) |
| Depth Estimation | Middlebury 2014 | ORD | 0.3467 | Miangoleh et al. (MiDaS) |
| Depth Estimation | Middlebury 2014 | RMSE | 0.1557 | Miangoleh et al. (MiDaS) |
| Depth Estimation | Middlebury 2014 | δ1.25 | 0.7406 | Miangoleh et al. (MiDaS) |
| Depth Estimation | Middlebury 2014 | D3R | 0.2324 | Miangoleh et al. (SGR) |
| Depth Estimation | Middlebury 2014 | ORD | 0.3879 | Miangoleh et al. (SGR) |
| Depth Estimation | Middlebury 2014 | RMSE | 0.1973 | Miangoleh et al. (SGR) |
| Depth Estimation | Middlebury 2014 | δ1.25 | 0.7891 | Miangoleh et al. (SGR) |
| 3D | IBims-1 | D3R | 0.3222 | Miangoleh et al. (SGR) |
| 3D | IBims-1 | ORD | 0.3938 | Miangoleh et al. (SGR) |
| 3D | IBims-1 | RMSE | 0.1598 | Miangoleh et al. (SGR) |
| 3D | IBims-1 | δ1.25 | 0.639 | Miangoleh et al. (SGR) |
| 3D | IBims-1 | D3R | 0.4671 | Miangoleh et al. (MiDaS) |
| 3D | IBims-1 | ORD | 0.5538 | Miangoleh et al. (MiDaS) |
| 3D | IBims-1 | RMSE | 0.1965 | Miangoleh et al. (MiDaS) |
| 3D | IBims-1 | δ1.25 | 0.746 | Miangoleh et al. (MiDaS) |
| 3D | Middlebury 2014 | D3R | 0.1578 | Miangoleh et al. (MiDaS) |
| 3D | Middlebury 2014 | ORD | 0.3467 | Miangoleh et al. (MiDaS) |
| 3D | Middlebury 2014 | RMSE | 0.1557 | Miangoleh et al. (MiDaS) |
| 3D | Middlebury 2014 | δ1.25 | 0.7406 | Miangoleh et al. (MiDaS) |
| 3D | Middlebury 2014 | D3R | 0.2324 | Miangoleh et al. (SGR) |
| 3D | Middlebury 2014 | ORD | 0.3879 | Miangoleh et al. (SGR) |
| 3D | Middlebury 2014 | RMSE | 0.1973 | Miangoleh et al. (SGR) |
| 3D | Middlebury 2014 | δ1.25 | 0.7891 | Miangoleh et al. (SGR) |
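
For readers unfamiliar with two of the metrics in the table, the following sketch shows how RMSE and the δ1.25 threshold measure are commonly computed for depth maps. It assumes the standard definitions (root mean squared error, and the fraction of pixels whose ratio max(pred/gt, gt/pred) is below 1.25); whether the values above were computed with exactly this convention, normalization, or alignment is not stated here, so treat this as an illustration rather than the evaluation protocol. ORD and D3R are not covered. The function names are hypothetical.

```python
import numpy as np


def rmse(pred, gt):
    """Root mean squared error between predicted and ground-truth depth."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    return float(np.sqrt(np.mean((pred - gt) ** 2)))


def delta_below(pred, gt, threshold=1.25):
    """Fraction of pixels with max(pred/gt, gt/pred) < threshold.

    Assumes strictly positive depths. Some papers report the complement
    (fraction of pixels at or above the threshold) instead.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    ratio = np.maximum(pred / gt, gt / pred)
    return float(np.mean(ratio < threshold))
```

As a usage example, `rmse([1, 2, 4], [1, 2, 2])` evaluates the error over three pixels where only the last one is wrong, and `delta_below` on the same inputs counts two of the three pixels as within the threshold.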