Yasunori Ishii, Takayoshi Yamashita
Collecting large-scale data for monocular depth estimation is difficult because the task requires the simultaneous acquisition of RGB images and depths. Data augmentation is therefore important to this task. However, there has been little research on data augmentation for tasks such as monocular depth estimation, where the transformation is performed pixel by pixel. In this paper, we propose a data augmentation method called CutDepth. In CutDepth, part of the depth map is pasted onto the input image during training. The method increases data variation without destroying edge features. Experiments objectively and subjectively show that the proposed method outperforms conventional data augmentation methods. Estimation accuracy improves with CutDepth even though there are few training data at long distances.
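The core idea, pasting a rectangular region of the ground-truth depth map onto the RGB input, can be sketched as below. This is a minimal NumPy illustration, not the authors' implementation: the exact sampling of the region's position and size (and the role of the hyperparameter `p` bounding the region's extent) is an assumption here.

```python
import numpy as np

def cutdepth(image, depth, p=0.75, rng=None):
    """Paste a random rectangular crop of the depth map onto the RGB image.

    image: (H, W, 3) float array; depth: (H, W) float array.
    p bounds the fraction of each side the pasted region may cover
    (the sampling scheme here is an illustrative assumption).
    """
    rng = rng or np.random.default_rng()
    h, w = depth.shape
    # Sample the region's size, then a top-left corner that keeps it in bounds.
    cw = int(rng.uniform(0, p) * w)
    ch = int(rng.uniform(0, p) * h)
    x0 = int(rng.integers(0, w - cw + 1))
    y0 = int(rng.integers(0, h - ch + 1))
    out = image.copy()
    # Replicate the single-channel depth across RGB channels and paste it in.
    out[y0:y0 + ch, x0:x0 + cw, :] = depth[y0:y0 + ch, x0:x0 + cw, None]
    return out
```

Because only a sub-region of the image is replaced, the geometry of the scene outside the pasted patch, and in particular its edges, is left intact.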
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Depth Estimation | NYU-Depth V2 | Delta < 1.25 | 0.899 | CutDepth |
| Depth Estimation | NYU-Depth V2 | Delta < 1.25^2 | 0.985 | CutDepth |
| Depth Estimation | NYU-Depth V2 | Delta < 1.25^3 | 0.997 | CutDepth |
| Depth Estimation | NYU-Depth V2 | RMSE | 0.375 | CutDepth |
| Depth Estimation | NYU-Depth V2 | Absolute relative error | 0.104 | CutDepth |
| Depth Estimation | NYU-Depth V2 | log10 error | 0.044 | CutDepth |