FutureDepth: Learning to Predict the Future Improves Video Depth Estimation

Rajeev Yasarla, Manish Kumar Singh, Hong Cai, Yunxiao Shi, Jisoo Jeong, Yinhao Zhu, Shizhong Han, Risheek Garrepalli, Fatih Porikli

2024-03-19

Future prediction, Depth Estimation, Monocular Depth Estimation

Abstract

In this paper, we propose a novel video depth estimation approach, FutureDepth, which enables the model to implicitly leverage multi-frame and motion cues to improve depth estimation by making it learn to predict the future during training. More specifically, we propose a future prediction network, F-Net, which takes the features of multiple consecutive frames and is trained to iteratively predict multi-frame features one time step ahead. In this way, F-Net learns the underlying motion and correspondence information, and we incorporate its features into the depth decoding process. Additionally, to enrich the learning of multi-frame correspondence cues, we further leverage a reconstruction network, R-Net, which is trained via adaptively masked auto-encoding of multi-frame feature volumes. At inference time, both F-Net and R-Net are used to produce queries that work with the depth decoder, as well as with a final refinement network. Through extensive experiments on several benchmarks, i.e., NYUDv2, KITTI, DDAD, and Sintel, which cover indoor, driving, and open-domain scenarios, we show that FutureDepth significantly improves upon baseline models, outperforms existing video depth estimation methods, and sets new state-of-the-art (SOTA) accuracy. Furthermore, FutureDepth is more efficient than existing SOTA video depth estimation models and has latencies similar to those of monocular models.
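
The iterative, one-step-ahead prediction described in the abstract can be illustrated with a toy rollout loop. This is only a sketch of the general idea, not the paper's F-Net: the names `rollout` and `extrapolate_next` are hypothetical, features are plain lists of floats rather than feature maps, and the learned predictor is replaced by simple linear extrapolation so the example stays self-contained.

```python
def extrapolate_next(window):
    """Hypothetical stand-in for a learned predictor (NOT the paper's F-Net).

    Linearly extrapolates the next feature vector from the last two
    feature vectors in the window.
    """
    prev, last = window[-2], window[-1]
    return [2 * b - a for a, b in zip(prev, last)]


def rollout(window, predict_next, steps):
    """Predict `steps` future feature vectors, one time step at a time.

    After each prediction the window slides forward: the oldest entry is
    dropped and the new prediction is appended, so later steps are
    conditioned on earlier predictions.
    """
    window = [list(f) for f in window]  # copy so the caller's data is untouched
    futures = []
    for _ in range(steps):
        nxt = predict_next(window)
        futures.append(nxt)
        window = window[1:] + [nxt]
    return futures


# Toy features of three consecutive frames, moving at constant "velocity".
frames = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0]]
predicted = rollout(frames, extrapolate_next, steps=2)
```

In a trained system the stand-in predictor would be a network supervised with the features of the actual future frames; the sliding-window loop is the only part this sketch is meant to convey.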

Results

Task	Dataset	Metric	Value	Model
Depth Estimation	NYU-Depth V2	Delta < 1.25	0.981	FutureDepth
Depth Estimation	NYU-Depth V2	Delta < 1.25^2	0.996	FutureDepth
Depth Estimation	NYU-Depth V2	Delta < 1.25^3	0.999	FutureDepth
Depth Estimation	NYU-Depth V2	RMSE	0.233	FutureDepth
Depth Estimation	NYU-Depth V2	absolute relative error	0.063	FutureDepth
Depth Estimation	NYU-Depth V2	log 10	0.027	FutureDepth
Depth Estimation	KITTI Eigen split	Delta < 1.25	0.984	FutureDepth
Depth Estimation	KITTI Eigen split	Delta < 1.25^2	0.998	FutureDepth
Depth Estimation	KITTI Eigen split	Delta < 1.25^3	1	FutureDepth
Depth Estimation	KITTI Eigen split	RMSE	1.856	FutureDepth
Depth Estimation	KITTI Eigen split	RMSE log	0.066	FutureDepth
Depth Estimation	KITTI Eigen split	Sq Rel	0.117	FutureDepth
Depth Estimation	KITTI Eigen split	absolute relative error	0.041	FutureDepth
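
For reference, the metrics named in the table follow the definitions commonly used in depth-estimation benchmarks. The sketch below computes them over flat lists of predicted and ground-truth depths. It assumes the usual conventions (threshold accuracy on max(pred/gt, gt/pred), invalid ground-truth pixels skipped, predictions clamped to a small positive minimum); the function name `depth_metrics` and the `min_depth` default are illustrative, and the paper's exact evaluation protocol (depth caps, crops) is not reproduced here.

```python
import math


def depth_metrics(pred, gt, min_depth=1e-3):
    """Common depth-evaluation metrics over paired depth values.

    pred, gt: equal-length sequences of depths in the same unit.
    Pixels with ground truth <= min_depth are treated as invalid and skipped;
    predictions are clamped to min_depth so the logarithms are defined.
    """
    pairs = [(max(p, min_depth), g) for p, g in zip(pred, gt) if g > min_depth]
    if not pairs:
        raise ValueError("no valid ground-truth pixels")
    n = len(pairs)
    # Symmetric ratio used by the threshold-accuracy (delta) metrics.
    ratios = [max(p / g, g / p) for p, g in pairs]
    return {
        "delta1": sum(r < 1.25 for r in ratios) / n,       # Delta < 1.25
        "delta2": sum(r < 1.25 ** 2 for r in ratios) / n,  # Delta < 1.25^2
        "delta3": sum(r < 1.25 ** 3 for r in ratios) / n,  # Delta < 1.25^3
        "abs_rel": sum(abs(p - g) / g for p, g in pairs) / n,
        "sq_rel": sum((p - g) ** 2 / g for p, g in pairs) / n,
        "rmse": math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n),
        "rmse_log": math.sqrt(
            sum((math.log(p) - math.log(g)) ** 2 for p, g in pairs) / n
        ),
        "log10": sum(abs(math.log10(p) - math.log10(g)) for p, g in pairs) / n,
    }


# Two exact predictions and one that overestimates by a factor of two.
metrics = depth_metrics([1.0, 2.0, 4.0], [1.0, 2.0, 2.0])
```

Higher is better for the three Delta accuracies; lower is better for all error metrics.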
