RAFT-Stereo: Multilevel Recurrent Field Transforms for Stereo Matching

Lahav Lipson, Zachary Teed, Jia Deng

2021-09-15

Stereo Matching, Stereo Depth Estimation, Optical Flow Estimation, Stereo Disparity Estimation

Abstract

We introduce RAFT-Stereo, a new deep architecture for rectified stereo based on the optical flow network RAFT. We introduce multi-level convolutional GRUs, which more efficiently propagate information across the image. A modified version of RAFT-Stereo can perform accurate real-time inference. RAFT-Stereo ranks first on the Middlebury leaderboard, outperforming the next best method on 1px error by 29%, and outperforms all published work on the ETH3D two-view stereo benchmark. Code is available at https://github.com/princeton-vl/RAFT-Stereo.

Results

Task	Dataset	Metric	Value	Model
Depth Estimation	Spring	1px total	15.273	RAFT-Stereo
3D	Spring	1px total	15.273	RAFT-Stereo
Stereo Disparity Estimation	Middlebury 2014	D1 Error (2px)	4.74	RAFT-Stereo
Stereo Depth Estimation	Spring	1px total	15.273	RAFT-Stereo
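
The "1px" and "2px" metrics in the table above are threshold error rates. The following is a minimal sketch of how such a metric is commonly computed, assuming it is defined as the percentage of valid pixels whose absolute disparity error exceeds the threshold. The function name `bad_pixel_rate`, the flat-list input, and the sample values are illustrative assumptions, not the benchmarks' official evaluation code; each benchmark applies its own masking and resolution rules.

```python
def bad_pixel_rate(pred, gt, threshold=1.0, valid=None):
    """Percentage of valid pixels with |pred - gt| > threshold.

    pred, gt: flat sequences of disparities in pixels (hypothetical layout).
    valid:    optional sequence of booleans marking pixels with ground truth.
    """
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same length")
    if valid is None:
        valid = [True] * len(gt)
    if len(valid) != len(gt):
        raise ValueError("valid must have the same length as gt")

    # Absolute disparity error for every pixel that has ground truth.
    errors = [abs(p - g) for p, g, v in zip(pred, gt, valid) if v]
    if not errors:
        raise ValueError("no valid pixels to evaluate")

    # Count pixels whose error is strictly above the threshold.
    bad = sum(1 for e in errors if e > threshold)
    return 100.0 * bad / len(errors)


# Made-up disparities for four pixels (not benchmark data).
pred = [10.0, 12.5, 30.2, 7.0]
gt = [10.4, 14.0, 30.0, 9.5]

print(bad_pixel_rate(pred, gt, threshold=1.0))  # 50.0 (errors 1.5 and 2.5 exceed 1 px)
print(bad_pixel_rate(pred, gt, threshold=2.0))  # 25.0 (only the 2.5 px error exceeds 2 px)
```

Lower values are better under this definition: a pixel counts as wrong only when its disparity error exceeds the threshold, so a stricter threshold such as 1px gives a rate at least as high as 2px on the same prediction.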
