Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


NeW CRFs: Neural Window Fully-connected CRFs for Monocular Depth Estimation

Weihao Yuan, Xiaodong Gu, Zuozhuo Dai, Siyu Zhu, Ping Tan

2022-03-03 · CVPR 2022 · Depth Prediction · Depth Estimation · Monocular Depth Estimation
Paper · PDF · Code (official)

Abstract

Estimating accurate depth from a single image is challenging since the problem is inherently ambiguous and ill-posed. While recent works design increasingly complicated and powerful networks to directly regress the depth map, we take the path of CRF optimization. Because the computation is expensive, CRFs are usually applied between neighborhoods rather than over the whole graph. To leverage the potential of fully-connected CRFs, we split the input into windows and perform the FC-CRF optimization within each window, which reduces the computational complexity and makes FC-CRFs feasible. To better capture the relationships between nodes in the graph, we exploit the multi-head attention mechanism to compute a multi-head potential function, which is fed to the network to output an optimized depth map. We then build a bottom-up-top-down structure, in which this neural window FC-CRF module serves as the decoder and a vision transformer serves as the encoder. Experiments demonstrate that our method significantly improves performance across all metrics on both the KITTI and NYUv2 datasets compared to previous methods. Furthermore, the proposed method can be applied directly to panoramic images and outperforms all previous panorama methods on the MatterPort3D dataset. Project page: https://weihaosky.github.io/newcrfs.
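The windowed FC-CRF idea from the abstract can be sketched in plain NumPy: partition the feature map into non-overlapping windows, then run dense multi-head attention only inside each window, so pairwise interactions stay fully connected locally without the quadratic cost of attending over the whole H×W graph. This is an illustrative sketch, not the authors' actual module; the random projection weights, window size, and head count below are arbitrary placeholders (the real decoder learns these).

```python
import numpy as np

def window_partition(x, ws):
    """Split an (H, W, C) feature map into non-overlapping ws x ws windows.
    Returns (num_windows, ws*ws, C). Assumes H and W are divisible by ws."""
    H, W, C = x.shape
    x = x.reshape(H // ws, ws, W // ws, ws, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, ws * ws, C)

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def window_fc_attention(x, ws=4, num_heads=2, rng=None):
    """Dense multi-head attention within each window: every node attends to
    every other node in its window, so the pairwise term is fully connected
    locally but is never computed across the entire H*W graph."""
    rng = np.random.default_rng(0) if rng is None else rng
    H, W, C = x.shape
    d = C // num_heads
    # toy projection weights -- stand-ins for the learned parameters
    Wq, Wk, Wv = (rng.standard_normal((C, C)) / np.sqrt(C) for _ in range(3))
    win = window_partition(x, ws)                       # (nW, ws*ws, C)
    q = (win @ Wq).reshape(-1, ws * ws, num_heads, d).transpose(0, 2, 1, 3)
    k = (win @ Wk).reshape(-1, ws * ws, num_heads, d).transpose(0, 2, 1, 3)
    v = (win @ Wv).reshape(-1, ws * ws, num_heads, d).transpose(0, 2, 1, 3)
    attn = softmax(q @ k.transpose(0, 1, 3, 2) / np.sqrt(d))  # dense per window
    out = (attn @ v).transpose(0, 2, 1, 3).reshape(-1, ws * ws, C)
    return out  # per-window "multi-head potential", shape (nW, ws*ws, C)

feat = np.random.default_rng(1).standard_normal((8, 8, 16))
pot = window_fc_attention(feat, ws=4, num_heads=2)
print(pot.shape)  # (4, 16, 16): 4 windows of 4x4 nodes, 16 channels
```

The payoff is complexity: global fully-connected attention scales as O((HW)²), while the windowed form scales as O(HW · ws²), which is what makes FC-CRF-style dense pairwise terms tractable inside a decoder.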

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Depth Estimation | NYU-Depth V2 | Delta < 1.25 | 0.922 | NeWCRFs |
| Depth Estimation | NYU-Depth V2 | Delta < 1.25^2 | 0.992 | NeWCRFs |
| Depth Estimation | NYU-Depth V2 | Delta < 1.25^3 | 0.998 | NeWCRFs |
| Depth Estimation | NYU-Depth V2 | RMSE | 0.334 | NeWCRFs |
| Depth Estimation | NYU-Depth V2 | absolute relative error | 0.095 | NeWCRFs |
| Depth Estimation | NYU-Depth V2 | log 10 | 0.041 | NeWCRFs |
| Depth Estimation | Matterport3D | Delta < 1.25 | 0.9376 | NeWCRFs |
| Depth Estimation | Matterport3D | Delta < 1.25^2 | 0.9812 | NeWCRFs |
| Depth Estimation | Matterport3D | Delta < 1.25^3 | 0.9933 | NeWCRFs |
| Depth Estimation | Matterport3D | RMSE | 0.4279 | NeWCRFs |
| Depth Estimation | Matterport3D | absolute error | 0.197 | NeWCRFs |
| Depth Estimation | Matterport3D | absolute relative error | 0.0793 | NeWCRFs |
| Depth Estimation | KITTI Eigen split | Delta < 1.25 | 0.974 | NeWCRFs |
| Depth Estimation | KITTI Eigen split | Delta < 1.25^2 | 0.997 | NeWCRFs |
| Depth Estimation | KITTI Eigen split | Delta < 1.25^3 | 0.999 | NeWCRFs |
| Depth Estimation | KITTI Eigen split | RMSE | 2.129 | NeWCRFs |
| Depth Estimation | KITTI Eigen split | RMSE log | 0.079 | NeWCRFs |
| Depth Estimation | KITTI Eigen split | Sq Rel | 0.155 | NeWCRFs |
| Depth Estimation | KITTI Eigen split | absolute relative error | 0.052 | NeWCRFs |
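The metrics in the table follow the standard monocular-depth evaluation protocol (Eigen et al.): threshold accuracies Delta < 1.25^k count the fraction of pixels whose prediction-to-ground-truth ratio falls under the threshold, while the error metrics penalize deviation directly. A minimal sketch of how these are computed from flattened valid-pixel arrays (not the authors' evaluation code):

```python
import numpy as np

def depth_metrics(pred, gt):
    """Standard monocular-depth metrics over valid pixels: threshold
    accuracies delta < 1.25^k, absolute/squared relative error, RMSE,
    RMSE(log), and log10 error."""
    pred, gt = np.asarray(pred, float), np.asarray(gt, float)
    ratio = np.maximum(pred / gt, gt / pred)   # symmetric ratio per pixel
    return {
        "d1": float((ratio < 1.25).mean()),
        "d2": float((ratio < 1.25 ** 2).mean()),
        "d3": float((ratio < 1.25 ** 3).mean()),
        "abs_rel": float(np.mean(np.abs(pred - gt) / gt)),
        "sq_rel": float(np.mean((pred - gt) ** 2 / gt)),
        "rmse": float(np.sqrt(np.mean((pred - gt) ** 2))),
        "rmse_log": float(np.sqrt(np.mean((np.log(pred) - np.log(gt)) ** 2))),
        "log10": float(np.mean(np.abs(np.log10(pred) - np.log10(gt)))),
    }

gt = np.array([1.0, 2.0, 4.0])     # toy ground-truth depths (meters)
pred = np.array([1.1, 2.0, 5.0])   # toy predictions
m = depth_metrics(pred, gt)
print(m["d1"])  # 2/3: the 5.0-vs-4.0 pixel has ratio exactly 1.25, missing d1
```

Higher is better for the delta accuracies (maximum 1.0); lower is better for all the error metrics.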

Related Papers

$S^2M^2$: Scalable Stereo Matching Model for Reliable Depth Estimation (2025-07-17)
$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning (2025-07-17)
Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation (2025-07-16)
Vision-based Perception for Autonomous Vehicles in Obstacle Avoidance Scenarios (2025-07-16)
MonoMVSNet: Monocular Priors Guided Multi-View Stereo Network (2025-07-15)
Towards Depth Foundation Model: Recent Trends in Vision-Based Depth Estimation (2025-07-15)
Cameras as Relative Positional Encoding (2025-07-14)
ByDeWay: Boost Your multimodal LLM with DEpth prompting in a Training-Free Way (2025-07-11)