Rethinking Depth Estimation for Multi-View Stereo: A Unified Representation

Rui Peng, Rongjie Wang, Zhenyu Wang, Yawen Lai, Ronggang Wang

2022-01-05, CVPR 2022
Tasks: regression, Depth Prediction, 3D Reconstruction, Depth Estimation, Classification

Abstract

Existing learning-based multi-view stereo methods solve depth estimation as either a regression or a classification problem. Although both representations have recently demonstrated excellent performance, they still have apparent shortcomings: regression methods tend to overfit because the cost volume is learned only indirectly, and classification methods cannot directly infer the exact depth because their predictions are discrete. In this paper, we propose a novel representation, termed Unification, to unify the advantages of regression and classification. It directly constrains the cost volume, as classification methods do, while also achieving sub-pixel depth prediction, as regression methods do. To exploit the potential of Unification, we design a new loss function named Unified Focal Loss, which addresses the challenge of sample imbalance in a more uniform and reasonable way. Combining these two lightweight modules, we present a coarse-to-fine framework that we call UniMVSNet. Ranking first on both the DTU and Tanks and Temples benchmarks verifies that our model not only performs best but also generalizes best.
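The contrast drawn in the abstract can be made concrete with a small sketch. Given per-pixel scores over a set of depth hypotheses, a regression-style decoder takes the probability-weighted expectation (continuous, but the cost volume is supervised only through that expectation), while a classification-style decoder picks the most probable hypothesis (directly supervised, but limited to the discrete grid). The last two functions show one possible way to combine the two: mark the hypothesis just below the true depth and store how close the true depth is to it. That encoding is an assumption made for illustration; the abstract does not specify it, and every function name here is hypothetical rather than taken from the official code.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of per-hypothesis scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def decode_regression(probs, hypotheses):
    # Regression style: expected depth under the probability distribution.
    return sum(p * d for p, d in zip(probs, hypotheses))

def decode_classification(probs, hypotheses):
    # Classification style: the single most probable hypothesis.
    best = max(range(len(probs)), key=probs.__getitem__)
    return hypotheses[best]

def encode_unified(gt_depth, hypotheses):
    # ASSUMED encoding (illustrative only): one non-zero entry at the
    # hypothesis just below gt_depth, holding 1 - offset / interval.
    # Hypotheses are assumed to be uniformly spaced and sorted.
    interval = hypotheses[1] - hypotheses[0]
    idx = int((gt_depth - hypotheses[0]) // interval)
    idx = min(max(idx, 0), len(hypotheses) - 1)
    target = [0.0] * len(hypotheses)
    target[idx] = 1.0 - (gt_depth - hypotheses[idx]) / interval
    return target

def decode_unified(values, hypotheses):
    # Inverse of the assumed encoding: pick the strongest entry, then
    # recover the sub-interval offset from its value.
    interval = hypotheses[1] - hypotheses[0]
    best = max(range(len(values)), key=values.__getitem__)
    return hypotheses[best] + interval * (1.0 - values[best])

hypotheses = [1.0, 2.0, 3.0, 4.0]
probs = softmax([0.0, 2.0, 1.0, 0.0])
print(decode_regression(probs, hypotheses))      # falls between hypotheses
print(decode_classification(probs, hypotheses))  # one of the hypotheses
print(decode_unified(encode_unified(2.3, hypotheses), hypotheses))
```

In this toy setup the classification decoder can only return a value from the hypothesis grid, whereas the regression and unified decoders can return depths between grid points.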

Results

Task | Dataset | Metric | Value | Model
3D Reconstruction | DTU | Acc | 0.352 | UniMVSNet
3D Reconstruction | DTU | Comp | 0.278 | UniMVSNet
3D Reconstruction | DTU | Overall | 0.315 | UniMVSNet
3D | DTU | Acc | 0.352 | UniMVSNet
3D | DTU | Comp | 0.278 | UniMVSNet
3D | DTU | Overall | 0.315 | UniMVSNet
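As a quick consistency check on the DTU rows: the Overall score on this benchmark is conventionally reported as the mean of accuracy (Acc) and completeness (Comp). The snippet below is a minimal sketch that verifies the listed values agree with that convention; the variable names are illustrative.

```python
# Values copied from the DTU results listed above.
acc, comp, overall = 0.352, 0.278, 0.315

# Overall is taken here as the arithmetic mean of Acc and Comp.
mean_of_acc_and_comp = (acc + comp) / 2

# Allow for rounding to three decimal places in the reported numbers.
consistent = abs(mean_of_acc_and_comp - overall) < 5e-4
print(consistent)
```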

Related Papers

Language Integration in Fine-Tuning Multimodal Large Language Models for Image-Based Regression (2025-07-20)
AutoPartGen: Autogressive 3D Part Generation and Discovery (2025-07-17)
$S^2M^2$: Scalable Stereo Matching Model for Reliable Depth Estimation (2025-07-17)
$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning (2025-07-17)
Adversarial attacks to image classification systems using evolutionary algorithms (2025-07-17)
Neural Network-Guided Symbolic Regression for Interpretable Descriptor Discovery in Perovskite Catalysts (2025-07-16)
Imbalanced Regression Pipeline Recommendation (2025-07-16)
Second-Order Bounds for [0,1]-Valued Regression via Betting Loss (2025-07-16)