Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data

Lihe Yang, Bingyi Kang, Zilong Huang, Xiaogang Xu, Jiashi Feng, Hengshuang Zhao

2024-01-19 | CVPR 2024 | Tasks: Data Augmentation, Semantic Segmentation, Depth Estimation, Monocular Depth Estimation

Abstract

This work presents Depth Anything, a highly practical solution for robust monocular depth estimation. Without pursuing novel technical modules, we aim to build a simple yet powerful foundation model dealing with any images under any circumstances. To this end, we scale up the dataset by designing a data engine to collect and automatically annotate large-scale unlabeled data (~62M), which significantly enlarges the data coverage and thus is able to reduce the generalization error. We investigate two simple yet effective strategies that make data scaling-up promising. First, a more challenging optimization target is created by leveraging data augmentation tools. It compels the model to actively seek extra visual knowledge and acquire robust representations. Second, an auxiliary supervision is developed to enforce the model to inherit rich semantic priors from pre-trained encoders. We evaluate its zero-shot capabilities extensively, including six public datasets and randomly captured photos. It demonstrates impressive generalization ability. Further, through fine-tuning it with metric depth information from NYUv2 and KITTI, new SOTAs are set. Our better depth model also results in a better depth-conditioned ControlNet. Our models are released at https://github.com/LiheYoung/Depth-Anything.
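
The first strategy in the abstract (a harder optimization target built with data augmentation) can be illustrated with a small sketch. The abstract does not name the augmentations or the loss, so everything specific here is an assumption made for illustration: `color_jitter`, `pseudo_label_loss`, the `teacher` and `student` callables, the intensity jitter, and the L1 loss are all hypothetical stand-ins, not the authors' implementation. The idea shown is only that a teacher labels the clean unlabeled image while the student must match that label from a perturbed copy.

```python
import random


def color_jitter(image, strength, rng):
    """Randomly scale and shift pixel intensities (assumed to lie in [0, 1])."""
    gain = 1.0 + rng.uniform(-strength, strength)
    bias = rng.uniform(-strength, strength)
    return [[min(1.0, max(0.0, gain * v + bias)) for v in row] for row in image]


def pseudo_label_loss(teacher, student, image, strength=0.4, seed=0):
    """Mean absolute error between the student's prediction on a perturbed
    image and the teacher's pseudo label computed on the clean image.

    `teacher` and `student` are hypothetical callables mapping a 2D list of
    intensities to a 2D list of depth values of the same shape.
    """
    rng = random.Random(seed)
    target = teacher(image)  # pseudo depth label from the clean image
    prediction = student(color_jitter(image, strength, rng))
    diffs = [
        abs(p - t)
        for pred_row, target_row in zip(prediction, target)
        for p, t in zip(pred_row, target_row)
    ]
    return sum(diffs) / len(diffs)
```

A student whose output is unaffected by the perturbation incurs zero loss against the teacher, which is the kind of robustness the perturbed target is meant to encourage.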

Results

Task | Dataset | Metric | Value | Model
--- | --- | --- | --- | ---
Depth Estimation | NYU-Depth V2 | Delta < 1.25 | 0.984 | Depth Anything
Depth Estimation | NYU-Depth V2 | Delta < 1.25^2 | 0.998 | Depth Anything
Depth Estimation | NYU-Depth V2 | Delta < 1.25^3 | 1 | Depth Anything
Depth Estimation | NYU-Depth V2 | RMSE | 0.206 | Depth Anything
Depth Estimation | NYU-Depth V2 | absolute relative error | 0.056 | Depth Anything
Depth Estimation | NYU-Depth V2 | log 10 | 0.024 | Depth Anything
Depth Estimation | ETH3D | Delta < 1.25 | 0.882 | Depth Anything
Depth Estimation | ETH3D | absolute relative error | 0.0127 | Depth Anything
Depth Estimation | KITTI Eigen split | Delta < 1.25 | 0.982 | Depth Anything
Depth Estimation | KITTI Eigen split | Delta < 1.25^2 | 0.998 | Depth Anything
Depth Estimation | KITTI Eigen split | Delta < 1.25^3 | 1 | Depth Anything
Depth Estimation | KITTI Eigen split | RMSE | 1.896 | Depth Anything
Depth Estimation | KITTI Eigen split | RMSE log | 0.069 | Depth Anything
Depth Estimation | KITTI Eigen split | Sq Rel | 0.121 | Depth Anything
Depth Estimation | KITTI Eigen split | absolute relative error | 0.046 | Depth Anything
Semantic Segmentation | Cityscapes val | mIoU | 86.2 | Depth Anything
3D | NYU-Depth V2 | Delta < 1.25 | 0.984 | Depth Anything
3D | NYU-Depth V2 | Delta < 1.25^2 | 0.998 | Depth Anything
3D | NYU-Depth V2 | Delta < 1.25^3 | 1 | Depth Anything
3D | NYU-Depth V2 | RMSE | 0.206 | Depth Anything
3D | NYU-Depth V2 | absolute relative error | 0.056 | Depth Anything
3D | NYU-Depth V2 | log 10 | 0.024 | Depth Anything
3D | ETH3D | Delta < 1.25 | 0.882 | Depth Anything
3D | ETH3D | absolute relative error | 0.0127 | Depth Anything
3D | KITTI Eigen split | Delta < 1.25 | 0.982 | Depth Anything
3D | KITTI Eigen split | Delta < 1.25^2 | 0.998 | Depth Anything
3D | KITTI Eigen split | Delta < 1.25^3 | 1 | Depth Anything
3D | KITTI Eigen split | RMSE | 1.896 | Depth Anything
3D | KITTI Eigen split | RMSE log | 0.069 | Depth Anything
3D | KITTI Eigen split | Sq Rel | 0.121 | Depth Anything
3D | KITTI Eigen split | absolute relative error | 0.046 | Depth Anything
10-shot image generation | Cityscapes val | mIoU | 86.2 | Depth Anything
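
For readers unfamiliar with the metric names in the results, the following is a minimal sketch of how these depth metrics are conventionally defined in the monocular depth literature. It is not taken from the paper's evaluation code: the function name `depth_metrics`, the dictionary keys, and the choice to skip non-positive values are assumptions made for illustration. Inputs are flat sequences of predicted and ground-truth depths in the same units.

```python
import math


def depth_metrics(pred, gt):
    """Standard depth-estimation metrics over paired positive depth values."""
    # Keep only pixels where both values are valid (positive) depths.
    pairs = [(p, g) for p, g in zip(pred, gt) if p > 0 and g > 0]
    if not pairs:
        raise ValueError("no valid depth pairs")
    n = len(pairs)

    # Threshold accuracy: share of pixels with max(p/g, g/p) below 1.25^k.
    ratios = [max(p / g, g / p) for p, g in pairs]

    def delta(k):
        return sum(r < 1.25 ** k for r in ratios) / n

    return {
        "delta1": delta(1),  # Delta < 1.25
        "delta2": delta(2),  # Delta < 1.25^2
        "delta3": delta(3),  # Delta < 1.25^3
        "abs_rel": sum(abs(p - g) / g for p, g in pairs) / n,
        "sq_rel": sum((p - g) ** 2 / g for p, g in pairs) / n,
        "rmse": math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n),
        "rmse_log": math.sqrt(
            sum((math.log(p) - math.log(g)) ** 2 for p, g in pairs) / n
        ),
        "log10": sum(abs(math.log10(p) - math.log10(g)) for p, g in pairs) / n,
    }
```

Higher is better for the three Delta thresholds; lower is better for the error metrics. RMSE carries the units of the depth values, which is why the NYU-Depth V2 (indoor) and KITTI (outdoor) figures are not directly comparable.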
