Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

DPT-Hybrid

Reported on 32 benchmarks across 4 tasks · 1 paper · 6 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.

Computer Vision · 12 results

  • Depth Estimation on NYU-Depth V2
    Delta < 1.25^2 · uses extra data · 2021-03-24
    0.988
    best: 1 (HybridDepth)
    SOTA
    Vision Transformers for Dense Prediction · arXiv:2103.13413
  • Depth Estimation on NYU-Depth V2
    Delta < 1.25 · uses extra data · 2021-03-24
    0.904
    best: 0.989 (UniK3D (FT, metric))
    Vision Transformers for Dense Prediction · arXiv:2103.13413
  • Depth Estimation on NYU-Depth V2
    Delta < 1.25^3 · uses extra data · 2021-03-24
    0.994
    best: 1 (HybridDepth)
    Vision Transformers for Dense Prediction · arXiv:2103.13413
  • Depth Estimation on NYU-Depth V2
    RMSE · uses extra data · 2021-03-24
    0.357
    best: 0.013 (Defocus/DepthNet (Normalized))
    Vision Transformers for Dense Prediction · arXiv:2103.13413
  • Depth Estimation on NYU-Depth V2
    absolute relative error · uses extra data · 2021-03-24
    0.11
    best: 0.026 (HybridDepth)
    Vision Transformers for Dense Prediction · arXiv:2103.13413
  • Depth Estimation on NYU-Depth V2
    log 10 · uses extra data · 2021-03-24
    0.045
    best: 0.059 (SC-DepthV2)
    Vision Transformers for Dense Prediction · arXiv:2103.13413
  • Depth Estimation on KITTI Eigen split
    Delta < 1.25 · 2021-03-24
    0.959
    best: 0.99 (SPIDepth)
    Vision Transformers for Dense Prediction · arXiv:2103.13413
  • Depth Estimation on KITTI Eigen split
    Delta < 1.25^2 · 2021-03-24
    0.995
    best: 0.999 (SPIDepth)
    Vision Transformers for Dense Prediction · arXiv:2103.13413
  • Depth Estimation on KITTI Eigen split
    Delta < 1.25^3 · 2021-03-24
    0.999
    best: 1 (SPIDepth)
    Vision Transformers for Dense Prediction · arXiv:2103.13413
  • Depth Estimation on KITTI Eigen split
    RMSE · 2021-03-24
    2.573
    best: 1.394 (SPIDepth)
    Vision Transformers for Dense Prediction · arXiv:2103.13413
  • Depth Estimation on KITTI Eigen split
    RMSE log · 2021-03-24
    0.092
    best: 0.048 (SPIDepth)
    Vision Transformers for Dense Prediction · arXiv:2103.13413
  • Depth Estimation on KITTI Eigen split
    absolute relative error · 2021-03-24
    0.062
    best: 0.029 (SPIDepth)
    Vision Transformers for Dense Prediction · arXiv:2103.13413

Methodology · 12 results

  • 3D on NYU-Depth V2
    Delta < 1.25^2 · uses extra data · 2021-03-24
    0.988
    best: 1 (HybridDepth)
    SOTA
    Vision Transformers for Dense Prediction · arXiv:2103.13413
  • 3D on NYU-Depth V2
    Delta < 1.25 · uses extra data · 2021-03-24
    0.904
    best: 0.989 (UniK3D (FT, metric))
    Vision Transformers for Dense Prediction · arXiv:2103.13413
  • 3D on NYU-Depth V2
    Delta < 1.25^3 · uses extra data · 2021-03-24
    0.994
    best: 1 (HybridDepth)
    Vision Transformers for Dense Prediction · arXiv:2103.13413
  • 3D on NYU-Depth V2
    RMSE · uses extra data · 2021-03-24
    0.357
    best: 0.013 (Defocus/DepthNet (Normalized))
    Vision Transformers for Dense Prediction · arXiv:2103.13413
  • 3D on NYU-Depth V2
    absolute relative error · uses extra data · 2021-03-24
    0.11
    best: 0.026 (HybridDepth)
    Vision Transformers for Dense Prediction · arXiv:2103.13413
  • 3D on NYU-Depth V2
    log 10 · uses extra data · 2021-03-24
    0.045
    best: 0.059 (SC-DepthV2)
    Vision Transformers for Dense Prediction · arXiv:2103.13413
  • 3D on KITTI Eigen split
    Delta < 1.25 · 2021-03-24
    0.959
    best: 0.99 (SPIDepth)
    Vision Transformers for Dense Prediction · arXiv:2103.13413
  • 3D on KITTI Eigen split
    Delta < 1.25^2 · 2021-03-24
    0.995
    best: 0.999 (SPIDepth)
    Vision Transformers for Dense Prediction · arXiv:2103.13413
  • 3D on KITTI Eigen split
    Delta < 1.25^3 · 2021-03-24
    0.999
    best: 1 (SPIDepth)
    Vision Transformers for Dense Prediction · arXiv:2103.13413
  • 3D on KITTI Eigen split
    RMSE · 2021-03-24
    2.573
    best: 1.394 (SPIDepth)
    Vision Transformers for Dense Prediction · arXiv:2103.13413
  • 3D on KITTI Eigen split
    RMSE log · 2021-03-24
    0.092
    best: 0.048 (SPIDepth)
    Vision Transformers for Dense Prediction · arXiv:2103.13413
  • 3D on KITTI Eigen split
    absolute relative error · 2021-03-24
    0.062
    best: 0.029 (SPIDepth)
    Vision Transformers for Dense Prediction · arXiv:2103.13413

Medical · 4 results

  • Semantic Segmentation on ADE20K val
    Pixel Accuracy · 2021-03-24
    83.11
    best: 83.43 (gSwin-S)
    SOTA
    Vision Transformers for Dense Prediction · arXiv:2103.13413
  • Semantic Segmentation on ADE20K val
    mIoU · 2021-03-24
    49.02
    best: 62.8 (BEiT-3)
    SOTA
    Vision Transformers for Dense Prediction · arXiv:2103.13413
  • Semantic Segmentation on PASCAL Context
    mIoU · 2021-03-24
    60.46
    best: 71.1 (VPNeXt)
    Vision Transformers for Dense Prediction · arXiv:2103.13413
  • Semantic Segmentation on ADE20K
    Validation mIoU · 2021-03-24
    49.02
    best: 63.6 (ViT-P (InternImage-H))
    Vision Transformers for Dense Prediction · arXiv:2103.13413

Audio · 4 results

  • 10-shot image generation on ADE20K val
    Pixel Accuracy · 2021-03-24
    83.11
    best: 83.43 (gSwin-S)
    SOTA
    Vision Transformers for Dense Prediction · arXiv:2103.13413
  • 10-shot image generation on ADE20K val
    mIoU · 2021-03-24
    49.02
    best: 62.8 (BEiT-3)
    SOTA
    Vision Transformers for Dense Prediction · arXiv:2103.13413
  • 10-shot image generation on PASCAL Context
    mIoU · 2021-03-24
    60.46
    best: 71.1 (VPNeXt)
    Vision Transformers for Dense Prediction · arXiv:2103.13413
  • 10-shot image generation on ADE20K
    Validation mIoU · 2021-03-24
    49.02
    best: 63.6 (ViT-P (InternImage-H))
    Vision Transformers for Dense Prediction · arXiv:2103.13413