Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

VPD

Reported on 14 benchmarks across 4 tasks · 1 paper · 4 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.

Computer Vision · 8 results

  • Depth Estimation on NYU-Depth V2
    Delta < 1.25 · 2023-03-03
    0.964
    best: 0.989 (UniK3D (FT, metric))
    SOTA
    Unleashing Text-to-Image Diffusion Models for Visual Perception · arXiv:2303.02153
  • Depth Estimation on NYU-Depth V2
    absolute relative error · 2023-03-03
    0.069
    best: 0.026 (HybridDepth)
    SOTA
    Unleashing Text-to-Image Diffusion Models for Visual Perception · arXiv:2303.02153
  • Depth Estimation on NYU-Depth V2
    Delta < 1.25^2 · 2023-03-03
    0.995
    best: 1 (HybridDepth)
    Unleashing Text-to-Image Diffusion Models for Visual Perception · arXiv:2303.02153
  • Depth Estimation on NYU-Depth V2
    Delta < 1.25^3 · 2023-03-03
    0.999
    best: 1 (HybridDepth)
    Unleashing Text-to-Image Diffusion Models for Visual Perception · arXiv:2303.02153
  • Depth Estimation on NYU-Depth V2
    RMSE · 2023-03-03
    0.254
    best: 0.013 (Defocus/DepthNet (Normalized))
    Unleashing Text-to-Image Diffusion Models for Visual Perception · arXiv:2303.02153
  • Depth Estimation on NYU-Depth V2
    log 10 · 2023-03-03
    0.03
    best: 0.059 (SC-DepthV2)
    Unleashing Text-to-Image Diffusion Models for Visual Perception · arXiv:2303.02153
  • Instance Segmentation on RefCoCo val
    Overall IoU · 2023-03-03
    73.25
    best: 85.41 (DeRIS-L)
    Unleashing Text-to-Image Diffusion Models for Visual Perception · arXiv:2303.02153
  • Referring Expression Segmentation on RefCoCo val
    Overall IoU · 2023-03-03
    73.25
    best: 85.41 (DeRIS-L)
    Unleashing Text-to-Image Diffusion Models for Visual Perception · arXiv:2303.02153
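
For readers unfamiliar with the Overall IoU figures above, the following is a minimal sketch of how the metric is commonly computed for referring segmentation: intersections and unions are summed over the whole split before dividing, rather than averaging per-sample IoU. The function name `overall_iou` and the nested-list mask format are illustrative assumptions; this is not the evaluation code behind the listed results.

```python
def overall_iou(pred_masks, gt_masks):
    """Cumulative intersection over cumulative union, as a percentage.

    Each mask is a 2D list of 0/1 values (hypothetical format chosen
    for illustration). Masks in a pair are assumed to share a shape.
    """
    total_inter = 0
    total_union = 0
    for pred, gt in zip(pred_masks, gt_masks):
        for pred_row, gt_row in zip(pred, gt):
            for p, g in zip(pred_row, gt_row):
                total_inter += int(bool(p) and bool(g))
                total_union += int(bool(p) or bool(g))
    # Guard against an empty union (no foreground anywhere).
    if total_union == 0:
        return 0.0
    return 100.0 * total_inter / total_union


if __name__ == "__main__":
    preds = [[[1, 1], [0, 0]], [[1, 0], [0, 0]]]
    gts = [[[1, 0], [0, 0]], [[1, 1], [1, 1]]]
    print(round(overall_iou(preds, gts), 2))
```

Because large objects contribute more pixels to both sums, this cumulative form weights them more heavily than a per-sample mean IoU would.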

Methodology · 6 results

  • 3D on NYU-Depth V2
    Delta < 1.25 · 2023-03-03
    0.964
    best: 0.989 (UniK3D (FT, metric))
    SOTA
    Unleashing Text-to-Image Diffusion Models for Visual Perception · arXiv:2303.02153
  • 3D on NYU-Depth V2
    absolute relative error · 2023-03-03
    0.069
    best: 0.026 (HybridDepth)
    SOTA
    Unleashing Text-to-Image Diffusion Models for Visual Perception · arXiv:2303.02153
  • 3D on NYU-Depth V2
    Delta < 1.25^2 · 2023-03-03
    0.995
    best: 1 (HybridDepth)
    Unleashing Text-to-Image Diffusion Models for Visual Perception · arXiv:2303.02153
  • 3D on NYU-Depth V2
    Delta < 1.25^3 · 2023-03-03
    0.999
    best: 1 (HybridDepth)
    Unleashing Text-to-Image Diffusion Models for Visual Perception · arXiv:2303.02153
  • 3D on NYU-Depth V2
    RMSE · 2023-03-03
    0.254
    best: 0.013 (Defocus/DepthNet (Normalized))
    Unleashing Text-to-Image Diffusion Models for Visual Perception · arXiv:2303.02153
  • 3D on NYU-Depth V2
    log 10 · 2023-03-03
    0.03
    best: 0.059 (SC-DepthV2)
    Unleashing Text-to-Image Diffusion Models for Visual Perception · arXiv:2303.02153
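
The depth metrics listed on this page (Delta < 1.25^k, absolute relative error, RMSE, log 10) follow definitions that are standard in monocular depth estimation. The sketch below shows those usual definitions on flat lists of paired, strictly positive depth values. The function name `depth_metrics` and the dictionary keys are illustrative assumptions, and the protocol actually used for the reported numbers (valid-pixel masking, cropping, depth caps) is not described here and may differ.

```python
import math


def depth_metrics(pred, gt):
    """Standard depth-estimation metrics over paired positive depths.

    pred, gt: equal-length sequences of predicted and ground-truth
    depths (hypothetical flat format; real pipelines usually mask
    invalid pixels first).
    """
    if len(pred) != len(gt) or len(pred) == 0:
        raise ValueError("pred and gt must be non-empty and equal length")
    if any(p <= 0 or g <= 0 for p, g in zip(pred, gt)):
        raise ValueError("depths must be strictly positive")

    n = len(pred)
    pairs = list(zip(pred, gt))
    # Threshold accuracy uses the larger of the two depth ratios.
    ratios = [max(p / g, g / p) for p, g in pairs]

    return {
        # Higher is better: fraction of pixels within the ratio threshold.
        "delta1": sum(r < 1.25 for r in ratios) / n,
        "delta2": sum(r < 1.25 ** 2 for r in ratios) / n,
        "delta3": sum(r < 1.25 ** 3 for r in ratios) / n,
        # Lower is better for the three error metrics below.
        "abs_rel": sum(abs(p - g) / g for p, g in pairs) / n,
        "rmse": math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n),
        "log10": sum(abs(math.log10(p) - math.log10(g)) for p, g in pairs) / n,
    }


if __name__ == "__main__":
    for name, value in depth_metrics([1.0, 2.0, 4.0], [1.0, 2.0, 2.0]).items():
        print(f"{name}: {value:.4f}")
```

Note the direction of each metric when reading the entries above: the Delta thresholds are accuracies (higher is better), while absolute relative error, RMSE, and log 10 are errors (lower is better).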