Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

VPD

Reported on 14 benchmarks across 4 tasks · 1 paper · 4 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.

Computer Vision (8 results)

All results in this section are from "Unleashing Text-to-Image Diffusion Models for Visual Perception" (arXiv:2303.02153), dated 2023-03-03.

| Task | Dataset | Metric | VPD | Best (model) | SOTA |
|---|---|---|---|---|---|
| Depth Estimation | NYU-Depth V2 | Delta < 1.25 | 0.964 | 0.989 (UniK3D (FT, metric)) | SOTA |
| Depth Estimation | NYU-Depth V2 | absolute relative error | 0.069 | 0.026 (HybridDepth) | SOTA |
| Depth Estimation | NYU-Depth V2 | Delta < 1.25^2 | 0.995 | 1 (HybridDepth) | |
| Depth Estimation | NYU-Depth V2 | Delta < 1.25^3 | 0.999 | 1 (HybridDepth) | |
| Depth Estimation | NYU-Depth V2 | RMSE | 0.254 | 0.013 (Defocus/DepthNet (Normalized)) | |
| Depth Estimation | NYU-Depth V2 | log 10 | 0.03 | 0.059 (SC-DepthV2) | |
| Instance Segmentation | RefCoCo val | Overall IoU | 73.25 | 85.41 (DeRIS-L) | |
| Referring Expression Segmentation | RefCoCo val | Overall IoU | 73.25 | 85.41 (DeRIS-L) | |
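
The page does not say how its Overall IoU figures for RefCoCo val were computed. In referring segmentation work, the metric is conventionally the total intersection across all samples divided by the total union, rather than the mean of per-sample IoUs. The sketch below assumes that convention and represents each mask as a flat sequence of 0/1 values; the function name `overall_iou` is hypothetical and not taken from the paper or the site.

```python
def overall_iou(pred_masks, gt_masks):
    """Cumulative intersection over cumulative union, as a percentage.

    Each mask is a flat sequence of 0/1 values; masks are paired by position.
    """
    total_inter = 0
    total_union = 0
    for pred, gt in zip(pred_masks, gt_masks):
        if len(pred) != len(gt):
            raise ValueError("prediction and ground-truth masks differ in size")
        for p, g in zip(pred, gt):
            total_inter += 1 if (p and g) else 0
            total_union += 1 if (p or g) else 0
    # With an empty union there is nothing to score; return 0.0 by convention here.
    return 100.0 * total_inter / total_union if total_union else 0.0


# Two tiny 4-pixel masks: intersections 1 + 1, unions 2 + 2.
preds = [[1, 1, 0, 0], [1, 0, 0, 0]]
gts = [[1, 0, 0, 0], [1, 1, 0, 0]]
print(overall_iou(preds, gts))  # 50.0
```

Because intersections and unions are summed before dividing, larger objects weigh more heavily in this score than they would in a per-sample average.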

Methodology (6 results)

All results in this section are from "Unleashing Text-to-Image Diffusion Models for Visual Perception" (arXiv:2303.02153), dated 2023-03-03.

| Task | Dataset | Metric | VPD | Best (model) | SOTA |
|---|---|---|---|---|---|
| 3D | NYU-Depth V2 | Delta < 1.25 | 0.964 | 0.989 (UniK3D (FT, metric)) | SOTA |
| 3D | NYU-Depth V2 | absolute relative error | 0.069 | 0.026 (HybridDepth) | SOTA |
| 3D | NYU-Depth V2 | Delta < 1.25^2 | 0.995 | 1 (HybridDepth) | |
| 3D | NYU-Depth V2 | Delta < 1.25^3 | 0.999 | 1 (HybridDepth) | |
| 3D | NYU-Depth V2 | RMSE | 0.254 | 0.013 (Defocus/DepthNet (Normalized)) | |
| 3D | NYU-Depth V2 | log 10 | 0.03 | 0.059 (SC-DepthV2) | |
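
The depth metrics listed for NYU-Depth V2 (Delta < 1.25 and its powers, absolute relative error, RMSE, log 10) have conventional definitions in the monocular depth literature. The page does not state its exact evaluation protocol (depth cap, crop, or valid-pixel mask), so the following is only a minimal sketch of those usual definitions over paired depth values. The function name `depth_metrics` and the choice to skip non-positive depths are assumptions made for illustration.

```python
import math


def depth_metrics(pred, gt):
    """Common depth metrics over paired predicted / ground-truth depths.

    Pairs with a non-positive value are skipped (assumed to be invalid pixels).
    """
    pairs = [(p, g) for p, g in zip(pred, gt) if p > 0 and g > 0]
    n = len(pairs)
    if n == 0:
        raise ValueError("no valid depth pairs")
    # Threshold accuracy uses the larger of the two ratios per pixel.
    ratios = [max(p / g, g / p) for p, g in pairs]
    return {
        "delta1": sum(r < 1.25 for r in ratios) / n,       # Delta < 1.25
        "delta2": sum(r < 1.25 ** 2 for r in ratios) / n,  # Delta < 1.25^2
        "delta3": sum(r < 1.25 ** 3 for r in ratios) / n,  # Delta < 1.25^3
        "abs_rel": sum(abs(p - g) / g for p, g in pairs) / n,
        "rmse": math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n),
        "log10": sum(abs(math.log10(p) - math.log10(g)) for p, g in pairs) / n,
    }


# Toy example: two exact predictions and one that is off by a factor of 2.
print(depth_metrics([1.0, 2.0, 4.0], [1.0, 2.0, 2.0]))
```

Under these definitions the Delta metrics are accuracies (higher is better), while absolute relative error, RMSE, and log 10 are errors (lower is better).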