
Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


DFN-5B H/14-378 + PrefixedIter Decoder (FT2)

Reported on 8 benchmarks across 1 task · 1 paper · 3 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.

Computer Vision: 8 results

Zero-Shot Image Classification on OVIC Datasets (Wiki-L)
Prediction Score (mean of 3) · 2024-07-15
74.88
SOTA
Unconstrained Open Vocabulary Image Classification: Zero-Shot Transfer from Text to Image via CLIP Inversion arXiv:2407.11211
Zero-Shot Image Classification on OVIC Datasets (Wiki-H)
Overall Score · 2024-07-15
79.02
SOTA
Unconstrained Open Vocabulary Image Classification: Zero-Shot Transfer from Text to Image via CLIP Inversion arXiv:2407.11211
Zero-Shot Image Classification on OVIC Datasets (Wiki-H)
Prediction Score · 2024-07-15
80.13
SOTA
Unconstrained Open Vocabulary Image Classification: Zero-Shot Transfer from Text to Image via CLIP Inversion arXiv:2407.11211
Zero-Shot Image Classification on OVIC Datasets (World-H)
Overall Score · 2024-07-15
87.13
best: 87.9 (DFN-5B H/14-378 + PrefixedIter Decoder (FT0))
Unconstrained Open Vocabulary Image Classification: Zero-Shot Transfer from Text to Image via CLIP Inversion arXiv:2407.11211
Zero-Shot Image Classification on OVIC Datasets (World-H)
Prediction Score · 2024-07-15
87.94
best: 88.27 (DFN-5B H/14-378 + PrefixedIter Decoder (FT0))
Unconstrained Open Vocabulary Image Classification: Zero-Shot Transfer from Text to Image via CLIP Inversion arXiv:2407.11211
Zero-Shot Image Classification on OVIC Datasets (World-H)
Prediction Score (mean of 3) · 2024-07-15
87.08
best: 87.49 (SigLIP SO/14 + PrefixedIter Decoder (FT2))
Unconstrained Open Vocabulary Image Classification: Zero-Shot Transfer from Text to Image via CLIP Inversion arXiv:2407.11211
Zero-Shot Image Classification on OVIC Datasets (World-H)
Top 1 Accuracy · 2024-07-15
86.77
best: 86.95 (DFN-5B H/14-378 + PrefixedIter Decoder (FT0))
Unconstrained Open Vocabulary Image Classification: Zero-Shot Transfer from Text to Image via CLIP Inversion arXiv:2407.11211
Zero-Shot Image Classification on OVIC Datasets (Wiki-H)
Top 1 Accuracy · 2024-07-15
77.05
best: 77.1 (DFN-5B H/14-378 + PrefixedIter Decoder (FT0))
Unconstrained Open Vocabulary Image Classification: Zero-Shot Transfer from Text to Image via CLIP Inversion arXiv:2407.11211