Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

DFN-5B H/14-378 + PrefixedIter Decoder (FT2)

Reported on 8 benchmarks across 1 task · 1 paper · 3 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.

Computer Vision · 8 results

  • Zero-Shot Image Classification on OVIC Datasets (Wiki-L)
    Prediction Score (mean of 3) · 2024-07-15
    74.88
    SOTA
    Unconstrained Open Vocabulary Image Classification: Zero-Shot Transfer from Text to Image via CLIP Inversion (arXiv:2407.11211)
  • Zero-Shot Image Classification on OVIC Datasets (Wiki-H)
    Overall Score · 2024-07-15
    79.02
    SOTA
    Unconstrained Open Vocabulary Image Classification: Zero-Shot Transfer from Text to Image via CLIP Inversion (arXiv:2407.11211)
  • Zero-Shot Image Classification on OVIC Datasets (Wiki-H)
    Prediction Score · 2024-07-15
    80.13
    SOTA
    Unconstrained Open Vocabulary Image Classification: Zero-Shot Transfer from Text to Image via CLIP Inversion (arXiv:2407.11211)
  • Zero-Shot Image Classification on OVIC Datasets (World-H)
    Overall Score · 2024-07-15
    87.13
    best: 87.9 (DFN-5B H/14-378 + PrefixedIter Decoder (FT0))
    Unconstrained Open Vocabulary Image Classification: Zero-Shot Transfer from Text to Image via CLIP Inversion (arXiv:2407.11211)
  • Zero-Shot Image Classification on OVIC Datasets (World-H)
    Prediction Score · 2024-07-15
    87.94
    best: 88.27 (DFN-5B H/14-378 + PrefixedIter Decoder (FT0))
    Unconstrained Open Vocabulary Image Classification: Zero-Shot Transfer from Text to Image via CLIP Inversion (arXiv:2407.11211)
  • Zero-Shot Image Classification on OVIC Datasets (World-H)
    Prediction Score (mean of 3) · 2024-07-15
    87.08
    best: 87.49 (SigLIP SO/14 + PrefixedIter Decoder (FT2))
    Unconstrained Open Vocabulary Image Classification: Zero-Shot Transfer from Text to Image via CLIP Inversion (arXiv:2407.11211)
  • Zero-Shot Image Classification on OVIC Datasets (World-H)
    Top 1 Accuracy · 2024-07-15
    86.77
    best: 86.95 (DFN-5B H/14-378 + PrefixedIter Decoder (FT0))
    Unconstrained Open Vocabulary Image Classification: Zero-Shot Transfer from Text to Image via CLIP Inversion (arXiv:2407.11211)
  • Zero-Shot Image Classification on OVIC Datasets (Wiki-H)
    Top 1 Accuracy · 2024-07-15
    77.05
    best: 77.1 (DFN-5B H/14-378 + PrefixedIter Decoder (FT0))
    Unconstrained Open Vocabulary Image Classification: Zero-Shot Transfer from Text to Image via CLIP Inversion (arXiv:2407.11211)