Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

ViT-model

Reported on 28 benchmarks across 4 tasks · 1 paper

Note: results are matched by exact model name. Different papers may use the same name for different model variants.
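
As a rough illustration of the note above, exact-name matching can be sketched as a plain dictionary grouping keyed on the model string. The record layout, the field names, and the `ViT-Model` spelling variant are hypothetical and are not taken from the site's actual schema; only the two `ViT-model` BLEU-1 values come from this page.

```python
from collections import defaultdict

# Hypothetical result records; field names are assumptions for illustration.
results = [
    {"model": "ViT-model", "task": "Visual Storytelling", "metric": "BLEU-1", "value": 63.0},
    {"model": "ViT-model", "task": "Story Generation", "metric": "BLEU-1", "value": 63.0},
    # Hypothetical spelling variant: a different string, so it forms its own group.
    {"model": "ViT-Model", "task": "Visual Storytelling", "metric": "BLEU-1", "value": 0.0},
]


def group_by_exact_name(records):
    """Group records by the model string exactly as written (case-sensitive)."""
    groups = defaultdict(list)
    for record in records:
        groups[record["model"]].append(record)
    return dict(groups)


groups = group_by_exact_name(results)
for name, rows in groups.items():
    print(name, len(rows))
```

Under this kind of matching, two unrelated models that share a name would land in the same group, and one model spelled two ways would be split, which is why the listing below should be read with the note in mind.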

Natural Language Processing · 21 results

  • Data-to-Text Generation on VIST
    BLEU-1 · 2022-10-06
    63
    best: 69 (AOG + ARS)
    Vision Transformer Based Model for Describing a Set of Images as a Story · arXiv:2210.02762
  • Data-to-Text Generation on VIST
    BLEU-2 · 2022-10-06
    37.5
    best: 44 (AOG + ARS)
    Vision Transformer Based Model for Describing a Set of Images as a Story · arXiv:2210.02762
  • Data-to-Text Generation on VIST
    BLEU-3 · 2022-10-06
    21.5
    best: 25.3 (CoVS)
    Vision Transformer Based Model for Describing a Set of Images as a Story · arXiv:2210.02762
  • Data-to-Text Generation on VIST
    BLEU-4 · 2022-10-06
    12.3
    best: 16.7 (HEGR)
    Vision Transformer Based Model for Describing a Set of Images as a Story · arXiv:2210.02762
  • Data-to-Text Generation on VIST
    CIDEr · 2022-10-06
    4.4
    best: 14.1 (HEGR)
    Vision Transformer Based Model for Describing a Set of Images as a Story · arXiv:2210.02762
  • Data-to-Text Generation on VIST
    METEOR · 2022-10-06
    35.4
    best: 37.8 (HEGR)
    Vision Transformer Based Model for Describing a Set of Images as a Story · arXiv:2210.02762
  • Data-to-Text Generation on VIST
    ROUGE-L · 2022-10-06
    31
    best: 33.1 (TAPM)
    Vision Transformer Based Model for Describing a Set of Images as a Story · arXiv:2210.02762
  • Visual Storytelling on VIST
    BLEU-1 · 2022-10-06
    63
    best: 69 (AOG + ARS)
    Vision Transformer Based Model for Describing a Set of Images as a Story · arXiv:2210.02762
  • Visual Storytelling on VIST
    BLEU-2 · 2022-10-06
    37.5
    best: 44 (AOG + ARS)
    Vision Transformer Based Model for Describing a Set of Images as a Story · arXiv:2210.02762
  • Visual Storytelling on VIST
    BLEU-3 · 2022-10-06
    21.5
    best: 25.3 (CoVS)
    Vision Transformer Based Model for Describing a Set of Images as a Story · arXiv:2210.02762
  • Visual Storytelling on VIST
    BLEU-4 · 2022-10-06
    12.3
    best: 16.7 (HEGR)
    Vision Transformer Based Model for Describing a Set of Images as a Story · arXiv:2210.02762
  • Visual Storytelling on VIST
    CIDEr · 2022-10-06
    4.4
    best: 14.1 (HEGR)
    Vision Transformer Based Model for Describing a Set of Images as a Story · arXiv:2210.02762
  • Visual Storytelling on VIST
    METEOR · 2022-10-06
    35.4
    best: 37.8 (HEGR)
    Vision Transformer Based Model for Describing a Set of Images as a Story · arXiv:2210.02762
  • Visual Storytelling on VIST
    ROUGE-L · 2022-10-06
    31
    best: 33.1 (TAPM)
    Vision Transformer Based Model for Describing a Set of Images as a Story · arXiv:2210.02762
  • Story Generation on VIST
    BLEU-1 · 2022-10-06
    63
    best: 69 (AOG + ARS)
    Vision Transformer Based Model for Describing a Set of Images as a Story · arXiv:2210.02762
  • Story Generation on VIST
    BLEU-2 · 2022-10-06
    37.5
    best: 44 (AOG + ARS)
    Vision Transformer Based Model for Describing a Set of Images as a Story · arXiv:2210.02762
  • Story Generation on VIST
    BLEU-3 · 2022-10-06
    21.5
    best: 25.3 (CoVS)
    Vision Transformer Based Model for Describing a Set of Images as a Story · arXiv:2210.02762
  • Story Generation on VIST
    BLEU-4 · 2022-10-06
    12.3
    best: 16.7 (HEGR)
    Vision Transformer Based Model for Describing a Set of Images as a Story · arXiv:2210.02762
  • Story Generation on VIST
    CIDEr · 2022-10-06
    4.4
    best: 14.1 (HEGR)
    Vision Transformer Based Model for Describing a Set of Images as a Story · arXiv:2210.02762
  • Story Generation on VIST
    METEOR · 2022-10-06
    35.4
    best: 37.8 (HEGR)
    Vision Transformer Based Model for Describing a Set of Images as a Story · arXiv:2210.02762
  • Story Generation on VIST
    ROUGE-L · 2022-10-06
    31
    best: 33.1 (TAPM)
    Vision Transformer Based Model for Describing a Set of Images as a Story · arXiv:2210.02762

Adversarial · 7 results

  • Text Generation on VIST
    BLEU-1 · 2022-10-06
    63
    best: 69 (AOG + ARS)
    Vision Transformer Based Model for Describing a Set of Images as a Story · arXiv:2210.02762
  • Text Generation on VIST
    BLEU-2 · 2022-10-06
    37.5
    best: 44 (AOG + ARS)
    Vision Transformer Based Model for Describing a Set of Images as a Story · arXiv:2210.02762
  • Text Generation on VIST
    BLEU-3 · 2022-10-06
    21.5
    best: 25.3 (CoVS)
    Vision Transformer Based Model for Describing a Set of Images as a Story · arXiv:2210.02762
  • Text Generation on VIST
    BLEU-4 · 2022-10-06
    12.3
    best: 16.7 (HEGR)
    Vision Transformer Based Model for Describing a Set of Images as a Story · arXiv:2210.02762
  • Text Generation on VIST
    CIDEr · 2022-10-06
    4.4
    best: 14.1 (HEGR)
    Vision Transformer Based Model for Describing a Set of Images as a Story · arXiv:2210.02762
  • Text Generation on VIST
    METEOR · 2022-10-06
    35.4
    best: 37.8 (HEGR)
    Vision Transformer Based Model for Describing a Set of Images as a Story · arXiv:2210.02762
  • Text Generation on VIST
    ROUGE-L · 2022-10-06
    31
    best: 33.1 (TAPM)
    Vision Transformer Based Model for Describing a Set of Images as a Story · arXiv:2210.02762