Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

CMCL

Reported on 16 benchmarks across 3 tasks · 1 paper · 16 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.
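
The matching rule in the note above can be illustrated with a minimal sketch. It assumes result rows are plain dictionaries with a `model` field; the function name `group_by_exact_name`, the field names, and the second and third rows are hypothetical and are not taken from the site's actual code or data.

```python
from collections import defaultdict


def group_by_exact_name(rows):
    """Group result rows by model name using exact string equality.

    No normalization is applied, so "CMCL" and "cmcl" land in different
    groups, while two unrelated papers that both use "CMCL" are merged.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row["model"]].append(row)
    return dict(groups)


# Illustrative rows. Only the first reflects an entry listed on this page;
# the other two are hypothetical placeholders with no real values.
rows = [
    {"model": "CMCL", "paper": "arXiv:2103.06561", "metric": "BLEU", "value": 66.1},
    {"model": "CMCL", "paper": "hypothetical-other-paper", "metric": "BLEU", "value": None},
    {"model": "cmcl", "paper": "hypothetical-lowercase-paper", "metric": "BLEU", "value": None},
]

groups = group_by_exact_name(rows)
for name, members in groups.items():
    print(name, [m["paper"] for m in members])
```

Under this scheme the two `CMCL` rows share one group even though they come from different papers, which is the situation the note warns about.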

Natural Language Processing (10 results)

  • Image Captioning on AIC-ICC
    BLEU · 2021-03-11
    66.1
    SOTA
    WenLan: Bridging Vision and Language by Large-Scale Multi-Modal Pre-Training (arXiv:2103.06561)
  • Image Captioning on AIC-ICC
    CIDEr · 2021-03-11
    220.7
    SOTA
    WenLan: Bridging Vision and Language by Large-Scale Multi-Modal Pre-Training (arXiv:2103.06561)
  • Image Captioning on AIC-ICC
    METEOR · 2021-03-11
    41.1
    SOTA
    WenLan: Bridging Vision and Language by Large-Scale Multi-Modal Pre-Training (arXiv:2103.06561)
  • Image Captioning on AIC-ICC
    ROUGE-L · 2021-03-11
    71.9
    SOTA
    WenLan: Bridging Vision and Language by Large-Scale Multi-Modal Pre-Training (arXiv:2103.06561)
  • Image-to-Text Retrieval on AIC-ICC
    Recall@1 · 2021-03-11
    20.3
    best: 33.7 (ERNIE-ViL2.0)
    SOTA
    WenLan: Bridging Vision and Language by Large-Scale Multi-Modal Pre-Training (arXiv:2103.06561)
  • Image-to-Text Retrieval on AIC-ICC
    Recall@10 · 2021-03-11
    45.6
    best: 60 (ERNIE-ViL2.0)
    SOTA
    WenLan: Bridging Vision and Language by Large-Scale Multi-Modal Pre-Training (arXiv:2103.06561)
  • Image-to-Text Retrieval on AIC-ICC
    Recall@5 · 2021-03-11
    37
    best: 52.1 (ERNIE-ViL2.0)
    SOTA
    WenLan: Bridging Vision and Language by Large-Scale Multi-Modal Pre-Training (arXiv:2103.06561)
  • Image-to-Text Retrieval on RUC-CAS-WenLan
    Recall@1 · 2021-03-11
    36.1
    SOTA
    WenLan: Bridging Vision and Language by Large-Scale Multi-Modal Pre-Training (arXiv:2103.06561)
  • Image-to-Text Retrieval on RUC-CAS-WenLan
    Recall@10 · 2021-03-11
    62.2
    SOTA
    WenLan: Bridging Vision and Language by Large-Scale Multi-Modal Pre-Training (arXiv:2103.06561)
  • Image-to-Text Retrieval on RUC-CAS-WenLan
    Recall@5 · 2021-03-11
    55.5
    SOTA
    WenLan: Bridging Vision and Language by Large-Scale Multi-Modal Pre-Training (arXiv:2103.06561)

Computer Vision (6 results)

  • Image Retrieval on AIC-ICC
    Recall@1 · 2021-03-11
    14.4
    best: 19 (ERNIE-ViL2.0)
    SOTA
    WenLan: Bridging Vision and Language by Large-Scale Multi-Modal Pre-Training (arXiv:2103.06561)
  • Image Retrieval on AIC-ICC
    Recall@10 · 2021-03-11
    39.1
    best: 43.5 (ERNIE-ViL2.0)
    SOTA
    WenLan: Bridging Vision and Language by Large-Scale Multi-Modal Pre-Training (arXiv:2103.06561)
  • Image Retrieval on AIC-ICC
    Recall@5 · 2021-03-11
    39.1
    SOTA
    WenLan: Bridging Vision and Language by Large-Scale Multi-Modal Pre-Training (arXiv:2103.06561)
  • Image Retrieval on RUC-CAS-WenLan
    Recall@1 · 2021-03-11
    36
    SOTA
    WenLan: Bridging Vision and Language by Large-Scale Multi-Modal Pre-Training (arXiv:2103.06561)
  • Image Retrieval on RUC-CAS-WenLan
    Recall@10 · 2021-03-11
    62.1
    SOTA
    WenLan: Bridging Vision and Language by Large-Scale Multi-Modal Pre-Training (arXiv:2103.06561)
  • Image Retrieval on RUC-CAS-WenLan
    Recall@5 · 2021-03-11
    55.4
    SOTA
    WenLan: Bridging Vision and Language by Large-Scale Multi-Modal Pre-Training (arXiv:2103.06561)