Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

SCI

Self-Contradictory Instructions

Modalities: Images, Texts · Introduced: 2024-08-02

Large multimodal models (LMMs) excel at following human instructions. However, as multimodal interaction and context lengths grow, instructions can contradict one another, which is especially challenging for language beginners and vulnerable populations. We introduce the Self-Contradictory Instructions (SCI) benchmark to evaluate the capability of LMMs to recognize conflicting commands. It comprises 20,000 conflicts, evenly distributed between language and vision paradigms, and is constructed with a novel automatic dataset-creation framework that expedites the process and covers a wide range of instruction forms. Our comprehensive evaluation reveals that current LMMs consistently struggle to identify multimodal instruction discordance due to a lack of self-awareness. We therefore propose Cognitive Awakening Prompting, which injects external cognition and substantially improves dissonance detection.
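The evaluation the abstract describes can be sketched in miniature: pair a standing rule with a request that violates it, query a model with and without a cognition-injecting prefix, and score whether the reply flags the contradiction. The function names, prompts, and the toy model below are purely illustrative assumptions, not the SCI benchmark's actual data or the paper's prompting text.

```python
# Hypothetical sketch of a self-contradictory-instruction probe; all names,
# prompts, and scoring cues here are illustrative, not from the SCI benchmark.

CONFLICT_CUES = ("contradict", "conflict", "inconsistent")

def make_conflict_pair(rule: str, request: str) -> str:
    """Combine a standing rule with a request that violates it."""
    return f"Rule: {rule}\nRequest: {request}"

def cognitive_awakening(prompt: str) -> str:
    """Illustrative 'Cognitive Awakening'-style prefix (assumed wording):
    ask the model to check the instructions for consistency first."""
    return ("Before answering, check whether the instructions contradict "
            "each other and say so if they do.\n" + prompt)

def detects_conflict(model_reply: str) -> bool:
    """Score a reply as a detection if it mentions the contradiction."""
    reply = model_reply.lower()
    return any(cue in reply for cue in CONFLICT_CUES)

def toy_model(prompt: str) -> str:
    """Toy stand-in for an LMM call, so the loop runs end-to-end: it only
    notices the conflict when explicitly prompted to check for one."""
    if "check whether the instructions contradict" in prompt:
        return "These instructions conflict: the rule forbids the request."
    return "Sure, here is the answer."

prompt = make_conflict_pair("Reply only in French.", "Answer in English only.")
baseline = detects_conflict(toy_model(prompt))                    # False
awakened = detects_conflict(toy_model(cognitive_awakening(prompt)))  # True
print(baseline, awakened)
```

The gap between `baseline` and `awakened` is the kind of effect the abstract attributes to injecting external cognition, though the real benchmark scores 20,000 language and vision conflicts rather than a single string match.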

Related Benchmarks

SCICAP/Image Captioning/BLEU-4
SciCite/Classification/F1
SciCite/Classification/Macro-F1
SciCite/Sentence Classification/F1
SciCite/Text Classification/F1
SciCite/Text Classification/Macro-F1
SciDocs/Representation Learning/Avg.
SciDocs/Retrieval/nDCG@10
SciDocs (MAG)/Classification/F1 (micro)
SciDocs (MAG)/Document Classification/F1 (micro)
SciDocs (MAG)/Text Classification/F1 (micro)
SciDocs (MeSH)/Classification/F1 (micro)
SciDocs (MeSH)/Document Classification/F1 (micro)
SciDocs (MeSH)/Text Classification/F1 (micro)
SciERC/Continual Pretraining/F1 (macro)
SciERC/Few-Shot Learning/F1 (1-Doc)
SciERC/Few-Shot Learning/F1 (3-Doc)
SciERC/Image Enhancement/F1 score
SciERC/Information Extraction/Cross Sentence
SciERC/Information Extraction/Entity F1
SciERC/Information Extraction/RE+ Micro F1
SciERC/Information Extraction/Relation F1
SciERC/Meta-Learning/F1 (1-Doc)
SciERC/Meta-Learning/F1 (3-Doc)
SciERC/Named Entity Recognition (NER)/F1
SciERC/Relation Classification/F1 (1-Doc)
SciERC/Relation Classification/F1 (3-Doc)
SciERC/Relation Extraction/Cross Sentence
SciERC/Relation Extraction/Entity F1
SciERC/Relation Extraction/F1
SciERC/Relation Extraction/F1 (1-Doc)
SciERC/Relation Extraction/F1 (3-Doc)
SciERC/Relation Extraction/NER Micro F1
SciERC/Relation Extraction/RE+ Micro F1
SciERC/Relation Extraction/Relation F1
SciFact/Retrieval/nDCG@10
SciFact (BEIR)/Fact Checking/nDCG@10
SciQ/Text Generation/Accuracy
SciREX/Relation Extraction/Avg. F1
SciTail/Natural Language Inference/% Dev Accuracy
SciTail/Natural Language Inference/% Test Accuracy
SciTail/Natural Language Inference/Accuracy
SciTail/Natural Language Inference/Dev Accuracy
ScienceCite/Classification/F1
ScienceCite/Sentence Classification/F1
ScienceCite/Text Classification/F1
ScienceQA/Question Answering/Avg. Accuracy
ScienceQA/Question Answering/Grades 1-6
ScienceQA/Question Answering/Grades 7-12
ScienceQA/Question Answering/Image Context
ScienceQA/Question Answering/Language Science
ScienceQA/Question Answering/Natural Science
ScienceQA/Question Answering/No Context
ScienceQA/Question Answering/Social Science
ScienceQA/Question Answering/Text Context
sciERC-sent/Relation Extraction/F1

Statistics

Papers: 1
Benchmarks: 0

Links

Homepage