Douwe Kiela, Hamed Firooz, Aravind Mohan, Vedanuj Goswami, Amanpreet Singh, Pratik Ringshia, Davide Testuggine
This work proposes a new challenge set for multimodal classification, focusing on detecting hate speech in multimodal memes. The dataset is constructed so that unimodal models struggle and only multimodal models can succeed: difficult examples ("benign confounders") are added to make it hard to rely on unimodal signals alone. The task requires subtle reasoning, yet is straightforward to evaluate as a binary classification problem. We provide baseline performance numbers for unimodal models, as well as for multimodal models of varying degrees of sophistication. We find that state-of-the-art methods perform poorly compared to humans (64.73% vs. 84.7% accuracy), illustrating the difficulty of the task and highlighting the challenge that this important problem poses to the community.
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Meme Classification | Hateful Memes | Accuracy | 0.847 | Human |
| Meme Classification | Hateful Memes | ROC-AUC | 0.8265 | Human |
| Meme Classification | Hateful Memes | Accuracy | 0.695 | Visual BERT COCO |
| Meme Classification | Hateful Memes | ROC-AUC | 0.754 | Visual BERT COCO |
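Since the task is evaluated as binary classification, the two metrics in the table can be computed directly from per-meme scores. The sketch below uses pure-Python stand-ins for the usual library routines (e.g. `sklearn.metrics.accuracy_score` and `roc_auc_score`); the labels, scores, and 0.5 threshold are illustrative assumptions, not values from the dataset.

```python
# Sketch: computing the two benchmark metrics for the binary
# hateful (1) / not-hateful (0) task from model scores.

def accuracy(labels, scores, threshold=0.5):
    """Fraction of correct hard predictions at a fixed threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def roc_auc(labels, scores):
    """Rank-based AUROC (Mann-Whitney U), averaging ranks over ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Group tied scores and assign them their average 1-based rank.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    pos_ranks = [r for r, y in zip(ranks, labels) if y == 1]
    n_pos = len(pos_ranks)
    n_neg = len(labels) - n_pos
    return (sum(pos_ranks) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Toy example (made-up scores, not dataset values).
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
print(accuracy(labels, scores))  # → 0.8333... (5 of 6 correct)
print(roc_auc(labels, scores))   # → 0.8888... (8 of 9 pos/neg pairs ranked correctly)
```

Note that ROC-AUC is threshold-free, which is why the paper reports it alongside accuracy: it measures how well the model ranks hateful memes above benign ones regardless of where the decision boundary is placed.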