Vilio: State-of-the-art Visio-Linguistic Models applied to Hateful Memes
Niklas Muennighoff
2020-12-14
Meme Classification
Abstract
This work presents Vilio, an implementation of state-of-the-art visio-linguistic models and their application to the Hateful Memes Dataset. The implemented models have been fitted into a uniform code-base and altered to yield better performance. The goal of Vilio is to provide a user-friendly starting point for any visio-linguistic problem. An ensemble of 5 different V+L models implemented in Vilio achieves 2nd place in the Hateful Memes Challenge out of 3,300 participants. The code is available at https://github.com/Muennighoff/vilio.
Results
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Meme Classification | Hateful Memes | Accuracy | 0.695 | Vilio |
| Meme Classification | Hateful Memes | ROC-AUC | 0.825 | Vilio |
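For readers unfamiliar with the two metrics in the table, the following is a minimal sketch of how accuracy and ROC-AUC are computed for a binary classifier. It is not taken from the Vilio code-base: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration.

```python
def accuracy(labels, probs, threshold=0.5):
    """Fraction of examples whose thresholded score matches the label.

    The 0.5 threshold is an illustrative assumption, not a Vilio setting.
    """
    preds = [1 if p >= threshold else 0 for p in probs]
    correct = sum(int(pred == y) for pred, y in zip(preds, labels))
    return correct / len(labels)


def roc_auc(labels, probs):
    """ROC-AUC as the probability that a random positive example is scored
    above a random negative example (ties count as half a win)."""
    pos = [p for p, y in zip(probs, labels) if y == 1]
    neg = [p for p, y in zip(probs, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = hateful, 0 = not hateful.
labels = [0, 0, 1, 1]
probs = [0.1, 0.6, 0.4, 0.9]

print(accuracy(labels, probs))  # 0.5
print(roc_auc(labels, probs))   # 0.75
```

Accuracy depends on the chosen threshold, while ROC-AUC depends only on how the scores rank positives against negatives, which is why the two values in the table can differ substantially.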
Related Papers
| Paper | Date |
|---|---|
| Detecting Harmful Memes with Decoupled Understanding and Guided CoT Reasoning | 2025-06-10 |
| LLM-based Semantic Augmentation for Harmful Content Detection | 2025-04-22 |
| Robust Adaptation of Large Multimodal Models for Retrieval Augmented Hateful Meme Detection | 2025-02-18 |
| Demystifying Hateful Content: Leveraging Large Multimodal Models for Hateful Meme Detection with Explainable Decisions | 2025-02-16 |
| Figurative-cum-Commonsense Knowledge Infusion for Multimodal Mental Health Meme Classification | 2025-01-25 |
| Prompt-enhanced Network for Hateful Meme Classification | 2024-11-12 |
| MemeCLIP: Leveraging CLIP Representations for Multimodal Meme Classification | 2024-09-23 |
| What Makes a Meme a Meme? Identifying Memes for Memetics-Aware Dataset Creation | 2024-07-16 |