Metric: Accuracy (higher is better)
| # | Model | Accuracy | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | Human | 0.847 | No | The Hateful Memes Challenge: Detecting Hate Spee... | 2020-05-10 | Code |
| 2 | RA-HMD (Qwen2-VL-7B) | 0.821 | No | Robust Adaptation of Large Multimodal Models for... | 2025-02-18 | Code |
| 3 | RA-HMD (LLaVA-1.5-7B) | 0.809 | No | Robust Adaptation of Large Multimodal Models for... | 2025-02-18 | Code |
| 4 | RA-HMD (Qwen2-VL-2B) | 0.791 | No | Robust Adaptation of Large Multimodal Models for... | 2025-02-18 | Code |
| 5 | RGCL (CLIP) | 0.788 | No | Improving Hateful Meme Detection through Retriev... | 2023-11-14 | Code |
| 6 | HateDetectron27 | 0.765 | Yes | Detecting Hate Speech in Memes Using Multimodal ... | 2020-12-23 | Code |
| 7 | Ron Zhu | 0.732 | Yes | Enhance Multimodal Transformer With External Lab... | 2020-12-15 | Code |
| 8 | Pro-Cap | 0.723 | No | Pro-Cap: Leveraging a Frozen Vision-Language Mod... | 2023-08-16 | Code |
| 9 | Vilio | 0.695 | No | Vilio: State-of-the-art Visio-Linguistic Models ... | 2020-12-14 | Code |
| 10 | Visual BERT COCO | 0.695 | No | The Hateful Memes Challenge: Detecting Hate Spee... | 2020-05-10 | Code |