BanMANI

Introduced 2023-11-05

A Dataset to Identify Manipulated Social Media News in Bangla

We construct a publicly available Bangla dataset of 800 news-related social media items, each annotated as manipulated or not manipulated relative to 500 reference news articles. We present a semi-automatic method, combining human annotators and an LLM, for generating such a dataset. The method uses annotators efficiently and allows scalable dataset collection for languages with few available NLP tools.
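
To make the described structure concrete, the following is a minimal sketch of how such records might be represented in Python. The class names, field names, and the assumption that each social media item points to exactly one reference article are hypothetical and are not taken from the released dataset; the actual file format and schema may differ.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferenceArticle:
    # Hypothetical schema: one reference news article.
    article_id: str
    text: str


@dataclass(frozen=True)
class SocialMediaItem:
    # Hypothetical schema: one news-related social media item.
    item_id: str
    text: str
    reference_id: str   # assumed link to a single ReferenceArticle
    manipulated: bool   # True if annotated as manipulated relative to the reference


def label_counts(items):
    """Count how many items carry each label (True = manipulated)."""
    return Counter(item.manipulated for item in items)


def group_by_reference(items):
    """Map each reference article id to the items annotated against it."""
    groups = defaultdict(list)
    for item in items:
        groups[item.reference_id].append(item)
    return dict(groups)
```

With records loaded into this shape, `label_counts` gives the class balance and `group_by_reference` shows how many items were annotated against each reference article.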