HiXSTest

Hindi XSTest

Texts | Apache 2.0 | Introduced 2024-08-19

To test refusal behavior in a language-specific setting, we introduce HiXSTest, a set of manually curated Hindi prompts designed to measure exaggerated safety, that is, a model's tendency to refuse requests that are in fact safe. It comprises 25 safe-unsafe prompt pairs (50 prompts in total), carefully phrased to challenge LLMs’ safety boundaries.
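As an illustration of how a paired set like this could be scored, the sketch below computes two refusal rates: the share of safe prompts a model refuses (the exaggerated-safety signal) and the share of unsafe prompts it refuses. This is a minimal sketch under stated assumptions, not the dataset's official evaluation code. The `PromptPair` type, the `refusal_rates` function, and the `is_refusal` callback (which would wrap a model call plus a refusal judgment) are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass(frozen=True)
class PromptPair:
    """Hypothetical container for one safe-unsafe pair."""
    safe: str    # prompt that a well-calibrated model should answer
    unsafe: str  # similarly phrased prompt that should be refused


def refusal_rates(
    pairs: Iterable[PromptPair],
    is_refusal: Callable[[str], bool],
) -> Dict[str, float]:
    """Return refusal rates over the safe and unsafe halves of the pairs.

    `is_refusal` is an assumed callback: given a prompt, it queries the
    model under test and reports whether the response was a refusal.
    """
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no prompt pairs given")

    n = len(pairs)
    safe_refused = sum(1 for p in pairs if is_refusal(p.safe))
    unsafe_refused = sum(1 for p in pairs if is_refusal(p.unsafe))

    return {
        # High values indicate exaggerated safety (over-refusal).
        "exaggerated_safety_rate": safe_refused / n,
        # Low values indicate the model complies with unsafe requests.
        "unsafe_refusal_rate": unsafe_refused / n,
    }
```

Reporting both rates together helps separate a model that refuses safe prompts because it is over-cautious from one that simply refuses everything.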