GeBioCorpus

Texts

GeBioCorpus is a high-quality dataset for machine translation evaluation that aims to be one of the first non-synthetic, gender-balanced test datasets.

Source: GeBioToolkit: Automatic Extraction of Gender-Balanced Multilingual Corpus of Wikipedia Biographies