FLoRes
Facebook Low Resource MT Benchmark
FLoRes is a benchmark dataset for machine translation between English and four low-resource languages: Nepali, Sinhala, Khmer, and Pashto. It is based on sentences translated from Wikipedia. The FLoRes project has two versions: **FLoRes-101** and **FLoRes-200**.
- **FLoRes-101**: The first version of the dataset. It lets researchers measure translation quality across 10,100 translation directions.
- **FLoRes-200**: An updated version of the dataset that doubles the language coverage of FLoRes-101. Because many of the new languages are less standardized and require more specialized professional translation, the verification process became more complex.
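
The 10,100 figure quoted for FLoRes-101 is consistent with counting every ordered (source, target) pair among 101 languages, which is the language count the dataset's name suggests. The sketch below shows that arithmetic only. The helper name `translation_directions` is hypothetical and is not part of any FLoRes tooling.

```python
def translation_directions(num_languages: int) -> int:
    """Count ordered (source, target) pairs of distinct languages."""
    # Each language can be translated into every other language,
    # and A -> B is counted separately from B -> A.
    return num_languages * (num_languages - 1)


# Assumption: FLoRes-101 covers 101 languages, as its name suggests.
flores_101_directions = translation_directions(101)
print(flores_101_directions)  # 10100
```

Because the count grows quadratically, adding languages increases the number of directions much faster than the number of languages.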