CzEng 2.0 Parallel Corpus

Texts

CzEng 2.0 is a Czech-English parallel corpus of over 2 billion words (2 "gigawords") in each language. The corpus includes document-level information and has been filtered with several techniques to reduce noise.

Source: Announcing CzEng 2.0 Parallel Corpus with over 2 Gigawords