Pre-trained Transliterated Embeddings for Indian Languages

Modality: Texts | License: CC-BY-NC-4.0 | Introduced: 2021-12-27

We release several types of word embeddings for multiple Indian languages. Please note that for most of our work we transliterated the corpora into the Devanagari script, so the text is no longer in each language's original script. The release includes word embedding models trained with FastText and ELMo, as well as cross-lingual models based on an orthogonal alignment of monolingual models for all pairs of these languages.
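To illustrate what an orthogonal alignment of two monolingual embedding spaces involves, here is a minimal sketch using the standard orthogonal Procrustes solution. The description above does not state the exact alignment procedure used for this release, so this is an assumption for illustration only. The function name `orthogonal_alignment` and the synthetic vectors are hypothetical; in practice the rows of `src` and `tgt` would be embeddings of word pairs taken from a bilingual dictionary.

```python
import numpy as np


def orthogonal_alignment(src, tgt):
    """Return the orthogonal matrix W that minimizes ||src @ W - tgt||_F.

    src, tgt: arrays of shape (n_pairs, dim); row i of each holds the
    embedding of the i-th word pair (hypothetical dictionary entries).
    """
    # Orthogonal Procrustes: take the SVD of the cross-covariance matrix
    # and discard the singular values.
    u, _, vt = np.linalg.svd(src.T @ tgt)
    return u @ vt


# Synthetic stand-in data (not the released embeddings): the "source"
# space is the "target" space under a random rotation or reflection.
rng = np.random.default_rng(0)
tgt = rng.normal(size=(50, 8))
q, _ = np.linalg.qr(rng.normal(size=(8, 8)))  # random orthogonal matrix
src = tgt @ q.T                               # so that src @ q == tgt

w = orthogonal_alignment(src, tgt)
aligned = src @ w  # source vectors mapped into the target space
```

Because `w` is orthogonal, the mapping preserves distances and cosine similarities within the source space, which is the usual reason for preferring an orthogonal map over an unconstrained linear one.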