Wikipedia Title

Texts

Wikipedia Title is a dataset for learning character-level compositionality from the visual characteristics of characters. It consists of a collection of Wikipedia article titles in Chinese, Japanese, or Korean, each labelled with the category to which the article belongs.
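
As a rough illustration of the structure described above (a title paired with a category label, processed one character at a time), the following Python sketch may help. The tab-separated layout, the field names, the `TitleExample` class, and the sample rows are all assumptions made up for this example; they are not taken from the dataset's actual files.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TitleExample:
    """One labelled title. Field names are hypothetical."""
    language: str   # assumed language code: "zh", "ja", or "ko"
    category: str   # category label of the article
    title: str      # Wikipedia article title


def parse_line(line: str) -> TitleExample:
    """Parse one line in an assumed 'language<TAB>category<TAB>title' layout."""
    language, category, title = line.rstrip("\n").split("\t")
    return TitleExample(language=language, category=category, title=title)


def characters(example: TitleExample) -> List[str]:
    """Split a title into its individual characters (Unicode code points)."""
    return list(example.title)


# Invented sample rows for illustration only; not real dataset entries.
SAMPLE_LINES = [
    "ja\tGeography\t東京",
    "zh\tGeography\t北京",
    "ko\tGeography\t서울",
]

examples = [parse_line(line) for line in SAMPLE_LINES]

for ex in examples:
    print(ex.language, ex.category, characters(ex))
```

A character-level model would consume the per-character list returned by `characters`, for instance by mapping each character to a rendered image or an embedding before classifying the title into its category.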

Source: https://arxiv.org/abs/1704.04859