CSCD-IME
Texts | MIT License | Introduced 2022-11-16
The Chinese Spelling Correction Dataset for errors generated by pinyin IME (CSCD-IME) contains 40,000 annotated sentences taken from real posts by official media accounts on Sina Weibo. It is intended for training and evaluating models that detect and correct spelling errors in Chinese text, specifically the kind of errors introduced by pinyin input methods.
Source: CSCD-IME: Correcting Spelling Errors Generated by Pinyin IME
Image Source: https://github.com/nghuyong/cscd-ime
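To make the detection and correction task concrete, here is a minimal Python sketch of how an erroneous sentence can be compared with its corrected form. It assumes each sample is an equal-length pair of source and corrected sentences, which is not stated in the description above. The function name `diff_positions` and the example sentence are hypothetical and are not taken from the dataset.

```python
def diff_positions(source: str, target: str) -> list[tuple[int, str, str]]:
    """Return (index, wrong_char, correct_char) for every differing position.

    Assumption: the source and its correction have the same length, so
    errors can be located by a character-by-character comparison.
    """
    if len(source) != len(target):
        raise ValueError("source and target must have the same length")
    return [
        (i, s, t)
        for i, (s, t) in enumerate(zip(source, target))
        if s != t
    ]


# Illustrative pair (not from CSCD-IME): a homophone error of the kind a
# pinyin IME can produce, where "在" and "再" are both typed as "zai".
source = "我明天在来"
target = "我明天再来"
edits = diff_positions(source, target)
print(edits)
```

Detection corresponds to finding the indices in `edits`, and correction corresponds to producing the replacement characters at those indices.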