LCCC

Large-scale Cleaned Chinese Conversation corpus

Contains a base version (6.8 million dialogues) and a large version (12.0 million dialogues).

Source: A Large-Scale Chinese Short-Text Conversation Dataset