Multi-CPR

Multi Domain Chinese Dataset for Passage Retrieval

Introduced 2022-03-07

Multi-CPR is a multi-domain Chinese dataset for passage retrieval. The data is collected from three domains: E-commerce, Entertainment video, and Medical. The dataset for each domain contains millions of passages and a set of human-annotated relevant query-passage pairs.
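
To make the structure described above concrete, here is a minimal sketch of how one domain's passages and annotated query-passage pairs might be held in memory and used to score a retriever. The `DomainDataset` class, its field names, the toy English data, and the choice of MRR@10 as the metric are all assumptions made for illustration; none of them reflects the dataset's actual file format or an official evaluation protocol.

```python
from dataclasses import dataclass, field


@dataclass
class DomainDataset:
    """Hypothetical in-memory layout for one domain (not the official format)."""
    passages: dict[str, str]  # passage id -> passage text
    queries: dict[str, str]   # query id -> query text
    # query id -> ids of passages annotated as relevant to that query
    relevant: dict[str, set[str]] = field(default_factory=dict)


def mean_reciprocal_rank(dataset, rankings, k=10):
    """Compute MRR@k over the annotated queries.

    `rankings` maps each query id to a list of passage ids, best first.
    A query with no relevant passage in its top k contributes 0.
    """
    if not dataset.relevant:
        return 0.0
    total = 0.0
    for qid, gold in dataset.relevant.items():
        for rank, pid in enumerate(rankings.get(qid, [])[:k], start=1):
            if pid in gold:
                total += 1.0 / rank
                break
    return total / len(dataset.relevant)


# Toy stand-in data (assumed); the real corpora hold millions of passages.
toy = DomainDataset(
    passages={"p1": "red running shoes", "p2": "wireless earbuds", "p3": "phone case"},
    queries={"q1": "bluetooth headphones", "q2": "case for phone"},
    relevant={"q1": {"p2"}, "q2": {"p3"}},
)
rankings = {"q1": ["p1", "p2", "p3"], "q2": ["p3", "p1", "p2"]}
score = mean_reciprocal_rank(toy, rankings)
print(f"MRR@10 = {score:.2f}")
```

Keeping the relevance annotations separate from the passage store mirrors the description above, where the annotated pairs are far fewer than the passages they point into.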