BPersona-chat is an evaluation dataset based on the English multiturn chat corpus Persona-chat and the Japanese multiturn chat corpus JPersona-chat.

Each chat was conducted between two crowd workers who each assumed an artificial persona. The speakers discuss given personality traits, covering topics such as self-introductions and hobbies, among others. (Note that the English and Japanese chats are not translations of each other.)

Each chat is translated into the other language (English into Japanese, or Japanese into English) by three sources: professional translators, a low-quality machine translation model (model A), and a high-quality machine translation model (model B).

Crowd workers label each translation as either good or bad, based on its correctness and coherence.

Each chat is included in one .xlsx file with the following structure:

person - the speaker of the current utterance
source - the utterance in the source language
translation - the translation in the target language
evaluation: is this a good translation? - the evaluation of the translation's quality
    y - the current translation is a correct translation of the source utterance
    n - the current translation is an erroneous translation of the source utterance
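
As a rough illustration of how the per-chat structure might be consumed, the sketch below counts good and bad translations in one chat. It works on rows represented as dictionaries keyed by column name. The exact header spelling used in the .xlsx files is an assumption here, and the helper name `summarize_evaluations` and the example rows are hypothetical (the rows are invented for illustration, not taken from the dataset).

```python
from collections import Counter

# Assumed column header; the exact spelling in the .xlsx files may differ.
EVAL_COLUMN = "evaluation: is this a good translation?"


def summarize_evaluations(rows, eval_column=EVAL_COLUMN):
    """Count 'y' (good) and 'n' (bad) labels in the rows of one chat.

    `rows` is any iterable of dicts keyed by column name.
    Rows whose label is missing or not 'y'/'n' are ignored.
    """
    counts = Counter()
    for row in rows:
        label = str(row.get(eval_column, "")).strip().lower()
        if label in ("y", "n"):
            counts[label] += 1
    total = counts["y"] + counts["n"]
    return {
        "good": counts["y"],
        "bad": counts["n"],
        # None when the chat has no labeled rows, to avoid dividing by zero.
        "good_ratio": counts["y"] / total if total else None,
    }


# Invented rows that mimic the documented columns (not real dataset content).
example_rows = [
    {"person": "A", "source": "Hello!", "translation": "こんにちは!", EVAL_COLUMN: "y"},
    {"person": "B", "source": "I like hiking.", "translation": "私は料理が好きです。", EVAL_COLUMN: "n"},
    {"person": "A", "source": "Thank you.", "translation": "ありがとう。", EVAL_COLUMN: "y"},
]

summary = summarize_evaluations(example_rows)
print(summary["good"], summary["bad"])
```

If pandas and an Excel reader such as openpyxl are installed, the rows for a real file could presumably be obtained with `pandas.read_excel(path).to_dict("records")`, where `path` points to one of the chat files. Whether the column headers match the assumed names should be checked against the actual files.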