ErAConD

Error Annotated Conversational Dialog Dataset for Grammatical Error Correction

Texts · MIT License · Introduced 2021-12-15

ErAConD is a grammatical error correction (GEC) dataset consisting of parallel original and corrected utterances drawn from open-domain chatbot conversations.
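To make the idea of parallel original and corrected utterances concrete, the following is a minimal sketch of how one such pair could be held in memory and compared token by token. The `UtterancePair` class, the `token_edits` helper, and the example sentence are all hypothetical illustrations; they are not taken from the dataset and do not reflect its actual file format or annotation scheme.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class UtterancePair:
    """Hypothetical container for one parallel utterance (not the dataset's schema)."""
    original: str
    corrected: str


def token_edits(pair: UtterancePair) -> list[tuple[str, str, str]]:
    """Return (operation, original_span, corrected_span) for each differing token span."""
    src = pair.original.split()
    tgt = pair.corrected.split()
    matcher = SequenceMatcher(a=src, b=tgt, autojunk=False)
    edits = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # keep only replace / insert / delete spans
            edits.append((tag, " ".join(src[i1:i2]), " ".join(tgt[j1:j2])))
    return edits


# Invented example sentence, for illustration only.
pair = UtterancePair(
    original="I goes to school yesterday",
    corrected="I went to school yesterday",
)
edits = token_edits(pair)
unchanged = token_edits(UtterancePair("I like music", "I like music"))
print(edits)
```

A whitespace-token diff like this is only a rough approximation of GEC edit extraction; dedicated annotation tools align and classify edits more carefully.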

We collected 186 open-domain dialogs containing 1,735 user utterance turns by deploying BlenderBot on Amazon Mechanical Turk (AMT) via LEGOEval.

To our knowledge, this is the first GEC dataset targeted at a human-machine conversational setting.