arXivEdits
Texts
Introduced 2022-10-26
arXivEdits is an annotated corpus of 751 full papers from arXiv with gold sentence alignment across their multiple revised versions, along with fine-grained span-level edits and their underlying intentions for 1,000 sentence pairs. The dataset is designed for studying the human revision process in scientific writing.
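To make "span-level edits" between an aligned sentence pair concrete, here is a minimal sketch. It is not the dataset's file format or the authors' annotation procedure: the `SpanEdit` class, the `span_edits` function, and the `intention` placeholder are hypothetical names chosen for illustration, and a whitespace-token diff from Python's standard `difflib` stands in for the human annotation.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class SpanEdit:
    """One edited span between two versions of a sentence (hypothetical schema)."""
    op: str            # difflib opcode name: "replace", "delete", or "insert"
    old_tokens: list   # tokens removed from the earlier version
    new_tokens: list   # tokens added in the later version
    intention: str = "unlabeled"  # placeholder; the corpus intentions are human-annotated


def span_edits(old_sentence: str, new_sentence: str) -> list:
    """Approximate span-level edits with a token-level diff (an assumption, not the paper's method)."""
    old = old_sentence.split()
    new = new_sentence.split()
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    edits = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op != "equal":  # keep only the spans that actually changed
            edits.append(SpanEdit(op, old[i1:i2], new[j1:j2]))
    return edits


if __name__ == "__main__":
    # Made-up sentence pair, not taken from the corpus.
    for edit in span_edits("We propose a new model .", "We propose a novel model ."):
        print(edit.op, edit.old_tokens, "->", edit.new_tokens)
```

An automatic diff like this only locates changed spans. Assigning an intention to each span is the part that the corpus supplies through annotation.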
Source: arXivEdits: Understanding the Human Revision Process in Scientific Writing
Image Source: https://arxiv.org/pdf/2210.15067v1.pdf