ARC-DA

ARC Direct Answer Questions

Texts. Introduced 2021-02-05

The ARC Direct Answer Questions (ARC-DA) dataset consists of 2,985 grade-school-level, direct-answer ("open response", "free form") science questions derived from the ARC multiple-choice question set released as part of the AI2 Reasoning Challenge in 2018.

How the dataset was built

The questions from the ARC Easy and ARC Challenge sets of the original ARC multiple-choice dataset were combined, then filtered and modified by the following process:

  • Turking: Each of the multiple-choice questions was presented as a direct answer question to five crowdsourced workers to gather additional answers.

  • Heuristic filtering: The questions were filtered based on the following heuristic filters:

    • Questions having at least a threshold number of turker answers, as a proxy for the concreteness of the question.
    • Questions having at least two turker-provided answers with word overlap, as a measure of confidence in the correctness of the answers and of the straightforwardness of the question.
    • Other heuristics to identify questions that only make sense as multiple-choice questions, such as questions starting with the phrase “Which of the following”.
  • Further manual vetting: We had in-house volunteers do another pass of vetting, in which they:

    • Marked for removal questions that were highly open-ended, with too many valid answers (such as “Name an insect”), or otherwise invalid. These were filtered out.
    • Removed some of the bad answers gathered from turking.
    • Reworded questions to make them better suited to the direct-answer format. For example, a question such as “What element is contained in table salt?”, which would make sense as a multiple-choice question, needs to be reworded to something like “Name an element present in table salt”.
    • Added any further answers they could think of that were not already among the turker-provided answers.
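
The heuristic filtering step above can be sketched in code. This is only an illustrative sketch, not the authors' implementation: the function names, the threshold value `MIN_ANSWERS = 3`, the tokenization, and the example questions are all assumptions, since the description does not give the actual threshold or the full list of heuristics.

```python
import re
from itertools import combinations

# Hypothetical settings: the dataset description does not state the real
# threshold or the complete list of multiple-choice-only phrases.
MIN_ANSWERS = 3
MC_ONLY_PREFIXES = ("which of the following",)


def tokens(text):
    """Lowercase a string and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def has_overlapping_pair(answers):
    """Return True if at least two answers share one or more word tokens."""
    return any(tokens(a) & tokens(b) for a, b in combinations(answers, 2))


def keep_question(question, answers):
    """Apply the three heuristic filters; True means the question survives."""
    # Filter 1: enough turker answers (proxy for concreteness).
    if len(answers) < MIN_ANSWERS:
        return False
    # Filter 2: at least two answers with word overlap (proxy for correctness).
    if not has_overlapping_pair(answers):
        return False
    # Filter 3: drop questions that only make sense as multiple choice.
    if question.strip().lower().startswith(MC_ONLY_PREFIXES):
        return False
    return True


# Made-up examples, not taken from the dataset.
print(keep_question("What gas do plants absorb from the air?",
                    ["carbon dioxide", "CO2", "carbon dioxide gas"]))
print(keep_question("Which of the following is a mammal?",
                    ["dog", "a dog", "cat"]))
```

Under these assumptions the first example is kept and the second is dropped because of its opening phrase. Questions that pass would still go on to the manual vetting pass, which this sketch does not attempt to model.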
