SQuADShifts

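Provides four new test sets for the Stanford Question Answering Dataset (SQuAD), drawn from new Wikipedia articles, New York Times articles, Reddit posts, and Amazon product reviews, and evaluates the ability of question-answering systems to generalize to new data.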
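
To illustrate how generalization on such test sets might be scored, here is a minimal sketch of SQuAD-style exact match and token-level F1. It assumes the usual SQuAD answer normalization (lowercasing, removing punctuation and English articles); the function names are hypothetical and this is not the official evaluation script.

```python
import re
import string
from collections import Counter


def normalize_answer(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, gold_answers):
    """1.0 if the normalized prediction equals any normalized gold answer."""
    pred = normalize_answer(prediction)
    return float(any(pred == normalize_answer(g) for g in gold_answers))


def f1_score(prediction, gold_answers):
    """Best token-overlap F1 between the prediction and any gold answer."""

    def single(pred, gold):
        pred_tokens = normalize_answer(pred).split()
        gold_tokens = normalize_answer(gold).split()
        if not pred_tokens or not gold_tokens:
            # Both empty counts as a match; one empty counts as a miss.
            return float(pred_tokens == gold_tokens)
        overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
        if overlap == 0:
            return 0.0
        precision = overlap / len(pred_tokens)
        recall = overlap / len(gold_tokens)
        return 2 * precision * recall / (precision + recall)

    return max(single(prediction, g) for g in gold_answers)
```

Averaging these scores separately over the original SQuAD test data and over each new test set gives one way to compare how much a model's performance changes under distribution shift.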

Source: The Effect of Natural Distribution Shift on Question Answering Models