CAsT-answerability

Texts | Introduced 2024-03-24

The CAsT-answerability dataset contains binary answerability labels at three levels: sentence, passage, and ranking. It contains around 1.8k answerable and 1.9k unanswerable question-passage pairs. Sentence- and passage-level answerability data is divided into train (90%) and test (10%) portions; the split is done at the question level, so that no question appears in both portions, to avoid information leakage. Ranking-level answerability has only a test set.
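A question-level split like the one described above can be sketched as follows. This is a minimal illustration, not the dataset authors' actual script: the record fields (`question_id`, `passage_id`, `answerable`), the function name `split_by_question`, and the seed are all hypothetical. The idea is to sample question IDs rather than individual pairs, so every pair for a given question lands on the same side of the split.

```python
import random


def split_by_question(pairs, test_fraction=0.1, seed=0):
    """Split question-passage pairs so that no question spans both portions.

    `pairs` is a list of dicts with a hypothetical "question_id" key.
    """
    # Sort the unique question IDs first so the shuffle is reproducible.
    question_ids = sorted({pair["question_id"] for pair in pairs})
    rng = random.Random(seed)
    rng.shuffle(question_ids)

    # Hold out a fraction of the questions (not of the pairs) for testing.
    n_test = max(1, round(len(question_ids) * test_fraction))
    test_questions = set(question_ids[:n_test])

    train = [p for p in pairs if p["question_id"] not in test_questions]
    test = [p for p in pairs if p["question_id"] in test_questions]
    return train, test


if __name__ == "__main__":
    # Toy data: 20 questions with 3 passages each (field names are assumptions).
    toy_pairs = [
        {"question_id": f"q{q}", "passage_id": f"p{q}_{k}", "answerable": k % 2}
        for q in range(20)
        for k in range(3)
    ]
    train, test = split_by_question(toy_pairs)
    shared = {p["question_id"] for p in train} & {p["question_id"] for p in test}
    print(len(train), len(test), len(shared))
```

Because the split fraction applies to questions, the resulting pair counts match 90%/10% only approximately when questions have different numbers of passages.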