BLiMP
Benchmark of Linguistic Minimal Pairs
Texts · Unknown · Introduced 2019-12-02
BLiMP is a challenge set for evaluating what language models (LMs) know about major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each containing 1000 minimal pairs isolating specific contrasts in syntax, morphology, or semantics. The data is automatically generated according to expert-crafted grammars. Aggregate human agreement with the labels is 96.4%.
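As a rough illustration of how a minimal-pair benchmark like this is typically scored, the sketch below counts a pair as correct when a model scores the acceptable sentence strictly higher than the unacceptable one. Everything in it is an assumption made for illustration: the field names `sentence_good` and `sentence_bad`, the helper `minimal_pair_accuracy`, and the `toy_score` stand-in (a real evaluation would use an LM's log-probability of each sentence). The two example pairs are invented and are not taken from the dataset.

```python
from typing import Callable, Iterable, Mapping


def minimal_pair_accuracy(
    pairs: Iterable[Mapping[str, str]],
    score: Callable[[str], float],
) -> float:
    """Fraction of pairs where the acceptable sentence scores strictly higher.

    `score` is any function mapping a sentence to a number where higher
    means "more acceptable" (for an LM, typically its log-probability).
    """
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one minimal pair")
    correct = sum(
        1 for p in pairs if score(p["sentence_good"]) > score(p["sentence_bad"])
    )
    return correct / len(pairs)


# Hypothetical stand-in for a language model: penalizes one hard-coded
# ungrammatical bigram. It exists only so the sketch runs end to end.
BAD_BIGRAMS = {("cats", "sleeps")}


def toy_score(sentence: str) -> float:
    tokens = sentence.lower().rstrip(".").split()
    penalty = sum(1 for bigram in zip(tokens, tokens[1:]) if bigram in BAD_BIGRAMS)
    return float(-penalty)


# Invented example pairs (not drawn from BLiMP), each differing by one edit.
toy_pairs = [
    {"sentence_good": "The cats sleep.", "sentence_bad": "The cats sleeps."},
    {"sentence_good": "The dog barks.", "sentence_bad": "The dog bark."},
]

accuracy = minimal_pair_accuracy(toy_pairs, toy_score)
print(f"accuracy on toy pairs: {accuracy:.2f}")
```

The toy scorer cannot tell the second pair apart, so that pair counts as incorrect: ties are treated as failures here, which is one possible design choice rather than a documented rule. With 67 sub-datasets of 1000 pairs each, the same function could be applied per sub-dataset and the results averaged.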
Source: BLiMP
Image Source: https://arxiv.org/pdf/1912.00582v3.pdf