Papers With Code 2 | ML Benchmarks, SotA Results & Code

The Sentences Involving Compositional Knowledge (SICK) dataset is a benchmark for compositional distributional semantics. It contains a large number of sentence pairs that are rich in lexical, syntactic, and semantic phenomena. Each pair is annotated along two dimensions: relatedness and entailment. The relatedness score ranges from 1 to 5 and is evaluated with Pearson’s r; the entailment relation is categorical, with three labels: entailment, contradiction, and neutral. There are 4439 pairs in the train split, 495 in the trial split (used for development), and 4906 in the test split. The sentences are drawn from image and video caption datasets and then paired up algorithmically.

Source: Multi-Label Transfer Learning for Multi-Relational Semantic Similarity
Image Source: https://www.researchgate.net/figure/Example-of-SICK-dataset-sentence-expansion-process-14_fig1_344863619
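The two evaluation measures described above can be sketched in a few lines of Python. This is a minimal illustration, not an official SICK scoring script: the function names `pearson_r` and `entailment_accuracy` are hypothetical, the toy scores and labels are made up for the example, and using plain accuracy for the entailment task is an assumption (the description above only names Pearson’s r for relatedness).

```python
import math


def pearson_r(gold, predicted):
    """Pearson correlation between gold and predicted relatedness scores."""
    if len(gold) != len(predicted) or len(gold) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    n = len(gold)
    mean_g = sum(gold) / n
    mean_p = sum(predicted) / n
    # Covariance and variances (unnormalised; the 1/n factors cancel in the ratio).
    cov = sum((g - mean_g) * (p - mean_p) for g, p in zip(gold, predicted))
    var_g = sum((g - mean_g) ** 2 for g in gold)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_g == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_g * var_p)


def entailment_accuracy(gold_labels, predicted_labels):
    """Fraction of pairs whose predicted entailment label matches the gold label."""
    if len(gold_labels) != len(predicted_labels) or not gold_labels:
        raise ValueError("need two equal-length, non-empty label sequences")
    correct = sum(g == p for g, p in zip(gold_labels, predicted_labels))
    return correct / len(gold_labels)


# Toy data (invented for illustration; relatedness scores lie in [1, 5]).
gold_scores = [1.0, 2.5, 3.0, 4.2, 5.0]
model_scores = [1.3, 2.0, 3.4, 4.0, 4.8]
gold_labels = ["ENTAILMENT", "NEUTRAL", "CONTRADICTION", "NEUTRAL"]
model_labels = ["ENTAILMENT", "NEUTRAL", "NEUTRAL", "NEUTRAL"]

print(f"Pearson r: {pearson_r(gold_scores, model_scores):.3f}")
print(f"Entailment accuracy: {entailment_accuracy(gold_labels, model_labels):.2f}")
```

In practice, `scipy.stats.pearsonr` computes the same correlation; the hand-written version is shown only to keep the sketch dependency-free.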

SICK

Benchmarks

Related Benchmarks
