SV-Ident

Survey Variable Identification

Texts | License: https://www.gesis.org/en/ssoar/home/information/grant-of-licences | Introduced 2022-09-19

SV-Ident comprises 4,248 sentences from social science publications in English and German. It is the official dataset of the 2022 shared task “Survey Variable Identification in Social Science Publications” (SV-Ident). Each sentence is labeled with the variables it mentions, whether explicitly or implicitly.

The dataset supports the following tasks:

  • Variable Detection: identifying whether a sentence contains a variable mention or not.

  • Variable Disambiguation: identifying which variable from a given vocabulary is mentioned in a sentence.
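
The two tasks can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the `LabeledSentence` record, its field names, the toy vocabulary, and the token-overlap ranking do not reflect the dataset's actual schema or the shared task's baseline. The sketch only shows how a detection label follows from the variable annotations, and how disambiguation amounts to ranking vocabulary entries for a sentence.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class LabeledSentence:
    """Hypothetical record: a sentence plus the IDs of the variables it mentions."""
    text: str
    variable_ids: List[str] = field(default_factory=list)


def detection_label(sentence: LabeledSentence) -> bool:
    """Variable Detection target: True if at least one variable is annotated."""
    return len(sentence.variable_ids) > 0


def tokenize(text: str) -> Set[str]:
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    tokens = {tok.strip(".,;:!?()\"'").lower() for tok in text.split()}
    return tokens - {""}


def rank_variables(sentence_text: str, vocabulary: Dict[str, str]) -> List[str]:
    """Variable Disambiguation as ranking: order variable IDs by the Jaccard
    overlap between the sentence tokens and each variable description.
    This is a toy stand-in, not the shared task's method."""
    sent_tokens = tokenize(sentence_text)
    scores = {}
    for var_id, description in vocabulary.items():
        var_tokens = tokenize(description)
        union = sent_tokens | var_tokens
        scores[var_id] = len(sent_tokens & var_tokens) / len(union) if union else 0.0
    return sorted(vocabulary, key=lambda var_id: scores[var_id], reverse=True)


# Toy vocabulary and sentences (invented for this example).
vocabulary = {
    "v1": "Trust in the national parliament",
    "v2": "Respondent age in years",
}
with_variable = LabeledSentence(
    "Trust in parliament declined among younger respondents.", ["v1"]
)
without_variable = LabeledSentence("The paper is structured as follows.")

print(detection_label(with_variable), detection_label(without_variable))
print(rank_variables(with_variable.text, vocabulary))
```

A lexical overlap score like this would likely miss implicit mentions, which share few or no words with a variable's description; the sketch is meant only to make the input and output of each task concrete.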