PCC

Potsdam Commentary Corpus

Creative Commons Attribution-NonCommercial-ShareAlike

The Potsdam Commentary Corpus (PCC) is a corpus of 220 German newspaper commentaries (2,900 sentences, 44,000 tokens) taken from the online issues of the Märkische Allgemeine Zeitung (MAZ subcorpus) and Tagesspiegel (ProCon subcorpus). It is annotated with several types of linguistic information.

The central subcorpus that we are making publicly available consists of 176 MAZ texts, which are annotated with:

  • Sentence Syntax
  • Coreference
  • Discourse Structure (RST & PDTB)
  • Aboutness Topics