LT World

The Prague Dependency Treebank (PDT)


http://ufal.mff.cuni.cz/pdt2.0/doc/pdt-guide/en/html/ch05.html

7,110 manually annotated textual documents, containing altogether 115,844 sentences with 1,957,247 tokens

  • Czech

  • Monolingual

  • Syntax
  • Semantics
  • Morphology

POS

morphosyntactically, semantically

syntactic dependencies, POS

  • Charles University in Prague

The Prague Dependency Treebank 2.0 (PDT 2.0) contains a large amount of Czech text with complex, interlinked annotation: morphological (2 million words), syntactic (1.5 million words), and complex semantic annotation (0.8 million words). In addition, certain properties of sentence information structure and coreference relations are annotated at the semantic level. PDT 2.0 is based on the long-standing Praguian linguistic tradition, adapted to the needs of current computational linguistics research. The corpus itself uses the latest annotation technology. Software tools for corpus search, annotation, and language analysis are included, and extensive documentation (in English) is provided as well.


http://ufal.mff.cuni.cz/pdt2.0/

  • Charles University in Prague

2.0