LT World


Penn Discourse Treebank (PDTB)

  • English

  • Monolingual

The Penn Discourse Treebank (PDTB) focuses on encoding coherence relations associated with discourse connectives.

  • Penn Discourse Treebank (PDTB)
  • Driya Amandita
  • Katherine M. Forbes-Riley
  • Martha Palmer
  • Laura Whitton
  • Rashmi Prasad
  • Aravind K. Joshi
  • Alan Lee
  • Cassandre Creswell
  • Mitch Marcus
  • Emily Pawley
  • Jeremy Lacivita
  • Tom Morton
  • Ellen F. Prince
  • Bonnie Lynn Webber
  • Jason Teeple
  • Alex Channer
  • Eleni Miltsakaki
  • John Laury
  • Steven Pettingill
  • University of Pennsylvania

  • Written

The goal of the PDTB project is to develop a large-scale corpus annotated with information related to discourse structure. While many aspects of discourse are crucial to a complete understanding of natural language, the Penn Discourse Treebank (PDTB) focuses on encoding coherence relations associated with discourse connectives.


The annotations include the argument structure of the connectives, exposing a clearly defined level of discourse structure that supports the extraction of a range of inferences associated with discourse connectives. Other annotated features include sense distinctions for discourse connectives and attribution-related features for both connectives and their arguments. The annotations in the PDTB are linked to the Penn Treebank.
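As a rough illustration of the annotation layers described above, the following Python sketch models one connective with its two arguments, a sense label, attribution, and a link back to the Penn Treebank. It is not the PDTB file format or any official API: every class, field, and function name is hypothetical, the example sentence is invented, and the sense label is only illustrative.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical representation: (start, end) character offsets into the raw text.
Span = Tuple[int, int]


@dataclass
class Argument:
    """One argument of a discourse connective (hypothetical structure)."""
    text: str
    span: Span
    attribution_source: Optional[str] = None  # assumed attribution feature


@dataclass
class ConnectiveRelation:
    """A discourse connective with its two arguments and extra features."""
    connective: str
    arg1: Argument
    arg2: Argument
    senses: List[str] = field(default_factory=list)
    attribution_source: Optional[str] = None
    # Assumed link to Penn Treebank sentences, stored here as sentence indices.
    ptb_sentence_ids: List[int] = field(default_factory=list)


def make_argument(text: str, fragment: str) -> Argument:
    """Locate the first occurrence of `fragment` in `text` and record its span."""
    start = text.index(fragment)
    return Argument(text=fragment, span=(start, start + len(fragment)))


def relations_with_sense(relations: List[ConnectiveRelation],
                         sense_prefix: str) -> List[ConnectiveRelation]:
    """Return the relations that carry at least one sense starting with `sense_prefix`."""
    return [r for r in relations
            if any(s.startswith(sense_prefix) for s in r.senses)]


# Invented example sentence, not taken from the corpus.
text = "The flight was cancelled because the storm closed the airport."
relation = ConnectiveRelation(
    connective="because",
    arg1=make_argument(text, "The flight was cancelled"),
    arg2=make_argument(text, "the storm closed the airport"),
    senses=["Contingency.Cause"],   # illustrative sense label
    attribution_source="Writer",    # illustrative attribution value
    ptb_sentence_ids=[0],           # illustrative Treebank link
)
causal = relations_with_sense([relation], "Contingency")
```

Storing arguments as character spans rather than copied strings is one possible design choice; it keeps the discourse layer separable from the underlying text and from the syntactic annotation it is linked to.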


The PDTB is intended to extend the use of large-scale resources such as the Penn Treebank (PTB) to a wide range of applications, including parsing, information extraction, question answering, summarization, machine translation, and generation systems, as well as corpus-based studies in linguistics and psycholinguistics. Because the PDTB will provide a substantial level of discourse structure information, the PDTB and the PTB together are expected to substantially improve the quality and coverage achieved in these applications.

  • Web