LT World


English Parser Evaluation Corpus

  • English
  • Monolingual
  • Syntax
  • syntactic dependencies, POS
  • POS-tagged Text Corpus

A parser evaluation corpus of English, based on a grammatical relation annotation scheme, is now available. It consists of 500 sentences (around 10,000 words) extracted at random from the SUSANNE corpus.


There are four files: the tokenised raw text, the lemmatised and numbered sentences, the grammatical relation annotation, and software for automatically evaluating parser output. An up-to-date specification of the annotation scheme is also online. (Please note that this specification refers to the latest version of the annotated corpus and supersedes the one in the publications listed below.) The corpus is free for research purposes; for any proposed commercial use, please contact John Carroll.
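As a rough illustration of what evaluating parser output against a grammatical relation annotation involves, the sketch below scores a parser's relations against gold-standard relations using precision, recall and F-score. It is not the evaluation software distributed with the corpus: the function name `score_relations`, the tuple representation of a relation, and the example relations are all assumptions made for illustration, and the actual file format is defined by the online specification.

```python
from collections import Counter
from typing import Iterable, Tuple

# Hypothetical representation: a relation is a tuple such as
# (relation_type, head, dependent). The real corpus format may differ.
Relation = Tuple[str, ...]


def score_relations(
    gold: Iterable[Relation], predicted: Iterable[Relation]
) -> Tuple[float, float, float]:
    """Return (precision, recall, f_score) for one sentence or a whole corpus."""
    gold_counts = Counter(gold)
    pred_counts = Counter(predicted)
    # Multiset intersection: a predicted relation counts as correct only
    # as many times as it occurs in the gold standard.
    matched = sum((gold_counts & pred_counts).values())
    n_pred = sum(pred_counts.values())
    n_gold = sum(gold_counts.values())
    precision = matched / n_pred if n_pred else 0.0
    recall = matched / n_gold if n_gold else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


# Illustrative data only, not taken from the corpus.
gold = [("ncsubj", "sleep", "cat"), ("detmod", "cat", "the")]
predicted = [("ncsubj", "sleep", "cat"), ("dobj", "sleep", "cat")]
precision, recall, f_score = score_relations(gold, predicted)
```

Here one of the two predicted relations matches the gold standard, so precision, recall and F-score are each 0.5. Exact-match scoring is the simplest choice; a scheme with a hierarchy of relation types could also give credit for partially correct relations, which this sketch does not attempt.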

Descriptions of the grammatical relation annotation scheme are published in

Carroll, J., G. Minnen and E. Briscoe (in press) `Parser evaluation using a grammatical relation annotation scheme'. In A. Abeillé (ed.), Treebanks: Building and Using Syntactically Annotated Corpora, Dordrecht: Kluwer.

Carroll, J., G. Minnen and E. Briscoe (1999) `Corpus annotation for parser evaluation'. In Proceedings of the EACL-99 Post-Conference Workshop on Linguistically Interpreted Corpora, Bergen, Norway. 35-41. Also in Proceedings of the ATALA Workshop on Corpus Annotés pour la Syntaxe - Treebanks, Paris, France. 13-20.

Carroll, J., E. Briscoe and A. Sanfilippo (1998) `Parser evaluation: a survey and a new proposal'. In Proceedings of the 1st International Conference on Language Resources and Evaluation, Granada, Spain. 447-454.