Provided by the German Research Center for Artificial Intelligence.

Written Language Corpora

definition: Any collection of more than one text can be called a corpus (corpus is Latin for "body", so a corpus is any body of text). In modern linguistics, however, the term "corpus" denotes a machine-readable text collection that is representative of the language use under investigation.
See also the corresponding HLT Survey chapter: http://www.lt-world.org/hlt_survey/ltw-chapter12-2.pdf
related person(s):
  • Michael Stubbs
  • Adam Kilgarriff
  • Geoffrey Sampson
  • Andrew Wilson
  • Mike Scott
  • Anke Lüdeling
  • David Lee
  • Wolfgang Teubert
  • John Sinclair
  • Michael Oakes
  • Tony McEnery
  • Oliver Mason
  • Christopher Manning
related system(s) / resource(s):
  • Brown Corpus
  • WordSmith Tools
  • Freiburg-LOB (FLOB) Corpus
  • European Corpus Initiative Multilingual (ECI/MCI 1) Corpus
  • MonoConc
  • SARA
  • Xkwic/CQP (IMS Corpus Workbench)
  • London-Lund Corpus (LLC)
  • American National Corpus (ANC)
  • Global English Monitor Corpus
  • SUSANNE Corpus
  • Lancaster-Oslo/Bergen (LOB) Corpus
  • Bank of English (COBUILD Corpus)
  • Freiburg-Brown (FROWN) Corpus
related publication(s):

Corpus Linguistics: Investigating Language Structure and Use.
D. Biber, S. Conrad and R. Reppen.
Cambridge University Press. Cambridge, 1998.

Corpus Linguistics.
T. McEnery and A. Wilson.
Edinburgh University Press. Edinburgh, 2001.