LT World

Multilingual Corpora


A Guide to ParaConc.
M. Barlow. Athelstan. Houston. 1995.

Dimensions of Register Variation: A Cross-linguistic Comparison.
D. Biber. Cambridge University Press. Cambridge. 1995.

Corpus Linguistics: Investigating Language Structure and Use.
D. Biber, S. Conrad, and R. Reppen. Cambridge University Press. Cambridge. 1998.




  • Knut Hofland
  • Stella Neumann
  • Stig Johansson
  • Michael Barlow
  • Wolfgang Teubert
  • Serge Sharoff
  • Douglas Biber
  • Silvia Hansen-Schirra
  • Silvia Bernardini
  • Josef Schmied
  • Martin Wynne

  • Multext-East
  • Parallel Corpora in Linköping, Uppsala, and Göteborg (PLUG)
  • Corpus Resources and Terminology Extraction (CRATER)
  • EMILLE
  • International Sample of English Contrastive Texts (INTERSECT)
  • Multilingual Text Tools and Corpora (Multext)

  • ET10-63 Parallel Corpus
  • TDT2 Multilanguage Text Corpus
  • Orwell's 1984 parallel English-Romanian Text
  • NATO Multilingual (Fr-De-En) Corpus
  • Multilingual translation corpus
  • JOC-CES Multilingual (En-De-Fr-It-Sp) Corpus
  • Swiss Government Multilingual (Fr-De-It) Corpus
  • Oslo Multilingual Corpus (OMC)
  • The Canadian Hansard Corpus
  • Intellectual Property and Copyright Multilingual (Fr-En) Corpus
  • English-Norwegian Parallel Corpus (ENPC)
  • CHILDES
  • BAF French - English Parallel Corpus
  • European Corpus Initiative Multilingual (ECI/MCI 1) Corpus
  • An Environment for Managing Corpus and Multilingual Web Server (XCorpus)
  • Chemnitz Internet Grammar
  • European Language Newspaper Text
  • Multilingual corpora for cooperation (MLCC)
  • ParaConc
  • Xkwic/CQP (IMS Corpus Workbench)
  • Bundesregierung Multilingual (Fr-De-En) Corpus
  • TELRI multilingual Plato corpus
  • ITU or CRATER Parallel (Sp-Fr-En) Corpus
  • European Free Trade Organization Multilingual (De-En) Corpus
  • English Turkish Aligned Parallel Corpora
  • Bible of University of Maryland Parallel Corpus
  • Concordancing and parallel corpora (ParaConc)

Any collection of more than one text in more than one language can be called a multilingual corpus (corpus is Latin for "body", so a multilingual corpus is any body of multilingual texts). In modern linguistics, however, the term "multilingual corpus" means a machine-readable collection of multilingual texts that is representative of the language use under investigation.