LT World


European Corpus Initiative Multilingual Corpus I (ECI/MCI)


  • English
  • French
  • Dutch
  • Spanish
  • German

morphosyntactically

syntactic dependencies, POS

  • European Corpus Initiative (ECI)

  • POS-tagged Text Corpus

  • Written

The European Corpus Initiative (ECI) was founded to oversee the acquisition and preparation of a large multilingual corpus (ECI/MCI), to be made available in digital form for scientific research at as low a cost as possible. The corpus has been available on CD-ROM since 1994 and is distributed by ELSNET.

Contents

Below is a sample of the contents of the CD-ROM. A complete listing of the contents is also available. See also the READ-ME file on the CD-ROM.

  • German newspaper texts from the Frankfurter Rundschau, July 1992 to March 1993. Provided by Universität Gesamthochschule, Paderborn, Germany. Approximately 34 million words.
  • French newspaper texts from Le Monde, consisting of material from September 1989, October 1989, and January 1990. Provided by LIMSI CNRS, France. Approximately 4.1 million words.
  • Extracts from the Leiden Corpus of Dutch, consisting of newspapers, transcribed speech, etc. Provided by Instituut voor Nederlandse Lexicologie, Leiden, Holland. Approximately 5.5 million words.
  • International Labour Organization (ILO) "Official Bulletin, B Series", Vols. LXVII (1984) to LXXII (1989). Parallel texts in English, French and Spanish. Provided by the International Labour Organization. Approximately 5 million words.

http://www.elsnet.org/resources/eciCorpus.html

  • European Network of Excellence in Human Language Technologies (ELSNET)