Last updated: July 25, 2003
Welcome to Language Technology World
LT World is the most comprehensive WWW information service and knowledge source on the wide range of technologies that deal with human language. The service is provided by the National Language Technology Competence Center at DFKI. Contents will constantly be improved. Please send corrections and pointers to missing information to firstname.lastname@example.org.
LT WORLD Breaking
also news on HLT-Central
New LT Resources
Version 0.2 of Open Source Parallel Corpus: OPUS
Version 0.2 of the corpus contains roughly 30 million tokens in
60 languages. OPUS is sentence aligned (1830 language pairs),
tokenized, and partly tagged. [See also
First Release of the TIGER Treebank - Version 1
The first release of the TIGER Treebank (Version 1) consists
of approximately 700,000 tokens (40,000 sentences) of German newspaper
text, taken from the Frankfurter Rundschau. The corpus was semi-automatically
tagged with syntactic structures. [For details see
LDC to Announce the Availability of two New Publications
The 2001 Communicator Evaluation is the second
publication to result
from the Communicator program. [...] All audio files have
been converted into SPHERE format; there are 53,394 SPHERE files,
totalling approximately 102 hours of audio. [For more
ELRA to announce four new Language Resources
Available now: Spanish Speech Corpus 1 and Spanish
TTS Corpus, Italian Speech Corpus 1 and Italian TTS Corpus.
WHAT ELSE ...
Goodbye "e-mail," the French government says, and hello "courriel," the term that linguistically sensitive France is now using to refer to electronic mail in official documents.
The Culture Ministry has announced a ban on the use of "e-mail" in all government ministries, documents, publications or websites, the latest step to stem an incursion of English words into the French lexicon.
The ministry's General Commission on Terminology and Neology insists Internet surfers in France are broadly using the term "courrier electronique" (electronic mail) instead of e-mail, a claim some industry experts dispute. "Courriel" is a fusion of the two words.
"Evocative, with a very French sound, the word 'courriel' is broadly used in the press and competes advantageously with the borrowed 'mail' in English," the commission said.