Last updated: July 25, 2003
Welcome to Language Technology World

LT World is the most comprehensive WWW information service and knowledge source on the wide range of technologies that deal with human language. The service is provided by the National Language Technology Competence Center at DFKI. The content is continuously being improved. Please send corrections and pointers to missing information to feedback@lt-world.org.


LT WORLD Breaking News (see also the News Archive and the news on HLT-Central)
 

New LT Resources

Version 0.2 of Open Source Parallel Corpus: OPUS

Version 0.2 of the corpus contains roughly 30 million tokens in 60 languages. OPUS is sentence-aligned (1830 language pairs), tokenized, and partly tagged. [See also http://listserv.linguistlist.org/cgi-bin/wa?A2=ind0307&L=corpora&D=1&F=&S=&P=3175]

First Release of the TIGER Treebank - Version 1

The first release of the TIGER Treebank (Version 1) consists of approximately 700,000 tokens (40,000 sentences) of German newspaper text, taken from the Frankfurter Rundschau. The corpus was semi-automatically annotated with syntactic structures. [For details see http://www.ims.uni-stuttgart.de/projekte/TIGER/]

LDC to Announce the Availability of two New Publications

The 2001 Communicator Evaluation is the second publication to result from the Communicator program. [...] All audio files have been converted into SPHERE format; there are 53,394 SPHERE files, totalling approximately 102 hours of audio. [For more see http://listserv.linguistlist.org/cgi-bin/wa?A2=ind0307&L=corpora&D=1&F=&S=&P=430]

ELRA to announce four new Language Resources

Now available: Spanish Speech Corpus 1 and Spanish TTS Corpus, as well as Italian Speech Corpus 1 and Italian TTS Corpus. [For more see http://listserv.linguistlist.org/cgi-bin/wa?A2=ind0307&L=corpora&D=1&F=&S=&P=4933]

Short News

What else ...


Microsoft to release first public Beta of its Speech Server.


Selection for E.W. Beth Dissertation Prize for the Year 2003 concluded.

LREC 2004 in Lisbon, Portugal, 26-28 May.

COLING 2004 in Geneva, Switzerland, August 23rd-27th.

SRI gets $22 million of the DARPA funded PAL program and takes responsibility for overall project direction.

CMU Computer Science received initial $7 million from DARPA to develop Reflective Agents within the PAL program.

Speech Recognition Software to save Millions in Banking Industry.

Revenue Record for SpeechWorks in the Second Quarter.

Leicestershire Police to select ScanSoft's OmniPage Pro 12 Office to improve Search for Case-Related Information.

TiVo to implement TuVox Conversational Voice Response Solution.

Maxxar Corporation to announce Speaker Verification Component on its Speech Recognition Platform.


Successful Mission Impossible for Coders: A Test of the Computer Science Community's Ability to create Translation Tools quickly.

Web Access for the Blind via Phone by InternetSpeech and NFB Team.

IBM and Xybernaut to deploy Mobile Computers for Children with Special Needs.

Gartner Dataquest to identify Semantic Web as one of four new technologies to push Web Services.

Wizzard Software to offer Transcription Solutions with IBM ViaVoice.

Decrypting the Sense of Instant Messages.

New Speech-Technology Applications to affect Small-Office and Home-Office Market.

L&H Founders to pay a $539 million Verdict in Dictaphone Case.

InBoxer Software from Audiotrieve prevents Spam Overload in MS Outlook.

Mandy Pet, recently affiliated with Oracle, now to run TRADOS Government Solutions Group.


And next we pass a law, replacing spam with jamicé: jambon épicé!

Goodbye "e-mail," the French government says, and hello "courriel" — the term that linguistically sensitive France is now using to refer to electronic mail in official documents.

The Culture Ministry has announced a ban on the use of "e-mail" in all government ministries, documents, publications or websites, the latest step to stem an incursion of English words into the French lexicon.

The ministry's General Commission on Terminology and Neology insists Internet surfers in France are broadly using the term "courrier electronique" (electronic mail) instead of e-mail — a claim some industry experts dispute. "Courriel" is a fusion of the two words.

"Evocative, with a very French sound, the word 'courriel' is broadly used in the press and competes advantageously with the borrowed 'mail' in English," the commission said.

[Read more: http://www.wired.com/news/culture/0,1284,59674,00.html]