Last updated: January 24th, 2008

Welcome to Language Technology World

LT World is the most comprehensive WWW information service and knowledge source on the wide range of technologies that deal with human language. The service is provided by the German Language Technology Competence Center at DFKI. Contents will constantly be improved. Please send corrections and pointers to missing information to

European Commission to support Machine Translation with one million Sentences

"The EC intends to boost human language technologies, support multilingualism and make computer-assisted translation easier, cheaper and more accessible."

The European Commission is going a step further in its efforts to foster multilingualism as a key part of European unity in diversity. The Commission's collection of about 1 million sentences and their high-quality translations in 22 of the 23 official EU languages, including those of the new Member States, is the biggest ever collection in so many languages and is now freely available. This kind of data is highly sought after by developers of machine translation systems, in which automatic translation software "learns" from manually translated texts how words and phrases are correctly translated in context. The data can also help the development of other linguistic software tools such as grammar and spell checkers, online dictionaries and multilingual text classification systems.

Leonard Orban, Commissioner for Multilingualism, says: "By this initiative the European Commission intends to boost human language technologies, support multilingualism and make computer-assisted translation easier, cheaper and more accessible. Citizens belonging to the smaller linguistic communities will have an easier access to documents and web pages only available in the most used languages."

The EU institutions have more multilingual texts than any other organisation because of the requirement that EU law exist in each of the 23 official languages. Their translation services work with 253 possible language pairs and produce around 1.5 million translated pages a year.
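The figure of 253 is consistent with counting unordered pairs among 23 languages, that is, without distinguishing translation direction. A minimal sketch of that arithmetic follows; the function name is illustrative and the integer IDs are placeholders for languages, not taken from the Commission's material.

```python
from itertools import combinations


def count_language_pairs(n_languages: int) -> int:
    """Number of unordered pairs among n_languages: n * (n - 1) / 2."""
    return n_languages * (n_languages - 1) // 2


# Cross-check the closed form by enumerating pairs of placeholder language IDs.
enumerated = sum(1 for _ in combinations(range(23), 2))

print(count_language_pairs(23))  # 253
print(enumerated)                # 253
```

If translation direction were counted separately, the same 23 languages would give twice as many combinations.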

Translation Memory site:



LT NEWS
Powerset's 2008 promise: A Wikipedia search engine
Wiki citizens taking on a new area: Searching
The Google challengers: 2008 edition.
Business and website globalization, technology and business model predictions for 2008 released by Common Sense Advisory.
Makayama rolls out Voice Dial for iPhone.
Dragon Speech Recognition comes to the Mac.
Smart system boosts speed of subtitling.
The "Google generation" and emerging web behavior.
Linguamatics and 81qd sign multi-year collaboration in Knowledge Discovery for product life cycle management.
living-e AG and Xtramind joined forces to become a leading Web 3.0 provider.
EU data regulator says Internet addresses are personal information.
Sony BMG to start selling music downloads without copy protection in North America.
W3C releases recommendation to access data on the web with SPARQL.
Google: Waiting to Knol

Online Love Seekers warned of Flirt-Bots

Fully automated flirtatious conversations to collect personal data

Online security firm PC Tools has warned of a new software program developed in Russia, which flirts with people seeking relationships online in order to collect their personal data.

The software, dubbed CyberLover, is reportedly able to conduct fully automated flirtatious conversations with users of chat rooms and dating sites, luring them into risky actions such as revealing their identity or visiting websites with malicious content.

According to its creators, CyberLover can establish a new relationship with up to 10 partners in just 30 minutes and its victims cannot distinguish it from a human being.

According to PC Tools, the CyberLover software can operate with several profiles, ranging from 'romantic lover' to 'sexual predator', and is designed to recognise the responses of chat-room users and tailor its interaction accordingly.
