Language Modelling — LT World


Language Modelling


Statistical Methods for Speech Recognition. Frederick Jelinek.
MIT Press. Cambridge, MA. 1997.

Foundations of Statistical Natural Language Processing.
Christopher D. Manning and Hinrich Schütze.
MIT Press. C



http://www.lt-world.org/hlt_survey/ltw-chapter1-5.pdf



A statistical language model predicts a word given a sequence of already known words (i.e. the history). It can also be applied to other sequences of symbols (e.g. DNA). Very often the history contains just the previous two words; such a model is called a trigram model. The parameters of statistical language models are estimated from a set of training examples. Data sparsity (most possible word sequences never occur in the training data) and the resulting need to smooth the estimates are among the core problems. The best smoothing technique known so far is Kneser-Ney smoothing. Maximum-entropy techniques are also under investigation and may be the method of choice for long-range language models (beyond trigrams). Language models are used in text compression, speech recognition, information retrieval and information extraction.
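
As an illustration of the trigram idea, the following minimal Python sketch estimates P(word | previous two words) from counts in a toy corpus. It uses add-k smoothing, which is a deliberately simple stand-in and not Kneser-Ney smoothing. The function names, the boundary markers `<s>` and `</s>`, and the three-sentence corpus are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical sentence-boundary markers, chosen for this sketch only.
BOS, EOS = "<s>", "</s>"


def train_trigram_counts(sentences):
    """Count trigrams and their two-word histories in tokenised sentences."""
    trigram_counts = Counter()
    history_counts = Counter()
    vocab = {EOS}  # every symbol the model may predict
    for tokens in sentences:
        padded = [BOS, BOS] + list(tokens) + [EOS]
        vocab.update(tokens)
        for i in range(2, len(padded)):
            history = (padded[i - 2], padded[i - 1])
            trigram_counts[history + (padded[i],)] += 1
            history_counts[history] += 1
    return trigram_counts, history_counts, vocab


def trigram_prob(word, history, trigram_counts, history_counts, vocab, k=1.0):
    """Add-k smoothed estimate of P(word | history).

    Unseen trigrams get a small non-zero probability instead of zero.
    """
    history = tuple(history)
    numerator = trigram_counts[history + (word,)] + k
    denominator = history_counts[history] + k * len(vocab)
    return numerator / denominator


# Toy training data (an assumption, not taken from any real corpus).
corpus = [
    ["the", "cat", "sat"],
    ["the", "cat", "slept"],
    ["the", "dog", "sat"],
]
tri, hist, vocab = train_trigram_counts(corpus)

p_seen = trigram_prob("sat", ("the", "cat"), tri, hist, vocab)    # seen trigram
p_unseen = trigram_prob("dog", ("the", "cat"), tri, hist, vocab)  # unseen trigram
print(p_seen, p_unseen)
```

The seen continuation "sat" receives a higher probability than the unseen "dog", yet "dog" is not assigned zero; that redistribution of probability mass is what smoothing provides. More refined techniques such as Kneser-Ney smoothing redistribute the mass in a better-informed way than the uniform add-k scheme shown here.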


LM; SLM

Statistical Language Modeling; Language Modeling; Statistical Language Modelling