provided by
German Research Center for Artificial Intelligence

Acoustic Modelling in Speech Recognition

definition: Modelling of the basic recognition units in the microphone signal. These units are often phones, especially in large-vocabulary systems, while small-vocabulary systems sometimes use larger units such as whole words. The acoustic signal is not used directly but is represented by spectral parameters derived from it. Commonly used parameters are mel-frequency cepstral coefficients (MFCCs) and RASTA-PLP coefficients (noise-robust perceptual linear prediction parameters); many other parameter types, including ones based on auditory processing or phonetic features, are also used. The models in most state-of-the-art systems are hidden Markov models (HMMs), although dynamic time warping and neural networks are also used for acoustic modelling, the latter also in combination with HMMs. In a limited number of systems the acoustic modelling is knowledge-based rather than stochastic.
See also the corresponding HLT Survey chapter: http://www.lt-world.org/hlt_survey/ltw-chapter1-6.pdf
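
The HMM-based modelling mentioned in the definition can be illustrated with a minimal Python sketch. It scores a short sequence of feature values against two toy left-to-right phone models using the forward algorithm and picks the better-scoring one. Everything in it is an assumption made for illustration: the model names (phone_a, phone_b), the transition probabilities, the Gaussian means and variances, and the use of one-dimensional features in place of real MFCC or RASTA-PLP vectors. Real systems typically use multivariate mixture densities or neural networks for the emission scores.

```python
import math


def ln(p):
    """Natural log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else -math.inf


def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian (stand-in for a real emission model)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of log values."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log_likelihood(frames, log_init, log_trans, means, variances):
    """Forward algorithm: log P(frames | HMM) with Gaussian emissions."""
    n = len(log_init)
    # Initialisation: start probability times emission of the first frame.
    alpha = [log_init[s] + log_gauss(frames[0], means[s], variances[s])
             for s in range(n)]
    # Recursion: sum over all predecessor states, then emit the next frame.
    for x in frames[1:]:
        alpha = [
            logsumexp([alpha[p] + log_trans[p][s] for p in range(n)])
            + log_gauss(x, means[s], variances[s])
            for s in range(n)
        ]
    # Termination: sum over all states the model may end in.
    return logsumexp(alpha)


# Hypothetical 3-state left-to-right topology shared by both toy models.
LOG_INIT = [ln(1.0), ln(0.0), ln(0.0)]
LOG_TRANS = [
    [ln(0.6), ln(0.4), ln(0.0)],
    [ln(0.0), ln(0.6), ln(0.4)],
    [ln(0.0), ln(0.0), ln(1.0)],
]

# Hypothetical per-state (means, variances) for two phone models.
MODELS = {
    "phone_a": ([1.0, 2.0, 3.0], [1.0, 1.0, 1.0]),
    "phone_b": ([-1.0, -2.0, -3.0], [1.0, 1.0, 1.0]),
}

# Made-up 1-D "feature" sequence standing in for spectral parameter frames.
frames = [0.9, 1.1, 2.0, 2.1, 3.2]

scores = {
    name: forward_log_likelihood(frames, LOG_INIT, LOG_TRANS, means, variances)
    for name, (means, variances) in MODELS.items()
}
best = max(scores, key=scores.get)
print(best, scores)
```

The computation is done in the log domain because products of many small probabilities underflow quickly on longer utterances. A recogniser would combine such acoustic scores with a lexicon and a language model rather than compare isolated phone models as done here.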
related project(s):
  • The CMU Sphinx Group Open Source Speech Recognition Engines (CMU Sphinx)
related person(s):
  • Tanja Schultz
  • Henning Reetz
  • Jacques Koreman
  • Steven Greenberg
  • Alex Waibel