definition: In acoustic phonetics, the speech signal is represented as a waveform (amplitude as a function of time). Frequency analysis of short, successive segments of the waveform (e.g., using an FFT) yields a spectrogram (the distribution of energy across frequencies over time). Automatic speech processing (e.g., recognition, synthesis) requires further derived and discretised representations, e.g. mel-frequency cepstral coefficients (see also DSP Techniques).
See also the corresponding HLT Survey chapter: http://www.lt-world.org/hlt_survey/ltw-chapter1-3.pdf
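
The chain from waveform to spectrogram to cepstral coefficients described in the definition can be illustrated with a minimal sketch. It is not taken from the HLT Survey; it assumes NumPy, a synthetic sine tone in place of real speech, and illustrative parameter choices (8 kHz sampling rate, 256-sample Hann-windowed frames, 20 mel filters, 13 coefficients). The function names are hypothetical, and production systems typically add steps such as pre-emphasis and liftering that are omitted here.

```python
import numpy as np


def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram via a short-time FFT, shape (frames, bins)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop: i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))


def mel_filterbank(n_filters, frame_len, sample_rate):
    """Triangular filters with centres spaced evenly on the mel scale."""
    def hz_to_mel(f):
        return 2595.0 * np.log10(1.0 + f / 700.0)

    def mel_to_hz(m):
        return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

    mel_points = np.linspace(0.0, hz_to_mel(sample_rate / 2), n_filters + 2)
    bins = np.floor((frame_len + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    bank = np.zeros((n_filters, frame_len // 2 + 1))
    for m in range(1, n_filters + 1):
        left, centre, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, centre):    # rising slope of the triangle
            bank[m - 1, k] = (k - left) / (centre - left)
        for k in range(centre, right):   # falling slope of the triangle
            bank[m - 1, k] = (right - k) / (right - centre)
    return bank


def mfcc(signal, sample_rate, frame_len=256, hop=128, n_filters=20, n_coeffs=13):
    """Cepstral coefficients: log mel-band energies followed by a DCT-II."""
    power = spectrogram(signal, frame_len, hop) ** 2
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    log_mel = np.log(power @ bank.T + 1e-10)  # small offset avoids log(0)
    # Unnormalised DCT-II basis, shape (n_coeffs, n_filters)
    n = np.arange(n_filters)
    basis = np.cos(np.pi / n_filters * (n[None, :] + 0.5) * np.arange(n_coeffs)[:, None])
    return log_mel @ basis.T


if __name__ == "__main__":
    sample_rate = 8000
    t = np.arange(sample_rate) / sample_rate      # one second of samples
    waveform = np.sin(2 * np.pi * 1000.0 * t)     # 1 kHz tone, stand-in for speech
    spec = spectrogram(waveform)
    peak_bin = int(np.argmax(spec.mean(axis=0)))
    print("spectrogram shape:", spec.shape)
    print("peak frequency (Hz):", peak_bin * sample_rate / 256)
    print("MFCC shape:", mfcc(waveform, sample_rate).shape)
```

Each row of the spectrogram is one analysis frame and each column one frequency bin; the cepstral step compresses every frame into a short, fixed-length vector of the kind recognition and synthesis systems consume.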
