LT World


Spoken Language Corpora

Handbook of Standards and Resources for Spoken Language Systems.
Dafydd Gibbon, Roger Moore, and Richard Winski.
Walter de Gruyter. Berlin, Germany. 1997.

Language Resources and Evaluation Conference.


Spoken language corpora are collections of recorded spoken language, generally accompanied by transcriptions of the speech and of non-speech noises, and by annotations at different linguistic levels. A speech corpus can contain read speech, spontaneous speech, or dialogues, and may be recorded under different conditions with regard to microphone, environment (e.g., laboratory, office, background noise), and transmission channel (e.g., telephone, broadcast). Speech corpora serve a range of purposes, including the training and evaluation of speech recognisers, phonetic and phonological research, dialect research, dialogue research, and speech synthesis.
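As a rough illustration of the description above, the following Python sketch models one corpus entry as a recording with its transcription, its annotations, and its recording conditions. All class names, field names, file names, and values here are hypothetical and chosen only for illustration; they do not correspond to any particular corpus format or standard.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Annotation:
    """One labelled time span on a given linguistic level (hypothetical layout)."""
    tier: str        # linguistic level, e.g. "orthographic" or "phonemic"
    start_s: float   # start time in seconds
    end_s: float     # end time in seconds
    label: str


@dataclass
class Recording:
    """One recorded item plus the conditions under which it was recorded."""
    audio_path: str
    speech_style: str   # e.g. "read", "spontaneous", "dialogue"
    microphone: str     # e.g. "headset", "handset"
    environment: str    # e.g. "laboratory", "office"
    channel: str        # e.g. "direct", "telephone", "broadcast"
    transcription: str = ""
    annotations: List[Annotation] = field(default_factory=list)


def select(recordings, **conditions):
    """Return the recordings whose attributes equal all given conditions."""
    return [
        r for r in recordings
        if all(getattr(r, key) == value for key, value in conditions.items())
    ]


# A tiny made-up corpus with two entries.
corpus = [
    Recording(
        "rec001.wav", "read", "headset", "laboratory", "direct",
        transcription="hello world",
        annotations=[Annotation("orthographic", 0.0, 0.4, "hello")],
    ),
    Recording("rec002.wav", "dialogue", "handset", "office", "telephone"),
]

telephone_dialogues = select(corpus, speech_style="dialogue", channel="telephone")
print([r.audio_path for r in telephone_dialogues])
```

Keeping the recording conditions as explicit fields makes it straightforward to pick a subset for a given purpose, for example telephone dialogues for evaluating a recogniser intended for telephone use.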

Spoken Corpora