LT World

Monolingual Language Data

  • Canadian English Speech Recognition Database ---- Place Name (Desktop)-150 Speakers (King-ASR-102)
  • Russian Speech Recognition Database ---- (in-car)-308 Speakers (King-ASR-153)
  • Japanese Speech Recognition Database ---- Digit String (Telephone)-200 Speakers (King-ASR-056)
  • Chinese SMS Corpus---Pinyin Information (King-NLP-003)
  • Korean Speech Recognition Database ---- Sentences (Desktop)-40 Speakers (King-ASR-050)
  • BTB-POS Corpus I
  • Chinese SMS Corpus---Word Segmentation (King-NLP-004)
  • The LUCY Corpus
  • Chinese Mandarin Speech Recognition Database ---- Person Name (Desktop)-200 Speakers (King-ASR-044)
  • TIMIT Acoustic-Phonetic Continuous Speech Corpus (TIMIT)
  • UK English Speech Recognition Database ---- Sentences (Desktop)-200 Speakers (King-ASR-177)
  • Chinese Video Transcription and Annotation Database II (King-AVT-002)
  • Spain Spanish Speech Recognition Database ---- (in-car)-300 Speakers (King-ASR-141)
  • TheMarker Corpus
  • Korean Speech Recognition Database ---- SMS (Mobile)-200 Speakers (King-ASR-164)
  • Occidental Chinese Speech Recognition Database ---- (Desktop)-300 Speakers (King-ASR-127)
  • Chinese Webpage Words Corpus (King-NLP-012)
  • The TIGER Treebank
  • Canadian English Speech Recognition Database ---- Person Name (Telephone)-150 Speakers (King-ASR-105)
  • The SUSANNE Corpus
  • TDT2 Mandarin Audio Corpus
  • Chinese Mandarin Speech Recognition Database ---- Digit String (Desktop)-200 Speakers (King-ASR-046)
  • Russian Speech Recognition Database ---- Sentences (Desktop)-200 Speakers (King-ASR-183)
  • Swedish Speech Recognition Database ---- Sentences (Desktop)-200 Speakers (King-ASR-158)
  • France French Speech Recognition Database ---- Sentences (Desktop)-200 Speakers (King-ASR-171)
  • Korean Speech Recognition Database ---- Digit String (Desktop)-110 Speakers (King-ASR-049)
  • German Speech Recognition Database ---- (in-car)-300 Speakers (King-ASR-150)
  • Italian Speech Recognition Database ---- (in-car)-300 Speakers (King-ASR-147)
  • Database of Arab Names in Arabic (King-NLP-028)
  • Japanese Speech Recognition Database ---- (Mobile)-800 Speakers
  • Japanese Speech Recognition Database ---- SMS (Mobile)-200 Speakers (King-ASR-165)
  • The CHRISTINE Corpus
  • Japanese Lexical Database (King-NLP-023)
  • Kyoto University Text Corpus
  • Chinese Mandarin Speech Recognition Database ---- Simple Sentences (Desktop)-200 Speakers (King-ASR-043)
  • Chinese Person Name Corpus (King-NLP-009)
  • Chinese Mandarin Speech Recognition Database ---- Digit String (Desktop)-200 Speakers (King-ASR-042)
  • The Penn Treebank (PTB)
  • Russian Pronunciation Lexicon (King-Lexicon-003)
  • Sinica Treebank
  • Database of Japanese Name Variants (King-NLP-022)
  • Chinese Speech Recognition Database ---- (Mobile)-1200 Speakers (King-ASR-118)
  • Japanese Speech Recognition Database ---- Person Name (Telephone)-200 Speakers (King-ASR-057)
  • Spanish Treebank (UAM)
  • Canadian English Speech Recognition Database ---- Sentences (Desktop)-150 Speakers (King-ASR-099)
  • Canadian English Speech Recognition Database ---- Digit String (Telephone)-150 Speakers (King-ASR-104)
  • Penn Discourse Treebank (PDTB)
  • Japanese Speech Recognition Database ---- Sentences (Desktop)-200 Speakers (King-ASR-051)
  • UK English Speech Recognition Database ---- (Mobile)-150 Speakers (King-ASR-136)
  • UK English Speech Recognition Corpus (desktop) – 50 speakers (King-ASR-089)
  • Lincoln Lab Speech Enhancement Corpus (LLSEC)
  • Chinese SMS Corpus---Named Entity Annotation (King-NLP-005)
  • UK English Pronunciation Lexicon (King-Lexicon-005)
  • US English Pronunciation Lexicon (King-Lexicon-004)
  • Canadian English Speech Recognition Database ---- Place Name (Telephone)-150 Speakers (King-ASR-106)
  • Mexican Spanish Speech Recognition Database ---- (Mobile)-150 Speakers (King-ASR-143)
  • Japanese Speech Recognition Database ---- Sentences (Desktop)-500 Speakers (King-ASR-175)
  • France French Speech Recognition Corpus (desktop) – 50 speakers (King-ASR-088)
  • Spain Spanish Speech Recognition Database ---- (Desktop)-210 Speakers (King-ASR-202)
  • US English Speech Recognition Database ---- (Mobile)-150 Speakers (King-ASR-139)
  • Russian Speech Recognition Database ---- (Desktop)-201 Speakers (King-ASR-115)
  • Arutz 7 Corpus
  • Canadian French Pronunciation Lexicon (King-Lexicon-002)
  • Czech ELAN Corpus
  • Mexican Spanish Speech Recognition Database ---- Sentences (Desktop)-200 Speakers (King-ASR-179)
  • Canadian English Speech Recognition Database ---- Sentences (Telephone)-150 Speakers (King-ASR-103)
  • Italian Speech Recognition Database ---- (Mobile)-150 Speakers (King-ASR-148)
  • American Mexican Spanish Speech Recognition Database ---- (Desktop)-200 Speakers (King-ASR-185)
  • London-Lund Corpus of Spoken English (LLC)
  • ZAG-ELAN Croatian Corpus
  • USA Mexican Spanish Speech Recognition Database ---- Sentences (Desktop)-200 Speakers (King-ASR-180)
  • Japanese Speech Recognition Database ---- (in-car)-800 Speakers (King-ASR-125)
  • The Freiburg-LOB Corpus of British English (FLOB)
  • American English Speech Recognition Database ---- Sentences (Desktop)-150 Speakers (King-ASR-107)
  • Corpus of Bulgarian Texts
  • Canadian French Speech Recognition Database ---- Sentences (Desktop)-200 Speakers (King-ASR-173)
  • Russian Speech Recognition Database ---- (Mobile)-150 Speakers (King-ASR-154)
  • Italian Speech Recognition Database ---- Sentences (Desktop)-200 Speakers (King-ASR-181)
  • Chinese SMS Corpus (King-NLP-002)
  • British National Corpus (BNC)
  • The Prague Dependency Treebank (PDT)
  • Chinese Mandarin Speech Recognition Database ---- Place Name (Desktop)-200 Speakers (King-ASR-045)
  • The International Corpus of English (ICE)
  • Chinese Mandarin Speech Recognition Database ---- (in-car)-1200 Speakers (King-ASR-120)
  • Turkish Pronunciation Lexicon (King-Lexicon-010)
  • Chinese Mandarin Pronunciation Lexicon (King-Lexicon-001)
  • The HCRC Map Task Corpus
  • Spain Spanish Speech Recognition Database ---- (Mobile)-150 Speakers (King-ASR-142)
  • Spain Spanish Speech Recognition Database ---- Sentences (Desktop)-200 Speakers (King-ASR-178)
  • Japanese Speech Recognition Database ---- (Mobile)-800 Speakers (King-ASR-117)
  • Korean Speech Recognition Database ---- Simple Sentences (Desktop)-500 Speakers (King-ASR-174)
  • Project Gutenberg (PG)
  • Turkish Speech Recognition Database ---- Sentences (Desktop)-201 Speakers (King-ASR-159)
  • NEGRA Corpus
  • American English Speech Recognition Database ---- Place Name (Desktop)-150 Speakers (King-ASR-110)
  • Croatian National Corpus (HNK)
  • German Speech Recognition Database ---- (Mobile)-150 Speakers (King-ASR-151)
  • Chinese Mandarin Speech Recognition Database ---- SMS (Mobile)-300 Speakers (King-ASR-188)
  • The PARC 700 Dependency Bank
  • Chinese Chatting Corpus II (King-NLP-008)
  • Australian English Speech Recognition Database ---- Sentences (Desktop)-200 Speakers (King-ASR-176)
  • Korean Speech Recognition Database ---- Person Name (Desktop)-150 Speakers (King-ASR-047)
  • European French Pronunciation Lexicon (King-Lexicon-012)
  • US Spanish Recognition Database ---- (Mobile)-40 Speakers (King-ASR-119)
  • Chinese SMS Corpus---Word (King-NLP-006)
  • Korean Speech Recognition Database ---- (Mobile)-1000 Speakers (King-ASR-137)
  • Lancaster Parsed Corpus (ICAME)
  • Italian Speech Recognition Corpus (desktop) – 50 speakers (King-ASR-091)
  • HaAretz Corpus
  • Japanese Speech Recognition Database ---- Digit String (Desktop)-200 Speakers (King-ASR-052)
  • German Speech Recognition Database ---- (Desktop)-200 Speakers (King-ASR-187)
  • Database of Arab Names (King-NLP-027)
  • German Pronunciation Lexicon (King-Lexicon-008)
  • American English Speech Recognition Database ---- Person Name (Desktop)-150 Speakers (King-ASR-109)
  • Korean Speech Recognition Database ---- Place Name (Desktop)-150 Speakers (King-ASR-048)
  • Japanese Phonological Database (King-NLP-020)
  • France French Speech Recognition Database ---- (Desktop)-200 Speakers (King-ASR-203)
  • The LinGO Redwoods Treebank
  • Korean Speech Recognition Database ---- (in-car)-1000 Speakers (King-ASR-121)
  • Japanese Companies and Organizations (King-NLP-026)
  • The Bank of English (COBUILD Corpus)
  • Canadian English Speech Recognition Database ---- Person Name (Desktop)-150 Speakers (King-ASR-101)
  • Chinese Mandarin Speech Recognition Database ---- (in-car)-100 Speakers (King-ASR-122)
  • UK English Speech Corpus for TTS (Female) (King-TTS-006)
  • Large Scale Syntactic Annotation of written Dutch (LASSY)
  • Chinese Mandarin Speech Recognition Database ---- Person Name and Place Name (Telephone)-285 Speakers (King-ASR-002)
  • European Spanish Pronunciation Lexicon (King-Lexicon-007)
  • Canadian English Speech Recognition Database ---- Digit String (Desktop)-150 Speakers (King-ASR-100)
  • TDT2 English Audio Corpus
  • Chinese Mandarin Speech Recognition Database ---- Sentences (Desktop)-1000 Speakers
  • Hebrew Treebank
  • US English Speech Recognition Corpus (desktop) – 50 speakers (King-ASR-090)
  • German Speech Recognition Database ---- Sentences (Desktop)-200 Speakers (King-ASR-182)
  • The Brown University Standard Corpus of Present-Day American English (Corpus BROWN)
  • American English Speech Recognition Database ---- Place Name (Telephone)-150 Speakers (King-ASR-114)
  • METU-Sabanci Turkish Treebank (METU)
  • Chinese Micro-blog Text Corpus (King-NLP-013)
  • Turkish in-car Speech Recognition Database (316 Speakers) (King-ASR-134)
  • Portugal Portuguese Speech Recognition Database ---- Sentences (Desktop)-200 Speakers (King-ASR-146)
  • US English Speech Recognition Database ---- (in-car)-300 Speakers (King-ASR-131)
  • American English Speech Recognition Database ---- Sentences (Telephone)-150 Speakers (King-ASR-111)
  • Japanese Speech Recognition Database ---- Place Name (Telephone)-200 Speakers (King-ASR-058)
  • Chinese Webpage Text Corpus (King-NLP-014)
  • UK English Speech Recognition Database ---- (in-car)-300 Speakers (King-ASR-135)
  • Chinese Mandarin Speech Recognition Database ---- SMS (Desktop)-120 Speakers (King-ASR-012)
  • US Mexican Spanish Speech Recognition Database ---- (in-car)-300 Speakers (King-ASR-145)
  • The American National Corpus (ANC)
  • Japanese Speech Recognition Database ---- Person Name (Desktop)-200 Speakers (King-ASR-053)
  • English Parser Evaluation Corpus
  • SALSA Corpus (SALSA II)
  • The British component of the International Corpus of English (ICE-GB)
  • Japanese Speech Recognition Database ---- Sentences (Telephone)-200 Speakers (King-ASR-055)
  • American English Speech Recognition Database ---- Digit String (Telephone)-150 Speakers (King-ASR-112)
  • France French Speech Recognition Database ---- (in-car)-300 Speakers (King-ASR-132)
  • Chinese Address Corpus (King-NLP-011)
  • France French Speech Recognition Database ---- (Mobile)-150 Speakers (King-ASR-133)