LT World

Chinese SMS Corpus---Name Entity annotation (King-NLP-005)


All SMS sentences were manually proofread


  • Written

This corpus contains 1,200,000 SMS sentences collected from the everyday messaging of native Chinese speakers. All sentences were proofread manually, duplicate sentences were filtered out at the word level, and every sentence was annotated with named-entity information.


http://www.speechocean.com/en-Text-Corpora/682.html

  • Chinese

  • Monolingual

  • Information Extraction Applications
  • Information Retrieval System
  • Information Retrieval Applications
  • Mining Applications
