LT World


Corpus of Bulgarian Texts


Approximately 275 000 words.

news, fiction, poems

  • Bulgarian

  • Monolingual

Encoded with SGML according to the Corpus Encoding Standard (CES)

  • Linguistic Modelling Laboratory, Bulgarian Academy of Sciences

  • Written

Corpus of written Bulgarian texts covering genres such as news, fiction, poetry, and legal writing. Encoded in SGML according to the Corpus Encoding Standard (CES). Approximately 275 000 words.
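
Because the texts are SGML-encoded, a word count such as the one quoted above is normally taken over the text content only, with the markup removed. The following is a minimal sketch of that idea, not the provider's own tooling: the function name `count_words` and the sample string are hypothetical, the `<p>` and `<s>` tags are used only for illustration, and SGML entity references are not decoded.

```python
import re

# Matches any SGML tag, e.g. <p>, </s>, <s id="1"> (illustrative; no DTD awareness).
TAG_RE = re.compile(r"<[^>]+>")
# Matches runs of word characters; in Python 3 this includes Cyrillic letters.
WORD_RE = re.compile(r"\w+")


def count_words(sgml_text: str) -> int:
    """Strip tags and count the remaining word tokens.

    Assumption: entity references (e.g. &amp;) are left undecoded,
    so a real corpus may need an extra decoding step first.
    """
    # Replace each tag with a space so adjacent elements do not fuse into one word.
    text = TAG_RE.sub(" ", sgml_text)
    return len(WORD_RE.findall(text))


# Hypothetical sample, not taken from the corpus.
sample = "<p><s>Това е примерен текст.</s> <s>Той има две изречения.</s></p>"
print(count_words(sample))
```

A count produced this way will probably differ somewhat from the catalogue figure, since tokenisation rules and header content are handled differently from tool to tool.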

 

The corpus includes the following texts:

 

  1. A selection of newspaper articles from "24 Hours", 1996
  2. A selection of newspaper articles from 'Zemedelsko zname' ("Agrarian Flag"), 1996
  3. A collection of 42 newspaper articles on Soros' "Open Society" Fund
  4. A selection of literary texts: a part of the novel "Love at the Age of Sclerosis" by Natasha Manolova; a part of the novel "The Big Fraud" by Vesela Lyutskanova; 11 short stories and a novella by Asen Sirakov; 12 short stories from the book "Cyclops' Eye" by Todor Velchev; a part of Snezhana Snegovana's novella "The Fiery Violin"
  5. A selection of poems from "We Are a Hopeless Case" by Miryana Basheva; a collection of modern Bulgarian love poetry, "Love - a Reality of Magic" (many authors)
  6. Zhelyu Zhelev, "Fascism" (2 chapters)
  7. Polya Goleva, "Bulgarian Insurance Law"
  8. An unpublished sociological study about Bulgaria
  9. Bulgarian fiction, 2 novels: Emilia Dvoryanova, 'PASSION ili smy1rtta na Alisa' ("Passion or the Death of Alice"); Julia Berberyan, 'Iskam, vyarvam, moga' ("I want, I believe, I can")
  10. Newspapers: a few issues of "Capital" and "Continent" (1996)

Restrictions: not available to industrial users. Please contact the resource provider to negotiate licensing.


http://tractor.bham.ac.uk/tractor/resources/SOF2/corpus/sofia.tgz

  • rar

  • Web

  • Linguistic Modelling Laboratory, Bulgarian Academy of Sciences

  • Linguistic Modelling Laboratory, Bulgarian Academy of Sciences

http://www.telri.de/telri2/participants/partners/SOF2.html