LT World

Croatian National Corpus (HNK)


101.3 million tokens

  • Croatian

  • Monolingual

  • POS-tagged Text Corpus

  • Written

The Croatian National Corpus (HNK) is a systematized collection of selected texts, mainly written in contemporary Croatian, covering different media, genres, styles, fields and topics. The Corpus is accompanied by additional linguistic and non-linguistic data and is stored in a database on our server, which can be accessed with the search client program Bonito.

HNK is publicly available for research, education and other non-commercial purposes. Commercial users should register and subscribe in order to obtain a user account and password.

The Corpus is published "as is" in the form available for search, but it may be subject to change without prior notice to users. The list of (new) sources will always be available.

Provisional access

While the Corpus is still being collected, temporary provisional access is granted without registration. This access is available through a guest account without a password. However, it is limited in its options for generating ad hoc subcorpora according to selected criteria, and in its options for complex queries.


http://hnk.ffzg.hr/default_en.htm

  • Web

"Marko Tadic" <[email protected]>

for research, education and other non-commercial purposes