
5.2.2. Croatia

FLaReNet Summary

In Croatia, several institutions are involved in the field of language and speech technologies, building resources and tools in several projects: the University of Zagreb (Faculty of Humanities and Social Sciences: Department/Institute of linguistics, Department of information sciences, Department of phonetics; Faculty of electrical engineering and computing: Knowledge Technologies laboratory and Human-oriented technologies laboratory), the Institute of Croatian Language and Linguistics, the University of Zadar, the University of Rijeka, and the Zagreb University Computing Centre. Activities are coordinated through the Croatian Language Technologies Society, and information can be obtained through a Portal on Language Technologies for Croatian.

Contact Point Input

National/Regional contact: Marko Tadić, University of Zagreb, Faculty of Humanities and Social Sciences.

Several institutions in Croatia are involved in the field of language and speech technologies, building resources and tools within their respective projects and subareas:

1) University of Zagreb, Faculty of Humanities and Social Sciences

a. Department/Institute of linguistics (contact person: prof. Marko Tadić): corpus linguistics, computational linguistics

i. Croatian National Corpus

ii. Croatian Morphological Lexicon

iii. Croatian Dependency Treebank

iv. Research program Computational Linguistic Models and Language Technologies for Croatian, comprising five distinct projects (including 1.b. and 2.a.)

b. Department of information sciences (contact person: prof. Damir Boras): computational lexicography, computational linguistics

i. Research program Sources for Croatian heritage and Croatian European identity, comprising six research projects; portal Croatian lexicographic heritage, where old Croatian dictionaries are being digitised

c. Department of phonetics (contact person: dr. Nikolaj Lazić): speech processing (diphone base for Croatian)

2) University of Zagreb, Faculty of electrical engineering and computing

a. Knowledge Technologies laboratory (contact person: prof. Bojana Dalbelo Bašić): NLP in document classification, information retrieval and knowledge technologies

b. Human-oriented technologies laboratory (contact person: dr. Igor S. Pandžić): speech and multimodal processing

3) Institute of Croatian Language and Linguistics

a. Croatian Language Corpus

b. Project Semantic networks and computational lexicology (contact person: Damir Ćavar)

c. Project Digital processing of Croatian dialectal material (contact person: Željko Jozić)

4) University of Zadar

a. Department of Linguistics (contact person: Damir Ćavar): computational linguistics

5) University of Rijeka

a. Technical Faculty (contact person: dr. Ivo Ipšić): speech processing

b. Department of Information Sciences (contact person: dr. Sanda Martinčić-Ipšić): speech processing

6) Zagreb University Computing Centre

a. Croatian Language Portal (with publishing company Novi Liber), with on-line access to a general-purpose Croatian dictionary

7) The Croatian Language Technologies Society is a national LR&T association

a. Portal Language Technologies for Croatian, with an extensive list of institutions, projects, language resources and tools for Croatian and other languages

The following institutions and groups participated in international projects:

(1.a.) FP5: TELRI (1991-1997); FP7: CLARIN (2008-), ACCURAT (http://www.accurat, 2010-, with 1.b.); ICT-PSP: LetsMT! (2010-, with 1.b.); CLARA; bilateral projects: Slovenian-Croatian Parallel Corpus (2000-2001), CADIAL (Flemish-Croatian joint project, 2007-2009, with 2.a.)

(1.b.) UNESCO: Croatian dictionary heritage and dictionary knowledge presentation (2003-2006)

(1.c.) MBROLA (1996-1998)

(2.b.) FP6: SIMILAR (2004-2007), HUMAINE (2004-2007), PEACH (2006-); COST: (Action 2102) Cross-Modal Analysis of Verbal and Non-verbal Communication (2006-2010); bilateral projects: NG-ECA (Japanese-Croatian project, 2005-)

(4.a.) ICT-PSP: ATLAS (2010-)

(5.b.) COST: 278 “Spoken Language Interaction in Telecommunication” – “SIG-Broadcast News Transcription (BNT)”; bilateral projects: Bilingual base of speech samples (Croatian-Slovenian bilateral project)