CLaDA-BG: National Interdisciplinary Research E-Infrastructure for Bulgarian Language and Cultural Heritage Resources and Technologies integrated within European CLARIN and DARIAH infrastructures


Short History

Bulgaria has been involved in the CLARIN and DARIAH infrastructure initiatives at a conceptual level since their inception in 2005/2006. It took part in the European CLARIN Preparatory Phase project, which ran from 2008 to 2011. In 2010, BG CLARIN was established as a national research infrastructure, but it did not receive any financial support. In 2012, Bulgaria signed the Memorandum of Understanding (MoU) and became one of the nine founding members of CLARIN ERIC.

In 2014, a call was issued to renew the roadmap of national research infrastructures in Bulgaria. Two proposals related to the two initiatives were put forward: the BG CLARIN and DARIAH BG national research infrastructures. After an international review, the Bulgarian Ministry of Education and Science proposed merging the two into a single research infrastructure. The new infrastructure was named CLaDA-BG, short for CLARIN and DARIAH in Bulgaria. Its full name is Bulgarian national research infrastructure for resources and technologies for language, cultural and historical heritage, integrated within CLARIN EU and DARIAH EU.



CLaDA-BG aims to establish a national technological infrastructure of resources and technologies for language and cultural and historical heritage (CHH). It will provide public access to language and CHH resources, tools for processing the Bulgarian language, and tools for accessing and managing CHH datasets, serving a variety of societal tasks and a wide audience.
In particular, the infrastructure will support researchers in the Arts, Humanities and Social Sciences in processing the Bulgarian-language texts and CHH datasets needed for their research.



The CLaDA-BG project has several important goals:

  • to establish the Bulgaria-centric Knowledge Graph, which will represent knowledge important for Bulgaria about People, Places, Organizations, Artefacts, and Documents;
  • to create a network of Bulgarian language and CHH resources and tools that provides access to data and tools for wider use in everyday life;
  • to develop several web applications and deploy them on an Internet portal, in order to demonstrate to end users the potential of the language technology infrastructure and the need to create language resources;
  • to test these applications remotely with end users;
  • to organize lectures at universities and specialized information days in order to raise awareness of the benefits of the implemented approach. Special attention will be paid to specialists in the Arts and Humanities, so that the concepts of corpus linguistics and digital humanities are introduced into their education. Because there is not enough literature in Bulgarian, we will create learning materials and make them publicly available;
  • to lay the foundation for a CLaDA-BG community in Bulgaria by maintaining a website, wiki, blog, mailing list, bulletins, etc.

EU Context and Financial Support