Automatic text annotation with Language Models
This CLaDA-BG service provides automatic annotation of Bulgarian texts. It is currently accessible through a web interface, and a web application is planned for the future.
Automatic text annotation uses multiple language models, each handling a different linguistic task. Combined, they provide analysis in several key areas.
To annotate text, enter it into the input field and click “Annotate”. By default, the results will be displayed as a table below the input field.
Once the text is submitted, a language model splits it into sentences and words. Each sentence is then analyzed in parallel by multiple models. For each sentence, the results are shown as a table, with a syntax tree diagram displayed below it.
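As a rough illustration of this flow, the following Python sketch segments a text and then runs several analyses on each sentence concurrently. It is not the service's implementation: the regex-based `split_sentences` and `tokenize` functions and the trivial `tag_upos` and `tag_ner` functions are hypothetical stand-ins for the actual language models, and all names are assumptions made for this example.

```python
import re
from concurrent.futures import ThreadPoolExecutor


def split_sentences(text):
    """Naive stand-in for the segmentation model: split after '.', '!' or '?'."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(sentence):
    """Naive stand-in for the tokenizer: words and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)


# Hypothetical analyzers. Each one returns exactly one label per token.
def tag_upos(tokens):
    return ["X" if t[0].isalnum() else "PUNCT" for t in tokens]


def tag_ner(tokens):
    return ["O" for _ in tokens]


ANALYZERS = {"upos": tag_upos, "ner": tag_ner}


def annotate(text):
    """Segment the text, then run all analyzers on each sentence concurrently."""
    results = []
    with ThreadPoolExecutor() as pool:
        for sentence in split_sentences(text):
            tokens = tokenize(sentence)
            # Submit every analysis for this sentence before collecting results.
            futures = {name: pool.submit(fn, tokens) for name, fn in ANALYZERS.items()}
            layers = {name: future.result() for name, future in futures.items()}
            results.append({"tokens": tokens, **layers})
    return results


if __name__ == "__main__":
    for sentence in annotate("Това е тест. Втори пример!"):
        print(sentence)
```

Each analyzer receives the same token list and returns one label per token, so the separate annotation layers can later be merged into the rows of a single table.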
The result of the automatic annotation is shown in the nine columns of the table as follows:
- In the ID and Form columns the tokens are shown with their position in the sentence (ID) and their word form (Form).
- In the Lemma column, the base form of the word (lemma) is shown, which can be used for linking with different types of lexicons.
- The UPOS, XPOS and Feats columns display the grammatical features of the words. According to the UD (Universal Dependencies) standard, UPOS and Feats indicate each word’s universal part of speech (UPOS) and the associated grammatical features (Feats). The XPOS column also tags each word with its grammatical features using a positional tagset developed in the BulTreeBank project. Hovering over each label reveals a tooltip with additional details.
- The Head Index and Dep Rel columns display the syntactic annotation: the index of each word’s head (Head Index) and the syntactic relation between the word and its head (Dep Rel).
- The last column, NER, shows the named entity label, which is used for information retrieval from the Knowledge base developed within the CLaDA-BG infrastructure, as well as for document indexing.
* Hovering with the cursor over each cell reveals additional information.
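To make the nine-column layout concrete, here is a minimal Python sketch of how one table row could be represented in code. It assumes, purely for illustration, that a row is available as nine tab-separated values in the column order listed above, that an empty Feats cell is written as `_`, and that a Head Index of 0 marks the root of the sentence (the usual UD convention). The `AnnotatedToken` class, the helper functions, and the sample values (including the NER labels) are hypothetical and are not actual output of the service.

```python
from dataclasses import dataclass


@dataclass
class AnnotatedToken:
    id: int        # position of the token in the sentence
    form: str      # word form as it appears in the text
    lemma: str     # base form of the word
    upos: str      # universal part of speech (UD)
    xpos: str      # positional tag (BulTreeBank tagset)
    feats: dict    # grammatical features (UD)
    head: int      # index of the head word; 0 is assumed to mean "root"
    deprel: str    # syntactic relation to the head
    ner: str       # named entity label


def parse_feats(raw):
    """Turn 'Gender=Masc|Number=Sing' into a dict; '_' means no features."""
    if raw == "_":
        return {}
    return dict(pair.split("=", 1) for pair in raw.split("|"))


def parse_row(line):
    """Parse one tab-separated row with exactly nine columns."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 columns, got {len(cols)}")
    id_, form, lemma, upos, xpos, feats, head, deprel, ner = cols
    return AnnotatedToken(int(id_), form, lemma, upos, xpos,
                          parse_feats(feats), int(head), deprel, ner)


def find_root(tokens):
    """Return the token whose head index is 0 (assumed root convention)."""
    return next(t for t in tokens if t.head == 0)


# Illustrative rows only; the values are invented for this example.
rows = [
    "1\tИван\tИван\tPROPN\tNpmsi\tGender=Masc|Number=Sing\t2\tnsubj\tB-PER",
    "2\tчете\tчета\tVERB\tVpitf-r3s\tNumber=Sing|Person=3\t0\troot\tO",
]
tokens = [parse_row(row) for row in rows]

if __name__ == "__main__":
    print(find_root(tokens).form)
```

Keeping Feats as a dictionary rather than a raw string makes it straightforward to filter tokens by a single feature, for example all tokens with `Number=Sing`.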
EU Context and Financial Support


