Information about UniTurk
As part of the conference, the traditional UniTurk seminar will be held.
Previous UniTurk seminars (Kazan, 2014; Istanbul, 2014; Kazan, 2015; Bishkek, 2016) discussed the problems of unified morphological annotation of Turkic-language texts for corpora and other automatic text processing systems. Such a unified annotation system would also serve as a universal means of glossing text examples (for instance, in international publications).
In February 2014 in Kazan, a working version of the morphological annotation was adopted, and certain points of it were subsequently discussed in a series of face-to-face and virtual seminars. The adopted version focuses on the morphemic structure of the Turkic wordform and is designed to fully reflect the diversity of the Turkic languages.
The following questions are proposed for discussion at the next UniTurk seminar:
1. Grammatical (morphological and word-formation) and semantic annotation. Differentiation of semantic and grammatical tags (categories of numerals, voices, etc.). Representation of polyfunctional affixes.
2. The degree of completeness and specificity of the annotation.
3. The problem of synthetic versus analytic expression of certain categories in different languages (for example, the so-called “instrumental case”, ways of expressing interrogativity, modality, etc.).
4. Representation of verbal affixes: preparing comparative tables.
We welcome your suggestions on unifying annotation systems; the suggestions received will be posted on the seminar website.
Existing annotation systems and publications on this topic are available on the seminar webpage.