ABOUT RECOGNITION OF THE AUTHOR OF THE TEXT IN UZBEK BY MEANS OF SYMBOLIC TRIGRAMS

Authors: Kosimov A.A., Zulfikarova P.E.

Authors

Kosimov A.A. – candidate of technical sciences, senior teacher, Department of Programming and Information Systems, Polytechnic Institute of Tajik Technical University, Khujand, Republic of Tajikistan, abdunabi_kbtut@mail.ru.

Zulfikarova P.E. – senior teacher, Department of Programming and Information Systems, Polytechnic Institute of Tajik Technical University, Khujand, Republic of Tajikistan, zulfikarova.p@gmail.com.

Annotation

The article considers a model collection of Uzbek texts in Cyrillic script, composed of works of classical poetry and modern prose. Each work is assigned a digital portrait: the frequency distribution of its alphabetic trigrams. Trigram frequencies are shown to be an acceptable quantitative characteristic for identifying the author of a text, and accounting for spaces in trigrams improves classification accuracy. Z.D. Usmanov's classifier is used to solve the task; it identifies the authors of texts from the frequencies of letter trigrams with a rather high degree of efficiency. It is also established that Z.D. Usmanov's classifier can identify the authors of Uzbek-language works from their digital portraits.
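The idea of a digital portrait can be illustrated with a minimal Python sketch. It assumes a portrait is the relative frequency of each character trigram, with spaces optionally kept as trigram symbols. The function names (`trigram_portrait`, `l1_distance`, `nearest_author`) and the text normalization are hypothetical, and the nearest-portrait rule based on L1 distance is only a stand-in for comparison: it is not Z.D. Usmanov's classifier, whose definition is not given in this annotation.

```python
from __future__ import annotations

from collections import Counter


def trigram_portrait(text: str, keep_spaces: bool = True) -> dict[str, float]:
    """Return relative frequencies of character trigrams in `text`."""
    # Lowercase the text and turn every non-letter into a space.
    chars = [c if c.isalpha() else " " for c in text.lower()]
    # Collapse runs of spaces so that words are separated by exactly one space.
    normalized = " ".join("".join(chars).split())
    if not keep_spaces:
        # Variant without spaces: trigrams run across word boundaries.
        normalized = normalized.replace(" ", "")
    counts = Counter(normalized[i:i + 3] for i in range(len(normalized) - 2))
    total = sum(counts.values())
    return {tri: n / total for tri, n in counts.items()} if total else {}


def l1_distance(p: dict[str, float], q: dict[str, float]) -> float:
    """Sum of absolute frequency differences over all trigrams in p or q."""
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))


def nearest_author(unknown: dict[str, float],
                   known: dict[str, dict[str, float]]) -> str:
    """Assumed stand-in rule: pick the author whose portrait is closest."""
    return min(known, key=lambda author: l1_distance(unknown, known[author]))
```

In this sketch, `known` would map each author name to a portrait built from that author's reference texts, and calling `trigram_portrait` with `keep_spaces=True` or `keep_spaces=False` gives the two variants (with and without spaces) that the annotation compares.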

Key words:

Uzbek language, trigram, frequency, statistics, efficiency.

Language

English

Type of article

economic

Year

2020

Page

34-46


Publication date