Authors
Kosimov A.A. – Candidate of Technical Sciences, senior teacher, Polytechnic Institute of Tajik Technical University, Khujand, Republic of Tajikistan, abdunabi_kbtut@mail.ru
Annotation
The problem of recognizing the authors of works is solved separately for classical poetry, modern poetry, and modern prose. Each work is represented by a digital portrait, defined as the frequency distribution of the alphabetic trigrams it contains, and works are compared through these portraits. The digital portrait and the metric space of the works are constructed. Assuming that each author's work is unique, threshold values of the metric are set, and classes of “homogeneous” works are defined on their basis. The ability of Z.D. Usmanov's γ-classifier of discrete random variables to recognize the author of a text from the frequency of alphabetic trigrams is investigated. The classifier, which has confirmed its high efficiency in identifying the authorship of text fragments from classical and modern poetry and from modern prose in the Tajik language, is tested for its adaptability to recognizing authorship within each of these groups separately. The γ-classifier is found to be effective for identifying the authors of texts. The frequency distribution of trigrams in Tajik-language works is established to be an identifier of authorship, so trigrams are acceptable quantitative characteristics for this problem, and Z.D. Usmanov's classifier can identify the authors of Tajik-language works from their digital portraits.
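The idea of a trigram-based digital portrait can be illustrated with a minimal Python sketch. It is not the γ-classifier itself, whose definition is not given here. The sketch assumes that trigrams are counted inside words only, that the metric is the sum of absolute differences between relative frequencies, and that the threshold is chosen by the user; the names `trigram_portrait`, `distance`, and `same_class` are hypothetical.

```python
from collections import Counter


def trigram_portrait(text):
    """Return relative frequencies of alphabetic trigrams in a text.

    Assumption: trigrams are taken inside words only, so they never
    span spaces, digits, or punctuation marks.
    """
    counts = Counter()
    word = []
    # A trailing space flushes the last word through the same branch.
    for ch in text.lower() + " ":
        if ch.isalpha():
            word.append(ch)
        else:
            for i in range(len(word) - 2):
                counts["".join(word[i:i + 3])] += 1
            word = []
    total = sum(counts.values())
    if total == 0:
        return {}
    return {trigram: n / total for trigram, n in counts.items()}


def distance(p, q):
    """Sum of absolute frequency differences (an assumed metric)."""
    keys = set(p) | set(q)
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def same_class(p, q, threshold):
    """Treat two works as "homogeneous" when their portraits are close.

    The threshold is a hypothetical, user-supplied value.
    """
    return distance(p, q) <= threshold
```

Under these assumptions, two works would be assigned to the same class when `same_class(trigram_portrait(a), trigram_portrait(b), threshold)` returns `True`, with the threshold calibrated on works of known authorship.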
Key words
Tajik language, text, poetry, prose, frequency, trigram, classifier, identification
References
- Karimov A.A. On the digital portrait of textual information – Polytechnic Bulletin, 2019, 1 (45), Series: Intelligence, Innovation, Investment, P. 7-10.
- Kayumov M.M. On a digital portrait of textual information based on the frequency of punctuation marks – Polytechnic Bulletin, 2019, 1 (45), Series: Intelligence, Innovation, Investment, P. 20-23.
- Kosimov A.A., Bakhteev K.S. On recognition of the author of a text fragment // News of the Academy of Sciences of the Republic of Tajikistan. Department of Physical, Mathematical, Chemical, Geological and Technical Sciences, 2019, № 4 (177).
- Kosimov A.A., Bakhteev K.S. The use of a specific digital portrait to identify authors of works // News of the Academy of Sciences of the Republic of Tajikistan. Department of Physical, Mathematical, Chemical, Geological and Technical Sciences, 2019, № 3 (176), P. 7-11.
- Usmanov Z.D. Algorithm for tuning the clustering of discrete random variables – Reports of the Academy of Sciences of the Republic of Tajikistan, 2017, vol. 60, № 9, P. 392-397.
- Usmanov Z.D. Classifier of discrete random variables – Reports of the Academy of Sciences of the Republic of Tajikistan, 2017, vol. 60, № 7-8, P. 291-300.
- Usmanov Z.D. About one digital portrait of the text and its application – Polytechnic Bulletin, 2019, 3 (47). Series: intelligence, innovation, investment.
- Usmanov Z.D., Kosimov A.A. To the issue of automatic recognition of authorship and styles of works of Tajik-Persian fiction // Doklady of the Academy of Sciences of the Republic of Tajikistan, 2019, vol. 62, № 9.
- Usmanov Z.D., Kosimov A.A. On the applicability of the γ-classifier to recognition of authorship and themes of works of art // Materials of the twenty-second scientific and practical seminar “New information technologies in automated systems”, Moscow, 2019, P. 174-178.
- Usmanov Z.D., Kosimov A.A. On the recognition of authorship of the Tajik text – Reports of the Academy of Sciences of the Republic of Tajikistan, 2016, vol. 59, № 3-4, P. 114-119.
- Usmanov Z.D., Kosimov A.A. Digital Image of “Shahnameh” (“Books of Kings”) A. Firdausi – Reports of the Academy of Sciences of the Republic of Tajikistan, 2014, vol. 57, № 6, P. 471-476.
- Usmanov Z.D., Kosimov A.A. The frequency of bigrams in Tajik literature – Reports of the Academy of Sciences of the Republic of Tajikistan, 2016, vol. 59, № 1-2, P. 28-32.
- Usmanov Z.D., Kosimov A.A. The frequency of letters of Tajik literature – Reports of the Academy of Sciences of the Republic of Tajikistan, 2015, vol. 58, № 2, P. 112-115.
- Usmanov Z.D., Soliev O.M. The problem of the layout of characters on a computer keyboard. – Dushanbe: Irfon, 2010, 104 p.
- Khudoyberdiev Kh.A., Kosimov A.A. On recognition of the author of the text based on the frequency of syllables // Reports of the Academy of Sciences of the Republic of Tajikistan, 2019, vol. 62, № 11.
Publication date
2023-10-26