APPLICATION OF THESAURUS IN THE LINGUISTIC TASKS OF THE TAJIK LANGUAGE: APPROACHES AND IMPLEMENTATION

Authors

Ashurova Sh.N. – Senior Lecturer, Department of Programming and Information Systems, Polytechnic Institute of Tajik Technical University, Khujand, Republic of Tajikistan, sh.nurulloevna@gmail.com
Nazarov A.A. – Senior Lecturer, Department of Programming and Information Systems, Polytechnic Institute of Tajik Technical University, Khujand, Republic of Tajikistan, n.abdusamad@gmail.com
Khudoiberdiev Kh.A. – Candidate of Physical and Mathematical Sciences, Head of the Department of Programming and Information Systems, Polytechnic Institute of Tajik Technical University, Khujand, Republic of Tajikistan, tajlingvo@gmail.com

Abstract

The article addresses the application of a thesaurus to computational linguistics tasks for the Tajik language and attempts to summarize approaches to implementing such applications. Computational linguistics is a field of knowledge concerned with the automatic processing of information presented in natural language. Its central scientific problem is modeling the process of understanding the meaning of text and of synthesizing speech from formalized representations of meaning. These problems arise in applied tasks such as automatic speech analysis and synthesis, machine translation, interaction with natural language systems, document classification and summarization, and full-text search. A thesaurus is a dictionary that records semantic relationships between lexical units, such as synonymy, antonymy, hyponymy/hyperonymy, and meronymy/partonymy. It makes it possible to identify the meaning of a word not only through its definition but also through its connections with other concepts. A thesaurus can be applied to the description of subject areas, machine translation, spelling and grammar checking, information retrieval, document indexing, and semantic text analysis. The article concludes that creating a thesaurus for the Tajik language based on the WordNet project will contribute to the development of Tajik computational linguistics, to the solution of automated text processing tasks, and to more efficient information retrieval and machine translation. Developing such a thesaurus on the WordNet model is a promising direction that will also support the creation of effective intelligent systems for processing Tajik-language text.
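As an illustration only, the thesaurus described in the abstract can be pictured as a graph of words joined by typed semantic relations. The following is a minimal sketch under stated assumptions, not the authors' implementation: the class `Thesaurus`, the helper `build_toy_thesaurus`, the relation names (including `holonym` as the inverse of `meronym`), and the handful of Tajik entries are all hypothetical and chosen for the example. It shows how synonyms and hyponyms could be used to expand a query for information retrieval.

```python
from collections import defaultdict

# Each relation maps to its inverse; symmetric relations map to themselves.
# The relation names are assumptions made for this sketch.
INVERSE = {
    "synonym": "synonym",
    "antonym": "antonym",
    "hypernym": "hyponym",
    "hyponym": "hypernym",
    "meronym": "holonym",
    "holonym": "meronym",
}


class Thesaurus:
    """Toy thesaurus: word -> relation -> set of related words."""

    def __init__(self):
        self._links = defaultdict(lambda: defaultdict(set))

    def add(self, word, relation, other):
        """Record that `other` is a `relation` of `word`, plus the inverse link."""
        if relation not in INVERSE:
            raise ValueError(f"unknown relation: {relation}")
        self._links[word][relation].add(other)
        self._links[other][INVERSE[relation]].add(word)

    def related(self, word, relation):
        """Return the words linked to `word` by `relation` (empty set if none)."""
        entry = self._links.get(word)
        if entry is None:
            return set()
        return set(entry.get(relation, ()))

    def expand_query(self, terms):
        """Add synonyms and hyponyms to each term, keeping order, without duplicates."""
        expanded = []
        for term in terms:
            candidates = [
                term,
                *sorted(self.related(term, "synonym")),
                *sorted(self.related(term, "hyponym")),
            ]
            for candidate in candidates:
                if candidate not in expanded:
                    expanded.append(candidate)
        return expanded


def build_toy_thesaurus():
    """Illustrative entries only; not taken from the article."""
    t = Thesaurus()
    t.add("зебо", "synonym", "хушрӯй")   # beautiful ~ pretty
    t.add("калон", "antonym", "хурд")    # big vs. small
    t.add("себ", "hypernym", "мева")     # fruit is a hypernym of apple
    t.add("дарахт", "meronym", "барг")   # leaf is a part of tree
    return t


if __name__ == "__main__":
    thesaurus = build_toy_thesaurus()
    print(thesaurus.expand_query(["мева", "зебо"]))
```

Storing the inverse link on every insertion keeps lookups symmetric, so a search for "мева" (fruit) can also retrieve documents that mention "себ" (apple). A WordNet-style resource would additionally group words into synsets and distinguish word senses, which this sketch omits.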

Keywords

thesaurus, computational linguistics, semantic relationships, automatic text processing, machine translation.

Publish date

2026-03-17