ON ENTROPIC INFORMATION MEASURES IN TEXTS OF DIFFERENT LANGUAGES

Authors

Nizamitdinov A.I., Doctor of Philosophy, Department of Programming and Information Systems, Polytechnic Institute of Tajik Technical University, Khujand, Republic of Tajikistan, ahlidin@gmail.com.

Abstract

A text in any language may be regarded as a code for certain conceptual objects. From this point of view, the statistical properties of language, which is the principal tool of communication, are often applied in computer science, for example in the construction of efficient binary codes. This article describes a model for, and an estimation of, the statistical structure of language. The most widely used tool for determining the weight of a language is the probability distribution of its letter combinations. For instance, Russian and Spanish are compared on the basis of the probability distributions of their letters for the same semantic content. The language that is optimal in the sense of coding theory is then identified using the Shannon entropy measure. However, the entropy information measure alone cannot give a precise estimate of the probability distribution, so regression analysis, one of the most popular methods for estimating unknown distribution parameters, is applied. The main result of this article is the theoretical groundwork for establishing an information measure that combines entropy information with regression analysis.
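As a rough illustration of the Shannon measure described above (a minimal sketch, not code from the article), the following Python snippet estimates the entropy H = -Σ p_i log2(p_i) of the empirical letter distribution of a short text. The function name shannon_entropy and the Russian and Spanish sample sentences are hypothetical placeholders for texts with the same semantic content.

import math
from collections import Counter

def shannon_entropy(text: str) -> float:
    """Shannon entropy (bits per letter) of the text's letter distribution."""
    letters = [ch for ch in text.lower() if ch.isalpha()]
    counts = Counter(letters)
    total = sum(counts.values())
    # H = -sum(p_i * log2(p_i)) over the empirical letter probabilities
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Hypothetical samples: the "same" short message in Russian and Spanish.
samples = {
    "Russian": "информация есть мера неопределенности",
    "Spanish": "la informacion es una medida de la incertidumbre",
}
for lang, text in samples.items():
    print(f"{lang}: H = {shannon_entropy(text):.3f} bits per letter")

On a longer corpus this empirical entropy is the quantity the abstract proposes to compare across languages, before the distribution estimate is refined with regression analysis.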

Keywords

entropy, information measure, regression analysis, Shannon measure, probability distribution.

Language

English

Type of article

Economics

Year

2020

Pages

34-46


Publication date