Authors: Nizamitdinov A.I., Inomov B.B.
Authors
Nizamitdinov A.I. – Doctor of Philosophy (PhD), Department of Programming and Information Technologies, Polytechnic Institute of Tajik Technical University, Khujand, Republic of Tajikistan, ahlidin@gmail.com.
Inomov B.B. – PhD student, specialty 6D070300 – Information Systems, Department of Programming and Information Technologies, Polytechnic Institute of Tajik Technical University, Khujand, Republic of Tajikistan, behruzinomov@gmail.com.
Annotation
This article gives an overview of the machine learning algorithms available for classification problems, in particular for classifying texts from different language contexts. Text classification is one of the main tasks of computational linguistics. It includes several subtasks, such as determining the topic of a text, identifying its author, and detecting the emotional tone of statements. Analyzing content that contains illegal information is of great importance for ensuring information and public safety on social networks, information sites, and telecommunication networks. Machine learning algorithms are now commonly used to solve text classification problems, since software systems based on them achieve relatively high performance scores compared with other classification approaches. Applying and comparing classification algorithms is difficult, because different input data can give different results. The algorithms being compared must therefore be trained and tested on the same data sets.
Key words
algorithm, machine learning, text classification, data analysis
Language
English
Type of article
scientific
Year
2020
Page
34-46
Publication date
05 Jun 2023