COMPARATIVE ANALYSIS OF THE RECOGNITION SYSTEMS SPHINX AND MOZILLA DEEPSPEECH

Authors

Khudoiberdiev H.A. – Candidate of Physical and Mathematical Sciences, Head of the Department of Programming and Information Technologies, Polytechnic Institute of Tajik Technical University, Khujand, Republic of Tajikistan, tajlingvo@gmail.com

Vositov R.M. – Teacher at the Department of Programming and Information Technologies, Polytechnic Institute of Tajik Technical University, Khujand, Republic of Tajikistan, ravshan488889@gmail.com

Annotation

The article provides a comparative analysis of two speech recognition systems: CMU Sphinx and the Mozilla system built on DeepSpeech 0.6. Many speech recognition systems and software products are now available to computer users, each based on existing technologies, most commonly artificial intelligence and machine learning. Recognition of human speech relies on the study of grammar, syntax, and the structure of sound elements. CMU Sphinx can be used in commercial projects. It is offered as an API, so it can be embedded in stand-alone software products, and it supports many platforms, including the Android operating system. Mozilla's speech recognition system is based on the DeepSpeech engine, which uses machine learning technology, and can be used as an additional platform within software products. Both systems are popular and open source. The comparison used many criteria, including system structure, availability of detailed documentation, supported recognition languages, and license restrictions. Experiments were also conducted on several speech samples to determine the speed and accuracy of recognition. As a result, recommendations for use were developed for each system, with an indication of the application areas it suits.
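Accuracy and speed in experiments of this kind are usually reported with metrics such as Word Error Rate (WER), Word Recognition Rate (WRR), and Speed Factor (SF), which the article lists among its key words. The sketch below shows one common way to compute them. It is an illustration only, not the authors' code: the function names are hypothetical, WRR is assumed to be 1 − WER, and SF is assumed to be processing time divided by audio duration. The article may define these quantities differently.

```python
def word_errors(reference: str, hypothesis: str) -> tuple[int, int]:
    """Return (word-level edit distance, number of reference words).

    The distance counts substitutions, deletions, and insertions needed
    to turn the reference transcript into the recognized hypothesis.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # prev[j] holds the distance between the processed reference prefix
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1], len(ref)


def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word errors divided by reference length."""
    errors, n_ref = word_errors(reference, hypothesis)
    if n_ref == 0:
        raise ValueError("reference must contain at least one word")
    return errors / n_ref


def wrr(reference: str, hypothesis: str) -> float:
    """Word Recognition Rate, assumed here to be 1 - WER."""
    return 1.0 - wer(reference, hypothesis)


def speed_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Speed Factor, assumed here to be processing time / audio duration."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return processing_seconds / audio_seconds
```

With these definitions, a hypothesis with one substituted and one dropped word against a six-word reference gives a WER of 2/6, and a recognizer that needs 2 seconds to process 4 seconds of audio has a speed factor of 0.5 (below 1 means faster than real time).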

Key words

speech recognition, metric, deep speech, Word Recognition Rate (WRR), Word Error Rate (WER), Speed Factor (SF), open source, machine learning.

Language

English

Type

technical

Year

2021

Page

12-13

Publication date

2023-10-02