Website of D. Serikbayev EKTU

COMPARATIVE ANALYSIS OF LOCAL AND CLOUD-BASED SPEECH RECOGNITION MODELS FOR THE KAZAKH LANGUAGE

Authors

Name: Manat Ospanov
Affiliation: Akhmet Baitursynov Kostanay Regional University, Kostanay, Kazakhstan

Published:

2025-12-22

Section:

Information and communication technologies

Article language:

Russian

Keywords:

automatic speech recognition, Kazakh language, agglutinative languages, local ASR models, cloud-based speech recognition systems, error analysis, low-resource languages, morphological variability.

Abstract

Developing automatic speech recognition (ASR) systems for the Kazakh language remains a pressing challenge because linguistic resources are limited and, as in other agglutinative languages, morphological complexity is high. This study compares the local and cloud-based speech recognition models that are most accessible for practical use in educational and engineering contexts. The research uses a Kazakh speech corpus that varies in speaker age, gender, and utterance length. Evaluation was carried out at several levels: word-level and character-level error rates, followed by morphological and phonetic error analysis. The results show substantial differences among the models: recognition accuracy varies by tens of percentage points, and most errors occur at morpheme boundaries and in inflectional forms. The practical significance of the study lies in identifying ASR solutions suited to environments with unstable network infrastructure. The work also outlines directions for further development, including corpus expansion and improved post-processing techniques.
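The word- and character-level error rates mentioned in the abstract are conventionally computed as the Levenshtein edit distance between the reference transcript and the recognizer output, divided by the reference length. The following is a minimal Python sketch of that standard definition, not the author's evaluation code; the function names (`edit_distance`, `wer`, `cer`), the whitespace tokenization, the choice to ignore spaces in the character-level rate, and the example sentences are assumptions made for illustration.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: minimum substitutions, insertions, deletions."""
    # prev[j] holds the distance between the first i-1 items of ref
    # and the first j items of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate; assumes whitespace tokenization."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate; spaces are ignored (an assumption of this sketch)."""
    ref_chars = list(reference.replace(" ", ""))
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    hyp_chars = list(hypothesis.replace(" ", ""))
    return edit_distance(ref_chars, hyp_chars) / len(ref_chars)


# Illustrative pair (not from the study's corpus): the hypothesis differs
# from the reference in a single case suffix, -ке versus -те.
reference = "мен мектепке барамын"
hypothesis = "мен мектепте барамын"
print(f"WER = {wer(reference, hypothesis):.3f}")
print(f"CER = {cer(reference, hypothesis):.3f}")
```

In this example one wrong suffix character makes the whole word count as an error at the word level, while the character-level rate stays small. This is one reason both metrics are usually reported together for agglutinative languages, where a single stem carries many inflectional variants.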

Ospanov, M. (2025). COMPARATIVE ANALYSIS OF LOCAL AND CLOUD-BASED SPEECH RECOGNITION MODELS FOR THE KAZAKH LANGUAGE. Вестник ВКТУ, (4). Retrieved from https://vestnik.ektu.kz/index.php/vestnik/article/view/1387