COMPARATIVE ANALYSIS OF LOCAL AND CLOUD-BASED SPEECH RECOGNITION MODELS FOR THE KAZAKH LANGUAGE
Published: 2025-12-22
Section: Information and communication technologies
Article language: Russian
Keywords: automatic speech recognition, Kazakh language, agglutinative languages, local ASR models, cloud-based speech recognition systems, error analysis, low-resource languages, morphological variability.
Abstract
Developing automatic speech recognition (ASR) systems for the Kazakh language remains a pressing challenge because linguistic resources are limited and, as in other agglutinative languages, morphological complexity is high. This study presents a comparative analysis of the local and cloud-based speech recognition models that are most accessible for practical use in educational and engineering contexts. The evaluation uses a Kazakh speech corpus that varies in speaker age, gender, and utterance length. The models were evaluated at several levels: word- and character-level error rates, complemented by morphological and phonetic error analysis. The results show substantial differences: recognition accuracy varies between models by tens of percentage points, and most errors occur at morpheme boundaries and in inflectional forms. The practical significance of the study lies in identifying the ASR solutions best suited to environments with unstable network infrastructure. The work also outlines directions for further development, including corpus expansion and improved post-processing techniques.
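The abstract does not state how the word- and character-level error rates were computed. As a minimal sketch, assuming the standard Levenshtein-based definitions (edit distance divided by reference length), the two metrics could be calculated as follows. The function names and the example sentence pair are illustrative assumptions and are not taken from the study's corpus or code.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: minimum substitutions, insertions, deletions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate, computed here with spaces removed (an assumption)."""
    ref_chars = reference.replace(" ", "")
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    return edit_distance(ref_chars, hypothesis.replace(" ", "")) / len(ref_chars)


# Hypothetical pair: the hypothesis gets one case suffix wrong
# (dative "мектепке" recognized as locative "мектепте").
reference = "мен мектепке бардым"
hypothesis = "мен мектепте бардым"

print(f"WER = {wer(reference, hypothesis):.3f}")
print(f"CER = {cer(reference, hypothesis):.3f}")
```

In this illustrative pair, a single wrong suffix character counts as one full word error out of three but only one character error out of seventeen. This is one reason character-level metrics are commonly reported alongside WER for agglutinative languages, where many errors are confined to inflectional endings.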
License
Copyright (c) 2025 Вестник ВКТУ

This work is licensed under a Creative Commons Attribution 4.0 International License.