Arabic Speech Recognition using a Combined Deep Learning Model

Fathiyah Habeeb; Abduelbaset Goweder

doi:10.5281/zenodo.14553448

Published: Feb 9, 2025

Keywords:

Arabic Speech Recognition, Neural Networks, Deep Learning, Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs)

Abstract

Speech recognition is a valuable tool in many industries; however, despite the rapid growth of the speech recognition market, achieving high accuracy remains a major challenge. Arabic in particular lags behind other languages in this field and requires further attention and development. To address this gap, this research uses deep neural networks to develop an automatic Arabic speech recognition model for isolated words. A hybrid model, originally developed by Radfar et al. [1] for English speech recognition, is adopted and adapted for Arabic. It combines recurrent neural networks (RNNs), which are critical in speech recognition tasks, with convolutional neural networks (CNNs) to form a hybrid known as ConvRNN. The Arabic-specific model built on ConvRNN is referred to as the "Arabic_ConvRNN" model. The model is trained on a publicly available Arabic dataset of isolated words, together with a custom dataset prepared specifically for this research. Its performance is evaluated using standard metrics, including word error rate (WER), accuracy, precision, recall, and F-measure (also referred to as F1-score). In addition, K-fold cross-validation is employed to assess robustness and generalizability. The results show that the Arabic_ConvRNN model achieved an accuracy of 95.7% on unseen data, with a WER of about 4.3%. These findings highlight the model's effectiveness in recognizing Arabic speech with few errors. Comparisons with similar models from previous studies further support the superiority of the Arabic_ConvRNN model. Overall, the Arabic_ConvRNN model shows great promise for applications requiring accurate and efficient Arabic speech recognition.
This research contributes to narrowing the gap in Arabic speech recognition technology by offering a robust solution for accurately converting Arabic speech into text.
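
The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names `isolated_word_metrics` and `k_fold_indices`, the macro averaging of precision, recall, and F1, the contiguous unshuffled folds, and the transliterated labels are all assumptions made for illustration. The sketch relies on one observation: when every utterance contains exactly one word, each recognition error is a substitution, so WER equals 1 minus accuracy, which is consistent with the reported 95.7% accuracy and roughly 4.3% WER.

```python
from collections import Counter


def isolated_word_metrics(y_true, y_pred):
    """Accuracy, WER, and macro-averaged precision/recall/F1
    for utterances that each contain exactly one word."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    n = len(y_true)
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    accuracy = correct / n
    # One word per utterance: every error is a substitution,
    # so WER is simply the fraction of misrecognized words.
    wer = (n - correct) / n

    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    pred_count = Counter(y_pred)
    true_count = Counter(y_true)
    labels = sorted(set(y_true) | set(y_pred))

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p = true_pos[label] / pred_count[label] if pred_count[label] else 0.0
        r = true_pos[label] / true_count[label] if true_count[label] else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    k = len(labels)
    return {
        "accuracy": accuracy,
        "wer": wer,
        "precision": sum(precisions) / k,  # macro average (assumption)
        "recall": sum(recalls) / k,
        "f1": sum(f1s) / k,
    }


def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) for k contiguous, near-equal folds.
    Shuffle the data beforehand if its order is not random."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# Hypothetical labels (transliterated Arabic digits), for illustration only.
reference = ["wahid", "ithnan", "thalatha", "wahid"]
hypothesis = ["wahid", "ithnan", "wahid", "wahid"]
print(isolated_word_metrics(reference, hypothesis))
```

In a K-fold evaluation, the model would be retrained on each `train_idx` split and scored on the matching `test_idx` split, with the per-fold metrics averaged at the end. For continuous speech, WER would instead require an edit-distance alignment that also counts insertions and deletions.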

How to Cite

Habeeb, F., & Goweder, A. (2025). Arabic Speech Recognition using a Combined Deep Learning Model. مجلة الأكاديمية للعلوم الأساسية والتطبيقية, 6(3). https://doi.org/10.5281/zenodo.14553448

Issue

Vol. 6 No. 3 (2024): Volume 6, Issue 3 - December 2024

Section

Articles
