ArabAlg: A New Dataset for Arabic Speech Command Recognition for Machine Learning Applications
Date
2024
Publisher
University of Bahrain
Abstract
Automatic Speech Recognition (ASR) systems have advanced significantly in recent years, thanks to the emergence of deep learning techniques and the availability of large speech datasets in various languages. With the growing demand for Arabic voice-enabled technologies, a high-quality, representative dataset for the Arabic language has become crucial. This paper presents the development of a new dataset, ArabAlg, designed specifically for Arabic Speech Command Recognition (ASCR) to support the integration of Arabic voice recognition into smart devices in the Internet of Things (IoT). The research focuses on collecting and annotating a diverse range of Arabic speech commands spanning various domains and applications. The dataset construction process involves recording and preprocessing utterances from native Arabic speakers, with quality control measures applied during data collection and annotation to ensure precision and reliability. The resulting dataset provides a valuable resource for training and evaluating machine learning and deep learning ASCR systems tailored for Arabic speakers.
Keywords
Arabic Speech Command Recognition, Automatic Speech Recognition, Dataset for limited vocabulary, Internet of Things, Machine learning, Smart devices
