Abstract: We present a novel Automatic Speech Recognition (ASR) dataset for the Oromo language, a widely spoken language in Ethiopia and neighboring regions. The dataset was collected through a ...
Sagalee dataset released under the CC BY-NC 4.0 International license, a summary of the license can be found here, and the full license can be found here. finetune_whisper.py is used to fine tune ...