This project focuses on fine-tuning OpenAI’s Whisper model on Danish speech data and evaluating its performance across multiple benchmark datasets.
The goal is to improve speech-to-text accuracy for an underrepresented language, Danish, by adapting a pretrained automatic speech recognition (ASR) model to domain-specific audio data.
The pipeline consists of four stages: audio preprocessing, feature extraction, model fine-tuning on HPC infrastructure, and evaluation on each benchmark dataset. Training jobs ran in a distributed compute environment to handle large-scale audio processing efficiently.
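
The evaluation stage can be illustrated with a small scoring sketch. The text above does not name the metric, so this assumes word error rate (WER), a common choice for ASR, computed as word-level edit distance divided by the number of reference words. The function names (`edit_distance`, `word_error_rate`, `corpus_wer`) and the lowercase, whitespace-split normalization are illustrative assumptions, not the project's actual code.

```python
from typing import Iterable, Sequence, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences (two-row DP)."""
    prev = list(range(len(hyp) + 1))  # distance from empty reference prefix
    for i, ref_tok in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_tok in enumerate(hyp, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1]


def _tokens(text: str) -> list:
    # Assumed normalization: lowercase and split on whitespace only.
    return text.lower().split()


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER for a single utterance."""
    ref = _tokens(reference)
    if not ref:
        raise ValueError("reference transcript is empty")
    return edit_distance(ref, _tokens(hypothesis)) / len(ref)


def corpus_wer(pairs: Iterable[Tuple[str, str]]) -> float:
    """WER over a dataset: total edits divided by total reference words."""
    total_edits = 0
    total_words = 0
    for reference, hypothesis in pairs:
        ref = _tokens(reference)
        total_edits += edit_distance(ref, _tokens(hypothesis))
        total_words += len(ref)
    if total_words == 0:
        raise ValueError("no reference words to score")
    return total_edits / total_words
```

Calling `corpus_wer` once per benchmark dataset, on (reference, hypothesis) transcript pairs, gives one score per dataset that can be compared before and after fine-tuning. Pooling edits and reference words across utterances, rather than averaging per-utterance WER, keeps short utterances from dominating the score.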