whisper-small-hausa / README.md

a-deji

Fine-tuned Whisper small on 184h Hausa data

metadata

library_name: transformers
language:
  - ha
license: apache-2.0
base_model: openai/whisper-small
tags:
  - speech
  - asr
  - hausa
  - whisper
  - generated_from_trainer
datasets:
  - publica-ai/hausa-whisper-dataset
metrics:
  - wer
model-index:
  - name: Whisper Small Hausa Fine-Tuned
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: Publica AI Hausa Dataset (184 hours)
          type: publica-ai/hausa-whisper-dataset
        metrics:
          - name: Wer
            type: wer
            value: 9.173059091292252

Whisper Small Hausa Fine-Tuned

This model is a fine-tuned version of openai/whisper-small on the Publica AI Hausa Dataset (184 hours). At the final checkpoint (step 6000) it achieves the following results on the evaluation set:

Loss: 0.1969
Wer: 9.1731
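
For readers unfamiliar with the metric, word error rate is the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The sketch below is a plain-Python illustration, not the evaluation code used for this model. It assumes the reported value is a percentage, which the card does not state, and it scores a single utterance, whereas evaluation scripts usually aggregate edits over the whole set. The function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage for one reference/hypothesis pair."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Levenshtein distance over words, keeping only the previous DP row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    # Assumption: scale to percent to match the magnitude of the reported 9.1731.
    return 100.0 * prev[-1] / len(ref)
```

No text normalization (casing, punctuation) is applied here; real evaluation pipelines often normalize both strings first, which can change the score noticeably.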

Model description

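A fine-tuned checkpoint of openai/whisper-small for Hausa (ha) automatic speech recognition, released under the Apache 2.0 license. More information needed.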

Intended uses & limitations

More information needed
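
For transcribing Hausa audio, a minimal inference sketch with the `transformers` ASR pipeline might look like the following. The repository id `a-deji/whisper-small-hausa` is an assumption pieced together from the repo name and uploader shown on the page, and the helper names `build_transcriber` and `transcribe` are hypothetical. Forcing the language to Hausa through `generate_kwargs` is likewise an assumption about how the checkpoint is best prompted.

```python
from typing import Any, Callable

# Assumption: repo id inferred from the page header, not confirmed by the card.
MODEL_ID = "a-deji/whisper-small-hausa"


def build_transcriber(model_id: str = MODEL_ID) -> Callable[..., Any]:
    """Create an ASR pipeline; downloads the checkpoint on first use."""
    # Imported lazily so the helpers below can be used without transformers installed.
    from transformers import pipeline

    # chunk_length_s=30 lets the pipeline split audio longer than Whisper's 30 s window.
    return pipeline(
        "automatic-speech-recognition",
        model=model_id,
        chunk_length_s=30,
    )


def transcribe(asr: Callable[..., Any], audio_path: str) -> str:
    """Run the pipeline on one audio file and return the stripped transcript."""
    result = asr(
        audio_path,
        # Assumption: force Hausa transcription instead of language auto-detection.
        generate_kwargs={"language": "hausa", "task": "transcribe"},
    )
    return result["text"].strip()


# Example usage (requires network access and an audio file of your own):
#   asr = build_transcriber()
#   print(transcribe(asr, "sample.wav"))
```

Reading audio from a file path requires ffmpeg to be available on the system.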

Training and evaluation data

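The model was trained and evaluated on publica-ai/hausa-whisper-dataset (Publica AI Hausa Dataset, 184 hours). More information needed on the train/evaluation split and preprocessing.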

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
num_epochs: 10
mixed_precision_training: Native AMP
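
The hyperparameters above imply two quantities that are easy to check by hand: the effective batch size and the learning rate at any given step. The sketch below reproduces both in plain Python. It assumes a single device (consistent with 8 × 4 = 32) and a total of 6130 optimizer steps, an estimate derived from the results table (step 6000 at epoch 9.7879, so roughly 613 steps per epoch over 10 epochs). The schedule follows the usual linear warmup then linear decay shape; the function names are hypothetical.

```python
LEARNING_RATE = 5e-5
WARMUP_STEPS = 500
TRAIN_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 4
# Assumption: about 613 steps per epoch * 10 epochs, estimated from the results table.
ASSUMED_TOTAL_STEPS = 6130


def effective_batch_size(per_device: int, accum_steps: int, num_devices: int = 1) -> int:
    """Samples contributing to each optimizer update."""
    return per_device * accum_steps * num_devices


def linear_lr(
    step: int,
    base_lr: float = LEARNING_RATE,
    warmup_steps: int = WARMUP_STEPS,
    total_steps: int = ASSUMED_TOTAL_STEPS,
) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Ramp from 0 up to base_lr over the warmup period.
        return base_lr * (step / max(1, warmup_steps))
    # Decay from base_lr at the end of warmup down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


if __name__ == "__main__":
    print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    for s in (0, 250, 500, 3000, 6000):
        print(s, linear_lr(s))
```

Under these assumptions the learning rate peaks at 5e-05 at step 500 and has decayed close to zero by step 6000, the last row of the results table.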

Training results

Training Loss	Epoch	Step	Validation Loss	Wer
0.1644	1.6313	1000	0.2328	13.9821
0.0645	3.2626	2000	0.2068	12.2836
0.0665	4.8940	3000	0.1559	8.5712
0.0073	6.5253	4000	0.1765	8.9126
0.0013	8.1566	5000	0.1907	8.7752
0.0008	9.7879	6000	0.1969	9.1731
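
The headline metrics come from the last row, but the table shows that the lowest validation loss and WER occur earlier. The sketch below picks the best checkpoint from the reported rows. The values are copied from the table; the `EvalRow` type and variable names are hypothetical, and whether intermediate checkpoints were saved or published is not stated in the card.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    step: int
    train_loss: float
    val_loss: float
    wer: float


# Rows copied from the training results table (epoch column omitted).
ROWS = [
    EvalRow(1000, 0.1644, 0.2328, 13.9821),
    EvalRow(2000, 0.0645, 0.2068, 12.2836),
    EvalRow(3000, 0.0665, 0.1559, 8.5712),
    EvalRow(4000, 0.0073, 0.1765, 8.9126),
    EvalRow(5000, 0.0013, 0.1907, 8.7752),
    EvalRow(6000, 0.0008, 0.1969, 9.1731),
]

best_by_wer = min(ROWS, key=lambda row: row.wer)
best_by_loss = min(ROWS, key=lambda row: row.val_loss)
final = ROWS[-1]

# Gap between the reported (final) WER and the best WER seen during training.
wer_gap = final.wer - best_by_wer.wer

if __name__ == "__main__":
    print(f"best WER: step {best_by_wer.step} ({best_by_wer.wer})")
    print(f"best validation loss: step {best_by_loss.step} ({best_by_loss.val_loss})")
    print(f"final minus best WER: {wer_gap:.4f}")
```

Both criteria select step 3000. After that point the training loss keeps falling while the validation loss rises, a pattern that often indicates overfitting, so an earlier checkpoint may generalize better if one is available.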

Framework versions

Transformers 4.45.2
PyTorch 2.9.1+cu128
Datasets 4.4.1
Tokenizers 0.20.3