library_name: transformers
language:
  - ha
license: apache-2.0
base_model: openai/whisper-small
tags:
  - speech
  - asr
  - hausa
  - whisper
  - generated_from_trainer
datasets:
  - publica-ai/hausa-whisper-dataset
metrics:
  - wer
model-index:
  - name: Whisper Small Hausa Fine-Tuned
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: Publica AI Hausa Dataset (184 hours)
          type: publica-ai/hausa-whisper-dataset
        metrics:
          - name: Wer
            type: wer
            value: 9.173059091292252

Whisper Small Hausa Fine-Tuned

This model is a fine-tuned version of openai/whisper-small on the Publica AI Hausa Dataset (184 hours). It achieves the following results on the evaluation set, matching the last logged evaluation (step 6000):

  • Loss: 0.1969
  • Wer: 9.1731
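For readers unfamiliar with the metric, the sketch below shows how a word error rate is typically computed: the word-level edit distance between reference and hypothesis, divided by the number of reference words. The card does not say which implementation or text normalization was used, and scaling the value to a percentage is an assumption here (a figure of 9.1731 is most plausibly a percentage). The function name and the example strings are illustrative only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance over reference length, as a percentage.

    Assumption: whitespace tokenization and no text normalization;
    the model card does not document either choice.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix seen
    # so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# One substituted word out of four reference words.
print(word_error_rate("a b c d", "a x c d"))
```

Corpus-level tools usually sum the errors and the reference words over all utterances before dividing, rather than averaging per-utterance scores, so a per-sentence helper like this one is only a building block.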

Model description

More information needed

Intended uses & limitations

More information needed
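Until this section is filled in, the following is a minimal sketch of how a Whisper checkpoint like this one might be loaded for transcription with the Transformers `pipeline` API. The repository id is a placeholder and must be replaced with the actual one; the helper names are hypothetical, and forcing the language to Hausa assumes the checkpoint keeps the multilingual generation config of the base model. Decoding an audio file from a path also requires ffmpeg to be installed.

```python
# Placeholder: replace <namespace> with the actual owner of the repository.
MODEL_ID = "<namespace>/whisper-small-hausa"


def generation_kwargs(language: str = "hausa", task: str = "transcribe") -> dict:
    """Generation options that pin Whisper to Hausa transcription.

    Assumption: the fine-tuned checkpoint still accepts the `language`
    and `task` arguments of the multilingual base model.
    """
    return {"language": language, "task": task}


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe one audio file and return the recognized text."""
    # Imported lazily so the module can be loaded without transformers.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path, generate_kwargs=generation_kwargs())
    return result["text"]
```

A call such as `transcribe("sample.wav")` downloads the weights on first use; `sample.wav` is a hypothetical file name.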

Training and evaluation data

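The model was trained and evaluated on publica-ai/hausa-whisper-dataset (Publica AI Hausa Dataset, 184 hours). More information needed on how the data was split.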

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 10
  • mixed_precision_training: Native AMP
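The arithmetic behind two of these settings can be sketched as follows: the effective batch size, and the learning rate produced by a linear schedule with warmup. The single-device setup and the total of 6130 optimizer steps are assumptions; the latter is estimated from the last row of the results table (step 6000 at epoch 9.7879, about 613 steps per epoch, times 10 epochs). The function names are illustrative.

```python
PER_DEVICE_BATCH = 8
GRAD_ACCUM_STEPS = 4
NUM_DEVICES = 1      # assumption: the card does not list the device count
PEAK_LR = 5e-05
WARMUP_STEPS = 500
TOTAL_STEPS = 6130   # assumption: ~613 steps/epoch x 10 epochs


def effective_batch_size(per_device: int = PER_DEVICE_BATCH,
                         accum: int = GRAD_ACCUM_STEPS,
                         devices: int = NUM_DEVICES) -> int:
    """Number of samples contributing to each optimizer update."""
    return per_device * accum * devices


def linear_lr(step: int,
              peak_lr: float = PEAK_LR,
              warmup_steps: int = WARMUP_STEPS,
              total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate at `step`: linear ramp up, then linear decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


print(effective_batch_size())  # 8 * 4 * 1 = 32, matching total_train_batch_size
for s in (0, 250, 500, 3000, 6000):
    print(s, linear_lr(s))
```

Under these assumptions the rate peaks at step 500 and has almost reached zero by the last logged evaluation at step 6000.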

Training results

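| Training Loss | Epoch  | Step | Validation Loss | Wer     |
|---------------|--------|------|-----------------|---------|
| 0.1644        | 1.6313 | 1000 | 0.2328          | 13.9821 |
| 0.0645        | 3.2626 | 2000 | 0.2068          | 12.2836 |
| 0.0665        | 4.8940 | 3000 | 0.1559          | 8.5712  |
| 0.0073        | 6.5253 | 4000 | 0.1765          | 8.9126  |
| 0.0013        | 8.1566 | 5000 | 0.1907          | 8.7752  |
| 0.0008        | 9.7879 | 6000 | 0.1969          | 9.1731  |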
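The logged evaluations can be compared programmatically. The sketch below copies the step, loss, and WER columns from the table and picks the row with the lowest WER; the `EvalRow` type is illustrative. It is only a reading of the logged numbers: the card does not say whether the published weights come from the final step or from an earlier checkpoint.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    step: int
    train_loss: float
    val_loss: float
    wer: float


# Values copied from the training results table.
ROWS = [
    EvalRow(1000, 0.1644, 0.2328, 13.9821),
    EvalRow(2000, 0.0645, 0.2068, 12.2836),
    EvalRow(3000, 0.0665, 0.1559, 8.5712),
    EvalRow(4000, 0.0073, 0.1765, 8.9126),
    EvalRow(5000, 0.0013, 0.1907, 8.7752),
    EvalRow(6000, 0.0008, 0.1969, 9.1731),
]

best = min(ROWS, key=lambda row: row.wer)  # lowest WER among logged evals
final = ROWS[-1]                           # last logged evaluation

print(f"best:  step {best.step}, WER {best.wer}, val loss {best.val_loss}")
print(f"final: step {final.step}, WER {final.wer}, val loss {final.val_loss}")
print(f"gap:   {final.wer - best.wer:.4f} WER points")
```

In these logs the lowest WER and the lowest validation loss both occur at step 3000, while the training loss keeps falling afterwards.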

Framework versions

  • Transformers 4.45.2
  • Pytorch 2.9.1+cu128
  • Datasets 4.4.1
  • Tokenizers 0.20.3