Model Card for Whisper Small Turkish
This model is a fine-tuned version of openai/whisper-small on the Mozilla Common Voice 23.0 Turkish dataset.
Key Features & Robustness
Standard ASR models often degrade in noisy environments. This model addresses that by applying JIT (Just-In-Time) augmentation during training.
The model was exposed to the following synthetic degradations dynamically during the training loop:
- Gaussian Noise Injection: Simulating background static and environmental noise.
- Time Stretching: Randomly speeding up or slowing down speech (0.8x - 1.2x) to handle fast/slow speakers.
- Frequency Masking: Simulating codec loss or bad microphone quality.
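The three degradations above can be sketched roughly as follows. This is a minimal NumPy illustration, not the training code used for this model: the function names, the noise level, the mask width, and the per-augmentation probability `p` are all assumptions. The speed change uses plain linear-interpolation resampling as a stand-in, which also shifts pitch, whereas a real time stretch would normally preserve it.

```python
import numpy as np

def add_gaussian_noise(wave, rng, noise_std=0.005):
    """Add zero-mean Gaussian noise (noise_std is an assumed value)."""
    return wave + rng.normal(0.0, noise_std, size=wave.shape)

def change_speed(wave, rate):
    """Resample so the clip plays `rate` times faster (rate > 1 shortens it).

    Simplification: linear interpolation also shifts pitch.
    """
    n_out = max(1, int(round(len(wave) / rate)))
    old_idx = np.arange(len(wave))
    new_idx = np.linspace(0, len(wave) - 1, num=n_out)
    return np.interp(new_idx, old_idx, wave)

def mask_frequencies(spec, rng, max_width=8):
    """Zero out one random band of consecutive frequency bins (rows)."""
    spec = spec.copy()
    width = int(rng.integers(1, min(max_width, spec.shape[0]) + 1))
    start = int(rng.integers(0, spec.shape[0] - width + 1))
    spec[start:start + width, :] = 0.0
    return spec

def augment_waveform(wave, rng, p=0.5):
    """Apply each waveform-level augmentation independently with probability p."""
    if rng.random() < p:
        wave = change_speed(wave, rate=rng.uniform(0.8, 1.2))
    if rng.random() < p:
        wave = add_gaussian_noise(wave, rng)
    return wave
```

Calling `augment_waveform` inside the dataset's `__getitem__` or the data collator means every epoch sees a differently degraded copy of each clip, which is the point of doing the augmentation just in time. `mask_frequencies` would be applied afterwards to the log-mel features rather than to the raw waveform.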
Result: In the reported evaluation, WER stays at roughly 20% on both clean and noisy/distorted audio, which suggests the model keeps its transcription accuracy even when the input has a low Signal-to-Noise Ratio (SNR).
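For readers who want to probe this themselves, SNR in decibels is 10·log10(signal power / noise power). The sketch below shows one way to mix noise into a clean clip at a chosen SNR. The helper names and the 5 dB example are illustrative assumptions; this card does not state which SNR levels were used.

```python
import numpy as np

def snr_db(clean, noise):
    """Signal-to-noise ratio in dB, from mean power of each signal."""
    signal_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    return 10.0 * np.log10(signal_power / noise_power)

def mix_at_snr(clean, noise, target_snr_db):
    """Scale `noise` so that clean + noise has the requested SNR.

    Returns the noisy mixture and the scaled noise that was added.
    """
    signal_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    scale = np.sqrt(signal_power / (noise_power * 10 ** (target_snr_db / 10)))
    scaled_noise = scale * noise
    return clean + scaled_noise, scaled_noise
```

For example, `mix_at_snr(clean, noise, 5.0)` produces a clip whose noise power is about a third of the speech power, a fairly harsh condition for ASR.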
Performance
| Metric | Condition | Performance |
|---|---|---|
| WER (Word Error Rate) | Clean Audio | ~20% |
| WER (Word Error Rate) | Noisy/Distorted Audio | ~20% (Robust) |
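As a reminder of what the metric measures, WER is the word-level edit distance (substitutions + deletions + insertions) divided by the number of reference words. The sketch below is a plain-Python illustration with whitespace tokenization and no text normalization. Both choices are assumptions, and the card does not say which tool or normalization produced the figures above.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the ref words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

With the reference "bugün hava çok güzel" and the hypothesis "bugün hava güzel", one of four words is dropped, so the function returns 0.25.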
Usage
You can use this model directly with the Hugging Face pipeline.
```python
import torch
from transformers import pipeline

# 1. Load the pipeline
device = "cuda" if torch.cuda.is_available() else "cpu"

pipe = pipeline(
    "automatic-speech-recognition",
    model="ogulcanakca/whisper-small-tr",
    device=device,
    generate_kwargs={
        "length_penalty": 1.5,
        "no_repeat_ngram_size": 2,
        "language": "turkish",
        "task": "transcribe",
        "compression_ratio_threshold": 1.35,
    },
)

# 2. Transcribe audio (can be a file path or URL)
# The pipeline decodes the file and resamples it automatically (requires ffmpeg).
result = pipe("path_to_your_audio.mp3")
print(result["text"])
```
Parameter Details
- per_device_train_batch_size=64
- gradient_accumulation_steps=1
- gradient_checkpointing=False
- fp16=True
- dataloader_num_workers=8
- dataloader_pin_memory=True
- learning_rate=1e-5
- num_train_epochs=5
- per_device_eval_batch_size=32
- predict_with_generate=True
- generation_max_length=225
- save_steps=1000
- eval_steps=1000
- warmup_steps=500
- logging_steps=10
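These names match fields of `Seq2SeqTrainingArguments` in transformers, so the configuration could plausibly be reproduced along the lines below. This is a sketch under assumptions: the helper functions and the `output_dir` argument are hypothetical, and the card does not show the actual training script or any settings beyond those listed.

```python
# Values copied from the parameter list above.
training_config = {
    "per_device_train_batch_size": 64,
    "gradient_accumulation_steps": 1,
    "gradient_checkpointing": False,
    "fp16": True,
    "dataloader_num_workers": 8,
    "dataloader_pin_memory": True,
    "learning_rate": 1e-5,
    "num_train_epochs": 5,
    "per_device_eval_batch_size": 32,
    "predict_with_generate": True,
    "generation_max_length": 225,
    "save_steps": 1000,
    "eval_steps": 1000,
    "warmup_steps": 500,
    "logging_steps": 10,
}

def effective_train_batch_size(config, num_devices=1):
    """Samples consumed per optimizer step."""
    return (
        config["per_device_train_batch_size"]
        * config["gradient_accumulation_steps"]
        * num_devices
    )

def build_training_args(output_dir):
    """Hypothetical helper: turn the dict into Seq2SeqTrainingArguments.

    transformers is imported lazily. fp16=True generally needs a GPU,
    so constructing the arguments on a CPU-only machine may fail.
    """
    from transformers import Seq2SeqTrainingArguments
    return Seq2SeqTrainingArguments(output_dir=output_dir, **training_config)
```

With a single GPU and no gradient accumulation, `effective_train_batch_size(training_config)` is 64 samples per optimizer step.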
Training took approximately 67 minutes on an A100 GPU (80 GB).
Evaluation results
- WER on Common Voice 23.0 (Turkish), self-reported: 20.000