---
language: ["pt"]
tags:
- automatic-speech-recognition
- whisperx
- audio
- speech-to-text
- portuguese
license: mit
library_name: whisperx
base_model: openai/whisper-large-v3
datasets:
- your_dataset_name
metrics:
- wer
- cer
---

This model is a **fine-tuned [WhisperX](https://github.com/m-bain/whisperX)** variant of `openai/whisper-large-v3`, trained for **Portuguese (pt)** automatic speech recognition (ASR). It covers European, Brazilian, African, and Asian varieties of Portuguese.

The model comes from the [CAMÕES](https://arxiv.org/pdf/2508.19721) work.

---

- **Base model:** `openai/whisper-large-v3`
- **Architecture:** Transformer encoder–decoder
- **Training:** Fine-tuned on around 800 hours of Portuguese speech
- **Task:** Transcription (`task="transcribe"`)
- **Compute type:** float16 (recommended)

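The recommended `float16` compute type assumes a CUDA GPU. As a minimal sketch of how the device and compute type might be chosen together, the helper below falls back to `int8` on CPU. The names `pick_runtime` and `detect_cuda` are hypothetical, and the `int8` fallback is an illustrative assumption rather than a setting validated for this model.

```python
def pick_runtime(cuda_available: bool) -> tuple[str, str]:
    """Return a (device, compute_type) pair to pass to whisperx.load_model.

    Assumption: float16 on GPU, as the list above recommends; int8 on CPU is
    an illustrative fallback and may reduce accuracy.
    """
    if cuda_available:
        return "cuda", "float16"
    return "cpu", "int8"


def detect_cuda() -> bool:
    """Report whether PyTorch sees a CUDA device (False if torch is absent)."""
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_available()


device, compute_type = pick_runtime(detect_cuda())
print(f"device={device}, compute_type={compute_type}")
```

The resulting `device` and `compute_type` values can be passed to `whisperx.load_model` in place of hard-coded strings.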
```python
import whisperx

device = "cuda"
compute_type = "float16"

model = whisperx.load_model(
    "Miamoto/whisperx-pos-acordo",
    device=device,
    compute_type=compute_type,
    language="pt",
    task="transcribe"
)