---
library_name: transformers
license: mit
base_model: microsoft/speecht5_tts
tags:
- generated_from_trainer
model-index:
- name: iou-chapter-audio-dataset-force-aligned-speecht5
  results: []
---


# iou-chapter-audio-dataset-force-aligned-speecht5

This model is a fine-tuned version of [microsoft/speecht5_tts](https://huggingface.co/microsoft/speecht5_tts) on an unspecified dataset (the training run did not record a dataset name).
It achieves the following results on the evaluation set:
- Loss: 0.4800
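
Because this checkpoint is a fine-tune of `microsoft/speecht5_tts`, inference should follow the usual SpeechT5 text-to-speech flow in Transformers. The following is a minimal sketch, not a tested recipe for this model. The `synthesize` helper and its parameter names are hypothetical, the Hub repository id must be supplied by the caller because this card does not state it, and the sketch assumes the base model's processor, the `microsoft/speecht5_hifigan` vocoder, and a 512-dimensional x-vector speaker embedding all remain appropriate after fine-tuning.

```python
def synthesize(text, model_id, speaker_embedding,
               processor_id="microsoft/speecht5_tts",
               vocoder_id="microsoft/speecht5_hifigan"):
    """Return a mono waveform (NumPy array, 16 kHz) for `text`.

    model_id:          Hub id or local path of this fine-tuned checkpoint
                       (not stated in the card, so the caller provides it).
    speaker_embedding: 512-dimensional x-vector (list, array, or tensor).
    processor_id:      assumption: the base model's processor still applies.
    vocoder_id:        assumption: the stock SpeechT5 HiFi-GAN vocoder is used.
    """
    # Heavy imports are kept local so the helper can be defined without
    # loading torch or transformers up front.
    import torch
    from transformers import (
        SpeechT5ForTextToSpeech,
        SpeechT5HifiGan,
        SpeechT5Processor,
    )

    processor = SpeechT5Processor.from_pretrained(processor_id)
    model = SpeechT5ForTextToSpeech.from_pretrained(model_id)
    vocoder = SpeechT5HifiGan.from_pretrained(vocoder_id)

    # Tokenize the input text into model input ids.
    inputs = processor(text=text, return_tensors="pt")

    # generate_speech expects speaker embeddings with a leading batch axis.
    xvector = torch.as_tensor(speaker_embedding, dtype=torch.float32).reshape(1, -1)

    with torch.no_grad():
        waveform = model.generate_speech(
            inputs["input_ids"], xvector, vocoder=vocoder
        )
    return waveform.numpy()
```

A call might look like `audio = synthesize("Hello there.", "<your-namespace>/iou-chapter-audio-dataset-force-aligned-speecht5", xvector)`, where the namespace and the `xvector` source are placeholders. The returned array can be written to disk at a 16 kHz sample rate.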

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 8
- eval_batch_size: 8
- seed: 3407
- gradient_accumulation_steps: 4
- total_train_batch_size: 32
- optimizer: AdamW (fused PyTorch implementation, `adamw_torch_fused`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 4000
- training_steps: 40000
- mixed_precision_training: Native AMP
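
The relationship between these values can be sketched in plain Python. This is an illustration only: the function names are hypothetical, a single training device is assumed (consistent with 8 × 4 = 32), and the schedule is a simplified re-implementation of linear warmup followed by linear decay, not the Trainer's own code.

```python
LEARNING_RATE = 1e-4      # learning_rate
WARMUP_STEPS = 4_000      # lr_scheduler_warmup_steps
TRAINING_STEPS = 40_000   # training_steps
PER_DEVICE_BATCH = 8      # train_batch_size
GRAD_ACCUM = 4            # gradient_accumulation_steps
NUM_DEVICES = 1           # assumption: not stated in the card


def effective_batch_size(per_device=PER_DEVICE_BATCH,
                         grad_accum=GRAD_ACCUM,
                         num_devices=NUM_DEVICES):
    """Examples consumed per optimizer step."""
    return per_device * grad_accum * num_devices


def linear_lr(step):
    """Learning rate at an optimizer step under linear warmup and decay."""
    if step < WARMUP_STEPS:
        # Ramp from 0 up to the peak learning rate.
        return LEARNING_RATE * step / WARMUP_STEPS
    # Decay linearly from the peak to 0 at the final step.
    remaining = max(0, TRAINING_STEPS - step)
    return LEARNING_RATE * remaining / (TRAINING_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size())
    for step in (0, 2_000, 4_000, 22_000, 40_000):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate peaks at 1e-4 at step 4000, the same step as the fourth row of the results table, and reaches zero at step 40000.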

### Training results

| Training Loss | Epoch    | Step  | Validation Loss |
|:-------------:|:--------:|:-----:|:---------------:|
| 0.5387        | 5.2918   | 1000  | 0.5152          |
| 0.4930        | 10.5836  | 2000  | 0.4955          |
| 0.4935        | 15.8753  | 3000  | 0.4885          |
| 0.4846        | 21.1645  | 4000  | 0.4863          |
| 0.4717        | 26.4562  | 5000  | 0.4825          |
| 0.4532        | 31.7480  | 6000  | 0.4804          |
| 0.4841        | 37.0371  | 7000  | 0.4802          |
| 0.4580        | 42.3289  | 8000  | 0.4791          |
| 0.4454        | 47.6207  | 9000  | 0.4822          |
| 0.4461        | 52.9125  | 10000 | 0.4790          |
| 0.4362        | 58.2016  | 11000 | 0.4789          |
| 0.4301        | 63.4934  | 12000 | 0.4789          |
| 0.4300        | 68.7851  | 13000 | 0.4806          |
| 0.4392        | 74.0743  | 14000 | 0.4796          |
| 0.4355        | 79.3660  | 15000 | 0.4797          |
| 0.4273        | 84.6578  | 16000 | 0.4778          |
| 0.4324        | 89.9496  | 17000 | 0.4808          |
| 0.4239        | 95.2387  | 18000 | 0.4792          |
| 0.4174        | 100.5305 | 19000 | 0.4786          |
| 0.4206        | 105.8223 | 20000 | 0.4777          |
| 0.4104        | 111.1114 | 21000 | 0.4784          |
| 0.4121        | 116.4032 | 22000 | 0.4797          |
| 0.4087        | 121.6950 | 23000 | 0.4800          |
| 0.4115        | 126.9867 | 24000 | 0.4788          |
| 0.4050        | 132.2759 | 25000 | 0.4799          |
| 0.4091        | 137.5676 | 26000 | 0.4795          |
| 0.4165        | 142.8594 | 27000 | 0.4799          |
| 0.4059        | 148.1485 | 28000 | 0.4792          |
| 0.4092        | 153.4403 | 29000 | 0.4797          |
| 0.4006        | 158.7321 | 30000 | 0.4791          |
| 0.4033        | 164.0212 | 31000 | 0.4789          |
| 0.3929        | 169.3130 | 32000 | 0.4796          |
| 0.4024        | 174.6048 | 33000 | 0.4803          |
| 0.3988        | 179.8966 | 34000 | 0.4785          |
| 0.3965        | 185.1857 | 35000 | 0.4792          |
| 0.3914        | 190.4775 | 36000 | 0.4795          |
| 0.3967        | 195.7692 | 37000 | 0.4811          |
| 0.3994        | 201.0584 | 38000 | 0.4800          |
| 0.4019        | 206.3501 | 39000 | 0.4805          |
| 0.4005        | 211.6419 | 40000 | 0.4800          |
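
In the table above, training loss keeps falling while validation loss stays within a narrow band from roughly step 8000 onward. The short sketch below, which copies the validation-loss column by hand, locates the lowest logged validation loss. It is only a reading aid for the table: the card does not say which checkpoint was saved as the final model, and the variable names are illustrative.

```python
# Validation loss at steps 1000, 2000, ..., 40000, copied from the table.
val_losses = [
    0.5152, 0.4955, 0.4885, 0.4863, 0.4825, 0.4804, 0.4802, 0.4791,
    0.4822, 0.4790, 0.4789, 0.4789, 0.4806, 0.4796, 0.4797, 0.4778,
    0.4808, 0.4792, 0.4786, 0.4777, 0.4784, 0.4797, 0.4800, 0.4788,
    0.4799, 0.4795, 0.4799, 0.4792, 0.4797, 0.4791, 0.4789, 0.4796,
    0.4803, 0.4785, 0.4792, 0.4795, 0.4811, 0.4800, 0.4805, 0.4800,
]

# Evaluation ran every 1000 optimizer steps.
steps = [1000 * (i + 1) for i in range(len(val_losses))]

# Pair each loss with its step and take the smallest loss.
best_loss, best_step = min(zip(val_losses, steps))
final_loss = val_losses[-1]

print(f"lowest validation loss: {best_loss:.4f} at step {best_step}")
print(f"final validation loss:  {final_loss:.4f} at step {steps[-1]}")
print(f"gap: {final_loss - best_loss:.4f}")
```

The lowest logged value is 0.4777 at step 20000, against 0.4800 at the final step, so the later half of training brought no measurable gain on the evaluation set.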


### Framework versions

- Transformers 4.57.1
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1