# 45b0a8381455abf27e4627a11e2f2df2
This model is a fine-tuned version of google/umt5-small on the Helsinki-NLP/opus_books [es-nl] dataset. It achieves the following results on the evaluation set:
- Loss: 2.3715
- Data Size: 1.0 (fraction of the training set used)
- Epoch Runtime: 125.8449 seconds
- BLEU: 6.9178
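For reference, below is a minimal inference sketch, assuming the checkpoint is available under the Hub repo id `contemmcm/45b0a8381455abf27e4627a11e2f2df2` and that plain Spanish sentences are passed without a task prefix; how inputs were formatted during fine-tuning is not documented in this card.

```python
# Minimal es->nl inference sketch. The repo id and the absence of a task
# prefix are assumptions; adjust to how the checkpoint was actually trained.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "contemmcm/45b0a8381455abf27e4627a11e2f2df2"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "La vida es un sueño."  # example Spanish input
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```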
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
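The base dataset named above can be loaded with the `datasets` library. The sketch below is an assumption about how the es-nl pair was accessed; the actual train/validation split used for this run is not documented.

```python
# Sketch of loading the Helsinki-NLP/opus_books es-nl pair; the evaluation
# split used for this run is not documented, so a manual split is shown.
from datasets import load_dataset

raw = load_dataset("Helsinki-NLP/opus_books", "es-nl")
split = raw["train"].train_test_split(test_size=0.1, seed=42)  # assumed split
example = split["train"][0]["translation"]
print(example["es"], "->", example["nl"])
```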
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
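The same settings can be expressed with `Seq2SeqTrainingArguments`; this is a sketch under the assumption that the standard Hugging Face `Seq2SeqTrainer` was used, with a placeholder output directory. The per-device batch size of 8 across 4 GPUs gives the reported total of 32.

```python
# Sketch of the hyperparameters above as Seq2SeqTrainingArguments.
# output_dir is a placeholder; launching on 4 GPUs (e.g. via torchrun or
# accelerate) yields the total train/eval batch size of 32 reported above.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="umt5-small-es-nl",      # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=50,
    lr_scheduler_type="constant",
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    predict_with_generate=True,         # needed to compute BLEU at evaluation
)
```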
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size (fraction) | Epoch Runtime (s) | BLEU |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 16.1613 | 0 | 11.1373 | 0.2179 |
| No log | 1 | 806 | 15.3418 | 0.0078 | 13.2162 | 0.2242 |
| No log | 2 | 1612 | 14.2340 | 0.0156 | 13.5886 | 0.2406 |
| No log | 3 | 2418 | 12.3226 | 0.0312 | 15.9858 | 0.2663 |
| 0.5028 | 4 | 3224 | 8.6599 | 0.0625 | 18.9117 | 0.3353 |
| 8.5516 | 5 | 4030 | 5.9076 | 0.125 | 25.8675 | 0.2173 |
| 6.0636 | 6 | 4836 | 4.4264 | 0.25 | 39.9435 | 1.4878 |
| 4.9774 | 7 | 5642 | 3.8175 | 0.5 | 68.4451 | 1.4686 |
| 4.3314 | 8.0 | 6448 | 3.3833 | 1.0 | 124.6386 | 2.5241 |
| 4.0404 | 9.0 | 7254 | 3.2129 | 1.0 | 124.6684 | 3.1086 |
| 3.852 | 10.0 | 8060 | 3.0903 | 1.0 | 123.8259 | 3.5210 |
| 3.7543 | 11.0 | 8866 | 3.0192 | 1.0 | 124.1760 | 3.8093 |
| 3.6092 | 12.0 | 9672 | 2.9572 | 1.0 | 123.8549 | 4.0382 |
| 3.4882 | 13.0 | 10478 | 2.8997 | 1.0 | 123.8560 | 4.2709 |
| 3.4634 | 14.0 | 11284 | 2.8587 | 1.0 | 124.7755 | 4.4389 |
| 3.3405 | 15.0 | 12090 | 2.8200 | 1.0 | 124.1348 | 4.6437 |
| 3.3242 | 16.0 | 12896 | 2.7891 | 1.0 | 124.2178 | 4.7456 |
| 3.2575 | 17.0 | 13702 | 2.7580 | 1.0 | 126.5326 | 4.8620 |
| 3.1646 | 18.0 | 14508 | 2.7379 | 1.0 | 124.5723 | 4.9957 |
| 3.2003 | 19.0 | 15314 | 2.7100 | 1.0 | 124.2901 | 5.1351 |
| 3.1203 | 20.0 | 16120 | 2.6851 | 1.0 | 124.0864 | 5.2391 |
| 3.1237 | 21.0 | 16926 | 2.6673 | 1.0 | 125.1696 | 5.3678 |
| 3.074 | 22.0 | 17732 | 2.6450 | 1.0 | 124.7209 | 5.4648 |
| 3.0565 | 23.0 | 18538 | 2.6285 | 1.0 | 126.6161 | 5.5100 |
| 3.0196 | 24.0 | 19344 | 2.6064 | 1.0 | 125.0491 | 5.6071 |
| 2.9592 | 25.0 | 20150 | 2.5933 | 1.0 | 125.4321 | 5.6763 |
| 2.9344 | 26.0 | 20956 | 2.5823 | 1.0 | 124.1777 | 5.7775 |
| 2.8965 | 27.0 | 21762 | 2.5638 | 1.0 | 124.3894 | 5.8160 |
| 2.8433 | 28.0 | 22568 | 2.5533 | 1.0 | 125.8283 | 5.9292 |
| 2.827 | 29.0 | 23374 | 2.5407 | 1.0 | 125.3586 | 5.9949 |
| 2.809 | 30.0 | 24180 | 2.5272 | 1.0 | 127.2397 | 6.0482 |
| 2.8059 | 31.0 | 24986 | 2.5104 | 1.0 | 126.4622 | 6.1045 |
| 2.7737 | 32.0 | 25792 | 2.5053 | 1.0 | 126.8988 | 6.1732 |
| 2.7757 | 33.0 | 26598 | 2.4957 | 1.0 | 124.4287 | 6.2143 |
| 2.7107 | 34.0 | 27404 | 2.4840 | 1.0 | 123.9952 | 6.2781 |
| 2.7373 | 35.0 | 28210 | 2.4735 | 1.0 | 124.1019 | 6.3163 |
| 2.6762 | 36.0 | 29016 | 2.4649 | 1.0 | 125.3603 | 6.4374 |
| 2.6675 | 37.0 | 29822 | 2.4582 | 1.0 | 125.1333 | 6.3972 |
| 2.6587 | 38.0 | 30628 | 2.4458 | 1.0 | 125.8592 | 6.4501 |
| 2.6667 | 39.0 | 31434 | 2.4383 | 1.0 | 125.4466 | 6.5097 |
| 2.5925 | 40.0 | 32240 | 2.4312 | 1.0 | 125.0926 | 6.5206 |
| 2.6437 | 41.0 | 33046 | 2.4247 | 1.0 | 125.0248 | 6.5670 |
| 2.5847 | 42.0 | 33852 | 2.4254 | 1.0 | 125.6922 | 6.6196 |
| 2.5431 | 43.0 | 34658 | 2.4119 | 1.0 | 125.0074 | 6.6407 |
| 2.5189 | 44.0 | 35464 | 2.4139 | 1.0 | 125.0615 | 6.7047 |
| 2.5496 | 45.0 | 36270 | 2.4026 | 1.0 | 124.7657 | 6.7689 |
| 2.5257 | 46.0 | 37076 | 2.3971 | 1.0 | 124.6892 | 6.7871 |
| 2.4735 | 47.0 | 37882 | 2.3828 | 1.0 | 126.2574 | 6.8031 |
| 2.498 | 48.0 | 38688 | 2.3797 | 1.0 | 126.3423 | 6.8484 |
| 2.4705 | 49.0 | 39494 | 2.3811 | 1.0 | 126.3897 | 6.9155 |
| 2.448 | 50.0 | 40300 | 2.3715 | 1.0 | 125.8449 | 6.9178 |
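The BLEU column is most plausibly a sacreBLEU score computed on the generated evaluation translations; the exact metric setup is not documented, but a typical computation with the `evaluate` library looks like the sketch below.

```python
# Sketch of a BLEU computation with the evaluate library (sacreBLEU);
# the exact metric configuration used for this run is not documented.
import evaluate

bleu = evaluate.load("sacrebleu")
predictions = ["Het leven is een droom."]    # decoded model outputs
references = [["Het leven is een droom."]]   # one reference list per prediction
print(bleu.compute(predictions=predictions, references=references)["score"])
```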
### Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1
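To reproduce results in a matching environment, the installed library versions can be checked against the list above; the expected values in the comments come directly from this card.

```python
# Quick check that the environment matches the versions listed above.
import datasets, tokenizers, torch, transformers

print(transformers.__version__)  # expected: 4.57.0
print(torch.__version__)         # expected: 2.8.0+cu128
print(datasets.__version__)      # expected: 4.2.0
print(tokenizers.__version__)    # expected: 0.22.1
```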