# 1882e76605485bae427976f8e42669c2
This model is a fine-tuned version of google/umt5-small on the Helsinki-NLP/opus_books [de-fr] dataset. It achieves the following results on the evaluation set:
- Loss: 2.1014
- Data Size: 1.0
- Epoch Runtime: 138.2293
- Bleu: 7.0305
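A minimal inference sketch for this checkpoint (assuming the repository id shown on this card; the example sentence and generation settings are illustrative, not part of the training setup):

```python
# Load the fine-tuned umt5 checkpoint and translate a German sentence
# to French. The repository id is taken from this card; adjust it if
# the model is hosted under a different name.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "contemmcm/1882e76605485bae427976f8e42669c2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "Der kleine Prinz setzte sich auf einen Stein."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Given the modest final BLEU (7.03), outputs should be treated as a baseline rather than production-quality translations.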
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08); no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
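The effective batch size follows from the per-device setting and the device count, and the per-epoch step increments in the results table are consistent with it. A quick sanity check in plain Python (all numbers copied from this card):

```python
# Verify the effective batch size and per-epoch step count reported
# in this card against the hyperparameters above.
train_batch_size = 8   # per device
num_devices = 4
total_train_batch_size = train_batch_size * num_devices
print(total_train_batch_size)  # 32, matching total_train_batch_size above

# At full data size the step counter advances by 872 per epoch
# (e.g. step 6976 at epoch 8 vs. step 6104 at epoch 7), so each pass
# covers roughly 872 * 32 training examples.
steps_per_epoch = 6976 - 6104
approx_examples_per_epoch = steps_per_epoch * total_train_batch_size
print(steps_per_epoch, approx_examples_per_epoch)  # 872 27904
```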
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 15.2819 | 0 | 12.1977 | 0.1157 |
| No log | 1 | 872 | 14.0570 | 0.0078 | 13.7404 | 0.1506 |
| No log | 2 | 1744 | 12.5783 | 0.0156 | 14.3323 | 0.1256 |
| 0.2768 | 3 | 2616 | 10.6469 | 0.0312 | 16.6890 | 0.1587 |
| 0.8402 | 4 | 3488 | 7.5471 | 0.0625 | 20.3658 | 0.2483 |
| 8.7522 | 5 | 4360 | 5.0443 | 0.125 | 28.8127 | 0.6140 |
| 5.6582 | 6 | 5232 | 3.9904 | 0.25 | 44.6548 | 1.9767 |
| 4.526 | 7 | 6104 | 3.3695 | 0.5 | 75.0160 | 1.6519 |
| 3.8898 | 8.0 | 6976 | 2.9352 | 1.0 | 138.2087 | 2.7400 |
| 3.6109 | 9.0 | 7848 | 2.7868 | 1.0 | 138.0720 | 3.2548 |
| 3.4636 | 10.0 | 8720 | 2.6965 | 1.0 | 140.5587 | 3.5898 |
| 3.3409 | 11.0 | 9592 | 2.6354 | 1.0 | 139.4747 | 3.8679 |
| 3.2077 | 12.0 | 10464 | 2.5838 | 1.0 | 139.4930 | 4.0654 |
| 3.1691 | 13.0 | 11336 | 2.5415 | 1.0 | 139.2214 | 4.3071 |
| 3.0689 | 14.0 | 12208 | 2.5128 | 1.0 | 139.4797 | 4.4992 |
| 3.01 | 15.0 | 13080 | 2.4731 | 1.0 | 139.5060 | 4.6686 |
| 2.9516 | 16.0 | 13952 | 2.4451 | 1.0 | 141.0948 | 4.8355 |
| 2.8595 | 17.0 | 14824 | 2.4169 | 1.0 | 140.2607 | 4.9649 |
| 2.8268 | 18.0 | 15696 | 2.3933 | 1.0 | 139.4680 | 5.0970 |
| 2.8319 | 19.0 | 16568 | 2.3745 | 1.0 | 141.9063 | 5.1924 |
| 2.7745 | 20.0 | 17440 | 2.3620 | 1.0 | 139.9242 | 5.3077 |
| 2.7716 | 21.0 | 18312 | 2.3411 | 1.0 | 141.6721 | 5.3964 |
| 2.7218 | 22.0 | 19184 | 2.3173 | 1.0 | 145.8132 | 5.5163 |
| 2.6581 | 23.0 | 20056 | 2.3053 | 1.0 | 145.6531 | 5.5892 |
| 2.6342 | 24.0 | 20928 | 2.2878 | 1.0 | 146.5871 | 5.6683 |
| 2.5885 | 25.0 | 21800 | 2.2814 | 1.0 | 146.2511 | 5.7737 |
| 2.5772 | 26.0 | 22672 | 2.2685 | 1.0 | 145.9510 | 5.8343 |
| 2.5563 | 27.0 | 23544 | 2.2579 | 1.0 | 144.9519 | 5.9332 |
| 2.5135 | 28.0 | 24416 | 2.2527 | 1.0 | 145.6331 | 6.0058 |
| 2.4737 | 29.0 | 25288 | 2.2346 | 1.0 | 145.9113 | 6.0698 |
| 2.509 | 30.0 | 26160 | 2.2268 | 1.0 | 144.7014 | 6.1406 |
| 2.473 | 31.0 | 27032 | 2.2114 | 1.0 | 144.6324 | 6.2047 |
| 2.4271 | 32.0 | 27904 | 2.2087 | 1.0 | 144.0634 | 6.2594 |
| 2.4121 | 33.0 | 28776 | 2.1971 | 1.0 | 144.3472 | 6.3215 |
| 2.3709 | 34.0 | 29648 | 2.1846 | 1.0 | 144.6133 | 6.3849 |
| 2.3713 | 35.0 | 30520 | 2.1893 | 1.0 | 144.6684 | 6.4302 |
| 2.3675 | 36.0 | 31392 | 2.1726 | 1.0 | 145.7501 | 6.4806 |
| 2.3349 | 37.0 | 32264 | 2.1665 | 1.0 | 144.5169 | 6.5537 |
| 2.3164 | 38.0 | 33136 | 2.1596 | 1.0 | 144.5251 | 6.5644 |
| 2.2996 | 39.0 | 34008 | 2.1523 | 1.0 | 144.7580 | 6.6312 |
| 2.2524 | 40.0 | 34880 | 2.1485 | 1.0 | 145.3244 | 6.6413 |
| 2.2727 | 41.0 | 35752 | 2.1414 | 1.0 | 145.4038 | 6.6935 |
| 2.239 | 42.0 | 36624 | 2.1388 | 1.0 | 143.3335 | 6.7214 |
| 2.2324 | 43.0 | 37496 | 2.1298 | 1.0 | 143.7957 | 6.7895 |
| 2.2502 | 44.0 | 38368 | 2.1236 | 1.0 | 144.5139 | 6.8473 |
| 2.2186 | 45.0 | 39240 | 2.1221 | 1.0 | 145.2851 | 6.8749 |
| 2.1839 | 46.0 | 40112 | 2.1160 | 1.0 | 139.0024 | 6.8615 |
| 2.1547 | 47.0 | 40984 | 2.1130 | 1.0 | 138.4339 | 6.9186 |
| 2.1398 | 48.0 | 41856 | 2.1049 | 1.0 | 138.8303 | 6.9564 |
| 2.151 | 49.0 | 42728 | 2.1017 | 1.0 | 137.8470 | 6.9989 |
| 2.0868 | 50.0 | 43600 | 2.1014 | 1.0 | 138.2293 | 7.0305 |
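The Bleu column is a corpus-level BLEU score. As a reference for how the metric behaves, here is a simplified sentence-level BLEU-4 in plain Python (uniform n-gram weights and a brevity penalty, following the standard formulation, with no smoothing; real evaluations should use sacrebleu or the evaluate library instead):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(hypothesis, reference, max_n=4):
    """Simplified BLEU: clipped n-gram precisions, geometric mean, brevity penalty."""
    hyp, ref = hypothesis.split(), reference.split()
    log_prec_sum = 0.0
    for n in range(1, max_n + 1):
        hyp_counts = Counter(ngrams(hyp, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clip each hypothesis n-gram count by its count in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = max(sum(hyp_counts.values()), 1)
        if clipped == 0:
            return 0.0  # no smoothing: an empty n-gram level zeroes the score
        log_prec_sum += math.log(clipped / total)
    # Brevity penalty: punish hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / max(len(hyp), 1))
    return bp * math.exp(log_prec_sum / max_n)

print(bleu("der kleine prinz sass auf einem stein",
           "der kleine prinz sass auf einem stein"))  # 1.0 for an exact match
print(bleu("der grosse prinz sass auf einem stein",
           "der kleine prinz sass auf einem stein"))
```

Scores are usually reported scaled by 100, so the 7.03 above corresponds to roughly 0.07 on this 0-to-1 scale.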
## Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1