d8d8fd3ea3376c3aa3c9f3b5ae367e4d
This model is a fine-tuned version of google/umt5-small on the en-fr configuration of the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set (a brief usage sketch follows the metrics):
- Loss: 1.6234
- Data Size: 1.0 (fraction of the training data used)
- Epoch Runtime: 502.3375 s
- BLEU: 11.6107
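The sketch below shows one way to load the checkpoint for English-to-French inference. It is a minimal example, not the authors' documented usage: the repo id is taken from the model tree at the end of this card, and whether a task prefix (e.g. "translate English to French: ") was used during fine-tuning is not stated here.

```python
# Minimal inference sketch. Assumptions: the Hub repo id below (from the model tree
# at the end of this card) and that no task prefix is required; neither is documented.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "contemmcm/d8d8fd3ea3376c3aa3c9f3b5ae367e4d"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "The cat sat on the mat."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```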
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training (a hedged configuration sketch follows this list):
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch (AdamW) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
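A hedged reconstruction of these settings as `Seq2SeqTrainingArguments` is sketched below; the actual training script, output directory, and any generation or preprocessing options are assumptions and are not documented in this card.

```python
# Hypothetical reconstruction of the hyperparameters listed above; the actual
# training script and its remaining options are not documented in this card.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="umt5-small-opus-books-en-fr",  # assumed name
    learning_rate=5e-05,
    per_device_train_batch_size=8,   # 4 GPUs -> total train batch size 32
    per_device_eval_batch_size=8,    # 4 GPUs -> total eval batch size 32
    num_train_epochs=50,
    seed=42,
    lr_scheduler_type="constant",
    optim="adamw_torch",             # AdamW defaults: betas=(0.9, 0.999), eps=1e-08
    predict_with_generate=True,      # assumed, since BLEU is reported at each evaluation
)
```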
Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size (fraction) | Epoch Runtime (s) | BLEU |
|---|---|---|---|---|---|---|
| No log | 0.0 | 0 | 15.1108 | 0 | 40.9449 | 0.1801 |
| No log | 1.0 | 3177 | 12.2470 | 0.0078 | 44.7462 | 0.1632 |
| 0.2482 | 2.0 | 6354 | 8.6497 | 0.0156 | 49.0083 | 0.2088 |
| 8.6722 | 3.0 | 9531 | 5.9141 | 0.0312 | 56.0694 | 0.3121 |
| 5.9354 | 4.0 | 12708 | 3.9223 | 0.0625 | 69.6133 | 3.2295 |
| 4.5942 | 5.0 | 15885 | 3.3008 | 0.125 | 98.3453 | 2.8829 |
| 3.8614 | 6.0 | 19062 | 2.8293 | 0.25 | 156.2967 | 4.0112 |
| 3.3935 | 7.0 | 22239 | 2.5678 | 0.5 | 266.4938 | 5.1411 |
| 3.0488 | 8.0 | 25416 | 2.3702 | 1.0 | 512.2830 | 6.2382 |
| 2.8234 | 9.0 | 28593 | 2.2452 | 1.0 | 518.0117 | 6.9661 |
| 2.7209 | 10.0 | 31770 | 2.1731 | 1.0 | 515.0344 | 7.4730 |
| 2.6035 | 11.0 | 34947 | 2.1031 | 1.0 | 521.1200 | 7.9125 |
| 2.5055 | 12.0 | 38124 | 2.0647 | 1.0 | 517.4778 | 8.2337 |
| 2.4309 | 13.0 | 41301 | 2.0198 | 1.0 | 516.4580 | 8.5100 |
| 2.3719 | 14.0 | 44478 | 1.9804 | 1.0 | 515.0311 | 8.7866 |
| 2.3223 | 15.0 | 47655 | 1.9540 | 1.0 | 520.2747 | 8.9727 |
| 2.2457 | 16.0 | 50832 | 1.9312 | 1.0 | 516.9345 | 9.1736 |
| 2.2272 | 17.0 | 54009 | 1.9029 | 1.0 | 516.1580 | 9.3535 |
| 2.2188 | 18.0 | 57186 | 1.8812 | 1.0 | 518.6792 | 9.5288 |
| 2.1583 | 19.0 | 60363 | 1.8677 | 1.0 | 519.5041 | 9.6595 |
| 2.0955 | 20.0 | 63540 | 1.8466 | 1.0 | 522.0993 | 9.7797 |
| 2.0809 | 21.0 | 66717 | 1.8308 | 1.0 | 507.6019 | 9.9145 |
| 2.0634 | 22.0 | 69894 | 1.8122 | 1.0 | 499.3597 | 10.0769 |
| 2.0399 | 23.0 | 73071 | 1.8028 | 1.0 | 502.4851 | 10.1335 |
| 2.0418 | 24.0 | 76248 | 1.7894 | 1.0 | 503.3499 | 10.2583 |
| 2.0029 | 25.0 | 79425 | 1.7749 | 1.0 | 503.2419 | 10.3534 |
| 1.9805 | 26.0 | 82602 | 1.7636 | 1.0 | 500.1708 | 10.4152 |
| 1.9643 | 27.0 | 85779 | 1.7547 | 1.0 | 501.3360 | 10.5282 |
| 1.9555 | 28.0 | 88956 | 1.7424 | 1.0 | 502.3105 | 10.5978 |
| 1.9327 | 29.0 | 92133 | 1.7414 | 1.0 | 502.9374 | 10.6517 |
| 1.9168 | 30.0 | 95310 | 1.7332 | 1.0 | 502.5499 | 10.6957 |
| 1.8928 | 31.0 | 98487 | 1.7253 | 1.0 | 500.4714 | 10.7808 |
| 1.8703 | 32.0 | 101664 | 1.7154 | 1.0 | 506.9298 | 10.8111 |
| 1.8468 | 33.0 | 104841 | 1.7063 | 1.0 | 503.4817 | 10.9205 |
| 1.8673 | 34.0 | 108018 | 1.7023 | 1.0 | 505.0646 | 11.0093 |
| 1.8186 | 35.0 | 111195 | 1.6962 | 1.0 | 502.6725 | 11.0144 |
| 1.7871 | 36.0 | 114372 | 1.6844 | 1.0 | 502.7248 | 11.0728 |
| 1.8113 | 37.0 | 117549 | 1.6838 | 1.0 | 505.2796 | 11.1395 |
| 1.7836 | 38.0 | 120726 | 1.6791 | 1.0 | 501.3685 | 11.1425 |
| 1.7603 | 39.0 | 123903 | 1.6714 | 1.0 | 502.4841 | 11.2427 |
| 1.7565 | 40.0 | 127080 | 1.6582 | 1.0 | 501.6685 | 11.2670 |
| 1.7209 | 41.0 | 130257 | 1.6633 | 1.0 | 508.2311 | 11.3440 |
| 1.7196 | 42.0 | 133434 | 1.6549 | 1.0 | 502.7122 | 11.3669 |
| 1.7148 | 43.0 | 136611 | 1.6542 | 1.0 | 499.5352 | 11.3845 |
| 1.6946 | 44.0 | 139788 | 1.6517 | 1.0 | 504.2522 | 11.4227 |
| 1.6908 | 45.0 | 142965 | 1.6450 | 1.0 | 501.0214 | 11.4683 |
| 1.6435 | 46.0 | 146142 | 1.6402 | 1.0 | 500.5124 | 11.5187 |
| 1.6486 | 47.0 | 149319 | 1.6317 | 1.0 | 503.3988 | 11.5462 |
| 1.6063 | 48.0 | 152496 | 1.6335 | 1.0 | 503.3127 | 11.5403 |
| 1.6474 | 49.0 | 155673 | 1.6255 | 1.0 | 503.6295 | 11.6011 |
| 1.6467 | 50.0 | 158850 | 1.6234 | 1.0 | 502.3375 | 11.6107 |
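The BLEU column can in principle be reproduced by scoring the model's generated translations against the reference translations. The sketch below uses the `evaluate` library's sacreBLEU metric; the exact BLEU implementation used during training is not documented here.

```python
# Hedged sketch: scoring generated translations with sacreBLEU via the `evaluate`
# library. The exact metric implementation behind the table above is not documented.
import evaluate

bleu = evaluate.load("sacrebleu")
predictions = ["Le chat est assis sur le tapis."]       # model outputs
references = [["Le chat était assis sur le tapis."]]    # one list of references per prediction
result = bleu.compute(predictions=predictions, references=references)
print(round(result["score"], 4))
```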
Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1
Model tree for contemmcm/d8d8fd3ea3376c3aa3c9f3b5ae367e4d
- Base model: google/umt5-small