45b0a8381455abf27e4627a11e2f2df2

This model is a fine-tuned version of google/umt5-small on the Helsinki-NLP/opus_books [es-nl] dataset. It achieves the following results on the evaluation set:

  • Loss: 2.3715
  • Data Size: 1.0
  • Epoch Runtime: 125.8449
  • Bleu: 6.9178
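
Since the usage sections below are unfilled, here is a minimal inference sketch. It assumes the checkpoint is loaded by its repository id and that no task prefix was used during fine-tuning (this depends on the training script and is an assumption); the Spanish input sentence is illustrative.

```python
# Minimal sketch: Spanish -> Dutch translation with this checkpoint.
# Assumption: the fine-tuning script used no task prefix.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "contemmcm/45b0a8381455abf27e4627a11e2f2df2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("La vida es sueño.", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```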

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
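
Although this section is unfilled, the header identifies the corpus as the es-nl configuration of Helsinki-NLP/opus_books. A loading sketch with the datasets library follows; the 90/10 train/validation split is an illustrative assumption, since the card does not state how the evaluation set was carved out.

```python
from datasets import load_dataset

# Spanish-Dutch aligned sentences from the OPUS Books corpus.
books = load_dataset("Helsinki-NLP/opus_books", "es-nl")

# Illustrative held-out split (the actual split is not stated in the card).
books = books["train"].train_test_split(test_size=0.1, seed=42)
print(books["train"][0]["translation"])  # {'es': '...', 'nl': '...'}
```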

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
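
As a reproducibility aid, the list above maps onto Seq2SeqTrainingArguments roughly as sketched below. output_dir is a placeholder, predict_with_generate is an assumption (it is needed to produce the BLEU numbers reported), and the 4-GPU distributed launch (e.g. via torchrun) is implied by num_devices rather than by these arguments.

```python
from transformers import Seq2SeqTrainingArguments

# Per-device batch size 8 across 4 GPUs yields the reported total batch size of 32.
training_args = Seq2SeqTrainingArguments(
    output_dir="umt5-small-opus-books-es-nl",  # placeholder name
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,  # assumption: required to compute BLEU at eval time
)
```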

Training results

Data Size is the fraction of the training set used in each epoch (the schedule ramps from a small subset up to the full data), and Epoch Runtime is in seconds.

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime (s) | Bleu |
|---------------|-------|-------|-----------------|-----------|-------------------|--------|
| No log | 0 | 0 | 16.1613 | 0 | 11.1373 | 0.2179 |
| No log | 1 | 806 | 15.3418 | 0.0078 | 13.2162 | 0.2242 |
| No log | 2 | 1612 | 14.2340 | 0.0156 | 13.5886 | 0.2406 |
| No log | 3 | 2418 | 12.3226 | 0.0312 | 15.9858 | 0.2663 |
| 0.5028 | 4 | 3224 | 8.6599 | 0.0625 | 18.9117 | 0.3353 |
| 8.5516 | 5 | 4030 | 5.9076 | 0.125 | 25.8675 | 0.2173 |
| 6.0636 | 6 | 4836 | 4.4264 | 0.25 | 39.9435 | 1.4878 |
| 4.9774 | 7 | 5642 | 3.8175 | 0.5 | 68.4451 | 1.4686 |
| 4.3314 | 8 | 6448 | 3.3833 | 1.0 | 124.6386 | 2.5241 |
| 4.0404 | 9 | 7254 | 3.2129 | 1.0 | 124.6684 | 3.1086 |
| 3.852 | 10 | 8060 | 3.0903 | 1.0 | 123.8259 | 3.5210 |
| 3.7543 | 11 | 8866 | 3.0192 | 1.0 | 124.1760 | 3.8093 |
| 3.6092 | 12 | 9672 | 2.9572 | 1.0 | 123.8549 | 4.0382 |
| 3.4882 | 13 | 10478 | 2.8997 | 1.0 | 123.8560 | 4.2709 |
| 3.4634 | 14 | 11284 | 2.8587 | 1.0 | 124.7755 | 4.4389 |
| 3.3405 | 15 | 12090 | 2.8200 | 1.0 | 124.1348 | 4.6437 |
| 3.3242 | 16 | 12896 | 2.7891 | 1.0 | 124.2178 | 4.7456 |
| 3.2575 | 17 | 13702 | 2.7580 | 1.0 | 126.5326 | 4.8620 |
| 3.1646 | 18 | 14508 | 2.7379 | 1.0 | 124.5723 | 4.9957 |
| 3.2003 | 19 | 15314 | 2.7100 | 1.0 | 124.2901 | 5.1351 |
| 3.1203 | 20 | 16120 | 2.6851 | 1.0 | 124.0864 | 5.2391 |
| 3.1237 | 21 | 16926 | 2.6673 | 1.0 | 125.1696 | 5.3678 |
| 3.074 | 22 | 17732 | 2.6450 | 1.0 | 124.7209 | 5.4648 |
| 3.0565 | 23 | 18538 | 2.6285 | 1.0 | 126.6161 | 5.5100 |
| 3.0196 | 24 | 19344 | 2.6064 | 1.0 | 125.0491 | 5.6071 |
| 2.9592 | 25 | 20150 | 2.5933 | 1.0 | 125.4321 | 5.6763 |
| 2.9344 | 26 | 20956 | 2.5823 | 1.0 | 124.1777 | 5.7775 |
| 2.8965 | 27 | 21762 | 2.5638 | 1.0 | 124.3894 | 5.8160 |
| 2.8433 | 28 | 22568 | 2.5533 | 1.0 | 125.8283 | 5.9292 |
| 2.827 | 29 | 23374 | 2.5407 | 1.0 | 125.3586 | 5.9949 |
| 2.809 | 30 | 24180 | 2.5272 | 1.0 | 127.2397 | 6.0482 |
| 2.8059 | 31 | 24986 | 2.5104 | 1.0 | 126.4622 | 6.1045 |
| 2.7737 | 32 | 25792 | 2.5053 | 1.0 | 126.8988 | 6.1732 |
| 2.7757 | 33 | 26598 | 2.4957 | 1.0 | 124.4287 | 6.2143 |
| 2.7107 | 34 | 27404 | 2.4840 | 1.0 | 123.9952 | 6.2781 |
| 2.7373 | 35 | 28210 | 2.4735 | 1.0 | 124.1019 | 6.3163 |
| 2.6762 | 36 | 29016 | 2.4649 | 1.0 | 125.3603 | 6.4374 |
| 2.6675 | 37 | 29822 | 2.4582 | 1.0 | 125.1333 | 6.3972 |
| 2.6587 | 38 | 30628 | 2.4458 | 1.0 | 125.8592 | 6.4501 |
| 2.6667 | 39 | 31434 | 2.4383 | 1.0 | 125.4466 | 6.5097 |
| 2.5925 | 40 | 32240 | 2.4312 | 1.0 | 125.0926 | 6.5206 |
| 2.6437 | 41 | 33046 | 2.4247 | 1.0 | 125.0248 | 6.5670 |
| 2.5847 | 42 | 33852 | 2.4254 | 1.0 | 125.6922 | 6.6196 |
| 2.5431 | 43 | 34658 | 2.4119 | 1.0 | 125.0074 | 6.6407 |
| 2.5189 | 44 | 35464 | 2.4139 | 1.0 | 125.0615 | 6.7047 |
| 2.5496 | 45 | 36270 | 2.4026 | 1.0 | 124.7657 | 6.7689 |
| 2.5257 | 46 | 37076 | 2.3971 | 1.0 | 124.6892 | 6.7871 |
| 2.4735 | 47 | 37882 | 2.3828 | 1.0 | 126.2574 | 6.8031 |
| 2.498 | 48 | 38688 | 2.3797 | 1.0 | 126.3423 | 6.8484 |
| 2.4705 | 49 | 39494 | 2.3811 | 1.0 | 126.3897 | 6.9155 |
| 2.448 | 50 | 40300 | 2.3715 | 1.0 | 125.8449 | 6.9178 |
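
The Bleu column appears to be on the 0-100 scale used by SacreBLEU. Below is a sketch of computing such a score with the evaluate library; the card does not state the exact metric implementation used, so this is an assumption, and the strings are placeholders.

```python
import evaluate

sacrebleu = evaluate.load("sacrebleu")

predictions = ["Het leven is een droom."]   # decoded model outputs (placeholder)
references = [["Het leven is een droom."]]  # one list of reference strings per prediction
result = sacrebleu.compute(predictions=predictions, references=references)
print(round(result["score"], 4))  # same scale as the Bleu column above
```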

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1