iitb_punct_robust_finetuned_eng_Ltn_to_mar_Deva

This model is a fine-tuned version of ai4bharat/indictrans2-indic-indic-dist-320M on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2961
  • Bleu: 11.6647
  • Gen Len: 20.8736

Model description

More information needed

Intended uses & limitations

More information needed
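
In the meantime, the checkpoint should load like any other IndicTrans2 model. Below is a minimal inference sketch for English → Marathi, assuming the ai4bharat IndicTransToolkit package for pre- and post-processing and the flores-style codes eng_Latn/mar_Deva (the model name abbreviates the former as eng_Ltn); this is a reconstruction, not an official usage recipe.

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from IndicTransToolkit.processor import IndicProcessor  # import path may vary by toolkit version

# Assumed usage: standard IndicTrans2 loading with trust_remote_code.
model_name = "thenlpresearcher/iitb_punct_robust_finetuned_eng_Ltn_to_mar_Deva"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name, trust_remote_code=True)

ip = IndicProcessor(inference=True)
sentences = ["Hello, how are you today?"]  # placeholder input

# IndicTrans2 expects language-tagged, normalized input; the toolkit handles this.
batch = ip.preprocess_batch(sentences, src_lang="eng_Latn", tgt_lang="mar_Deva")
inputs = tokenizer(batch, padding="longest", truncation=True, return_tensors="pt")

with torch.no_grad():
    generated = model.generate(**inputs, max_length=256, num_beams=5)

decoded = tokenizer.batch_decode(generated, skip_special_tokens=True)
print(ip.postprocess_batch(decoded, lang="mar_Deva"))  # Devanagari output
```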

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 10
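
As a rough guide, these settings map onto the standard Hugging Face Seq2SeqTrainingArguments as sketched below. The training script is not published, so output_dir is a placeholder and the whole block is a reconstruction rather than the author's code.

```python
from transformers import Seq2SeqTrainingArguments

# Reconstruction of the listed hyperparameters; paths and logging options are guesses.
training_args = Seq2SeqTrainingArguments(
    output_dir="iitb_punct_robust_finetuned_eng_Ltn_to_mar_Deva",  # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=10,
    predict_with_generate=True,  # required for eval to report Bleu/Gen Len
)
```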

Training results

| Training Loss | Epoch  | Step   | Validation Loss | Bleu    | Gen Len |
|:-------------:|:------:|:------:|:---------------:|:-------:|:-------:|
| 0.5741 | 0.1686 | 4000   | 0.5529 | 6.7552  | 20.8383 |
| 0.5435 | 0.3373 | 8000   | 0.5100 | 7.5393  | 20.8667 |
| 0.5229 | 0.5059 | 12000  | 0.4833 | 7.8246  | 20.8734 |
| 0.5043 | 0.6746 | 16000  | 0.4643 | 8.2329  | 20.8722 |
| 0.4747 | 0.8432 | 20000  | 0.4483 | 8.3543  | 20.8815 |
| 0.4312 | 1.0119 | 24000  | 0.4363 | 8.5709  | 20.8743 |
| 0.4308 | 1.1805 | 28000  | 0.4247 | 8.6275  | 20.8734 |
| 0.42   | 1.3492 | 32000  | 0.4155 | 8.8384  | 20.8771 |
| 0.4058 | 1.5178 | 36000  | 0.4073 | 9.1532  | 20.8738 |
| 0.3961 | 1.6865 | 40000  | 0.4012 | 9.2407  | 20.8757 |
| 0.407  | 1.8551 | 44000  | 0.3924 | 9.3208  | 20.8752 |
| 0.3656 | 2.0238 | 48000  | 0.3878 | 9.4126  | 20.8728 |
| 0.359  | 2.1924 | 52000  | 0.3837 | 9.5937  | 20.876  |
| 0.3872 | 2.3611 | 56000  | 0.3782 | 9.5685  | 20.8756 |
| 0.3524 | 2.5297 | 60000  | 0.3739 | 9.7006  | 20.8752 |
| 0.3623 | 2.6984 | 64000  | 0.3680 | 9.8153  | 20.8779 |
| 0.36   | 2.8670 | 68000  | 0.3631 | 9.9461  | 20.8761 |
| 0.3368 | 3.0357 | 72000  | 0.3598 | 9.9375  | 20.8726 |
| 0.3389 | 3.2043 | 76000  | 0.3564 | 9.9956  | 20.8772 |
| 0.33   | 3.3730 | 80000  | 0.3525 | 10.0473 | 20.8766 |
| 0.3296 | 3.5416 | 84000  | 0.3497 | 10.2221 | 20.8756 |
| 0.3188 | 3.7103 | 88000  | 0.3465 | 10.2435 | 20.8766 |
| 0.3241 | 3.8789 | 92000  | 0.3429 | 10.4269 | 20.8717 |
| 0.3137 | 4.0476 | 96000  | 0.3433 | 10.4148 | 20.8722 |
| 0.3099 | 4.2162 | 100000 | 0.3374 | 10.4363 | 20.87   |
| 0.2952 | 4.3849 | 104000 | 0.3358 | 10.532  | 20.8752 |
| 0.313  | 4.5535 | 108000 | 0.3351 | 10.5208 | 20.874  |
| 0.3097 | 4.7222 | 112000 | 0.3312 | 10.6111 | 20.8723 |
| 0.3089 | 4.8908 | 116000 | 0.3275 | 10.7281 | 20.8767 |
| 0.2738 | 5.0594 | 120000 | 0.3275 | 10.7572 | 20.8718 |
| 0.2804 | 5.2281 | 124000 | 0.3254 | 10.8103 | 20.873  |
| 0.2899 | 5.3967 | 128000 | 0.3236 | 10.8792 | 20.8729 |
| 0.2875 | 5.5654 | 132000 | 0.3215 | 10.8924 | 20.8732 |
| 0.2793 | 5.7340 | 136000 | 0.3189 | 10.9224 | 20.873  |
| 0.2812 | 5.9027 | 140000 | 0.3165 | 11.0079 | 20.8745 |
| 0.2644 | 6.0713 | 144000 | 0.3166 | 11.0533 | 20.874  |
| 0.2599 | 6.2400 | 148000 | 0.3153 | 11.1479 | 20.8743 |
| 0.264  | 6.4086 | 152000 | 0.3127 | 11.159  | 20.8742 |
| 0.2709 | 6.5773 | 156000 | 0.3118 | 11.2196 | 20.8756 |
| 0.2601 | 6.7459 | 160000 | 0.3102 | 11.2334 | 20.8761 |
| 0.2685 | 6.9146 | 164000 | 0.3076 | 11.2818 | 20.8722 |
| 0.2456 | 7.0832 | 168000 | 0.3087 | 11.2891 | 20.874  |
| 0.2584 | 7.2519 | 172000 | 0.3060 | 11.3233 | 20.8721 |
| 0.2544 | 7.4205 | 176000 | 0.3055 | 11.3388 | 20.8726 |
| 0.2485 | 7.5892 | 180000 | 0.3043 | 11.421  | 20.8753 |
| 0.2438 | 7.7578 | 184000 | 0.3024 | 11.4621 | 20.8719 |
| 0.2672 | 7.9265 | 188000 | 0.3015 | 11.4965 | 20.8749 |
| 0.2504 | 8.0951 | 192000 | 0.3015 | 11.4844 | 20.874  |
| 0.2488 | 8.2638 | 196000 | 0.3006 | 11.5213 | 20.8758 |
| 0.2394 | 8.4324 | 200000 | 0.3000 | 11.5506 | 20.8751 |
| 0.2483 | 8.6011 | 204000 | 0.2996 | 11.586  | 20.8738 |
| 0.2452 | 8.7697 | 208000 | 0.2980 | 11.6098 | 20.8727 |
| 0.2504 | 8.9384 | 212000 | 0.2975 | 11.5714 | 20.8716 |
| 0.241  | 9.1070 | 216000 | 0.2973 | 11.5894 | 20.8714 |
| 0.2361 | 9.2757 | 220000 | 0.2971 | 11.6313 | 20.8729 |
| 0.2342 | 9.4443 | 224000 | 0.2970 | 11.616  | 20.8726 |
| 0.2372 | 9.6130 | 228000 | 0.2965 | 11.6442 | 20.8717 |
| 0.2464 | 9.7816 | 232000 | 0.2961 | 11.656  | 20.873  |
| 0.2401 | 9.9502 | 236000 | 0.2961 | 11.6647 | 20.8736 |
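
The card does not state how Bleu and Gen Len were computed; the values are consistent with the stock Hugging Face translation recipe, sketched below using sacrebleu via the evaluate library. The tokenizer handle and metric choice are assumptions, not documented facts about this run.

```python
import numpy as np
import evaluate
from transformers import AutoTokenizer

# Assumed metric setup, mirroring the standard translation fine-tuning example.
tokenizer = AutoTokenizer.from_pretrained(
    "ai4bharat/indictrans2-indic-indic-dist-320M", trust_remote_code=True
)
metric = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    # Label positions masked with -100 must be replaced before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = metric.compute(predictions=decoded_preds,
                            references=[[label] for label in decoded_labels])
    # Gen Len: mean number of non-pad tokens in the generated sequences.
    gen_len = np.mean([np.count_nonzero(p != tokenizer.pad_token_id) for p in preds])
    return {"bleu": result["score"], "gen_len": round(float(gen_len), 4)}
```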

Framework versions

  • Transformers 4.53.2
  • PyTorch 2.9.0+cu128
  • Datasets 2.21.0
  • Tokenizers 0.21.4