iitb_punct_orig_finetuned_eng_Ltn_to_mar_Deva

This model is a fine-tuned version of ai4bharat/indictrans2-indic-indic-dist-320M on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4013
  • Bleu: 9.6977
  • Gen Len: 20.8692
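A minimal inference sketch, assuming the standard IndicTrans2 usage pattern (a `transformers` seq2seq model loaded with `trust_remote_code=True`, plus `IndicProcessor` from IndicTransToolkit for language-tag handling). The language tags `eng_Latn`/`mar_Deva` are inferred from the model name, and `preprocess_batch`/`postprocess_batch` follow the IndicTransToolkit API; treat both as assumptions:

```python
# Hedged sketch of IndicTrans2-style inference; language tags inferred
# from the model name (eng_Ltn -> eng_Latn, mar_Deva).
SRC_LANG, TGT_LANG = "eng_Latn", "mar_Deva"
MODEL_ID = "thenlpresearcher/iitb_punct_orig_finetuned_eng_Ltn_to_mar_Deva"

def translate(sentences, model, tokenizer, processor, max_new_tokens=256):
    """Translate English sentences to Marathi with the fine-tuned model.

    `processor` is assumed to be an IndicTransToolkit IndicProcessor,
    which adds the source/target tags expected by IndicTrans2.
    """
    batch = processor.preprocess_batch(sentences, src_lang=SRC_LANG, tgt_lang=TGT_LANG)
    inputs = tokenizer(batch, padding=True, truncation=True, return_tensors="pt")
    generated = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=5)
    decoded = tokenizer.batch_decode(generated, skip_special_tokens=True)
    return processor.postprocess_batch(decoded, lang=TGT_LANG)
```

The model and tokenizer would be loaded with `AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID, trust_remote_code=True)` and `AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)`, and the processor with `IndicProcessor(inference=True)`.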

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 10
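The list above corresponds roughly to the following `Seq2SeqTrainingArguments` sketch. The `output_dir` and the `predict_with_generate` flag are assumptions (the latter is typically needed to produce the BLEU and Gen Len metrics reported here), not values stated on the card:

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the hyperparameters listed above; output_dir is a placeholder.
args = Seq2SeqTrainingArguments(
    output_dir="iitb_punct_orig_finetuned_eng_Ltn_to_mar_Deva",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=10,
    predict_with_generate=True,  # assumed: required for generation-based eval metrics
)
```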

Training results

| Training Loss | Epoch  | Step   | Validation Loss | Bleu   | Gen Len |
|:-------------:|:------:|:------:|:---------------:|:------:|:-------:|
| 0.5705        | 0.3373 | 4000   | 0.5325          | 7.2659 | 20.7523 |
| 0.536         | 0.6746 | 8000   | 0.4951          | 7.9775 | 20.8722 |
| 0.4615        | 1.0119 | 12000  | 0.4757          | 8.1546 | 20.8717 |
| 0.4643        | 1.3492 | 16000  | 0.4606          | 8.4812 | 20.8716 |
| 0.4545        | 1.6865 | 20000  | 0.4496          | 8.6764 | 20.8743 |
| 0.4274        | 2.0238 | 24000  | 0.4421          | 8.7579 | 20.8725 |
| 0.4254        | 2.3611 | 28000  | 0.4341          | 8.937  | 20.8695 |
| 0.4089        | 2.6984 | 32000  | 0.4300          | 8.9973 | 20.8713 |
| 0.3813        | 3.0357 | 36000  | 0.4264          | 9.1282 | 20.8735 |
| 0.3794        | 3.3730 | 40000  | 0.4221          | 9.1568 | 20.8731 |
| 0.3887        | 3.7103 | 44000  | 0.4173          | 9.2067 | 20.8692 |
| 0.3416        | 4.0476 | 48000  | 0.4169          | 9.3934 | 20.8712 |
| 0.3581        | 4.3849 | 52000  | 0.4131          | 9.4104 | 20.8683 |
| 0.3596        | 4.7222 | 56000  | 0.4099          | 9.3756 | 20.8716 |
| 0.3244        | 5.0594 | 60000  | 0.4116          | 9.4521 | 20.872  |
| 0.3366        | 5.3967 | 64000  | 0.4085          | 9.4955 | 20.8652 |
| 0.3489        | 5.7340 | 68000  | 0.4056          | 9.4947 | 20.8741 |
| 0.3235        | 6.0713 | 72000  | 0.4075          | 9.4984 | 20.8682 |
| 0.3347        | 6.4086 | 76000  | 0.4054          | 9.5506 | 20.8708 |
| 0.328         | 6.7459 | 80000  | 0.4042          | 9.6531 | 20.8686 |
| 0.3169        | 7.0832 | 84000  | 0.4059          | 9.6254 | 20.8692 |
| 0.3137        | 7.4205 | 88000  | 0.4039          | 9.6441 | 20.8683 |
| 0.3063        | 7.7578 | 92000  | 0.4023          | 9.6607 | 20.8689 |
| 0.2955        | 8.0951 | 96000  | 0.4031          | 9.6606 | 20.8692 |
| 0.3082        | 8.4324 | 100000 | 0.4026          | 9.6643 | 20.8682 |
| 0.3133        | 8.7697 | 104000 | 0.4014          | 9.6642 | 20.8692 |
| 0.2981        | 9.1070 | 108000 | 0.4028          | 9.6811 | 20.8696 |
| 0.2857        | 9.4443 | 112000 | 0.4011          | 9.6914 | 20.8695 |
| 0.2868        | 9.7816 | 116000 | 0.4013          | 9.6977 | 20.8692 |
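Since the scheduler is linear and no warmup is reported (the Trainer default is zero warmup steps), the learning rate at any optimizer step can be sketched as below. The total step count of roughly 118,600 is an inference from the table (step 4000 ≈ epoch 0.3373, i.e. about 11,860 steps per epoch over 10 epochs), not a value stated on the card:

```python
def linear_lr(step, total_steps, base_lr=2e-5, warmup_steps=0):
    """Linear schedule: ramp up over warmup_steps (0 here), then decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

# Total optimizer steps inferred from the table above (~11,860 steps/epoch x 10).
TOTAL_STEPS = 118_600
```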

Framework versions

  • Transformers 4.53.2
  • Pytorch 2.9.0+cu128
  • Datasets 2.21.0
  • Tokenizers 0.21.4
Model size: 0.3B params · Tensor type: F32 (Safetensors)