---
library_name: transformers
license: mit
base_model: ai4bharat/indictrans2-indic-indic-dist-320M
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: iitb_punct_orig_finetuned_eng_Ltn_to_mar_Deva
    results: []
---

# iitb_punct_orig_finetuned_eng_Ltn_to_mar_Deva

This model is a fine-tuned version of [ai4bharat/indictrans2-indic-indic-dist-320M](https://huggingface.co/ai4bharat/indictrans2-indic-indic-dist-320M) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 0.4013
- Bleu: 9.6977
- Gen Len: 20.8692
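
A minimal inference sketch follows (not part of the original card). The repo id, the hand-written language-tag prefix, and the generation settings are assumptions; IndicTrans2 inputs are normally prepared with IndicTransToolkit's `IndicProcessor`, and the checkpoints ship custom modeling code, hence `trust_remote_code=True`.

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Assumed repo id; substitute a local checkpoint path if needed.
repo_id = "thenlpresearcher/iitb_punct_orig_finetuned_eng_Ltn_to_mar_Deva"

tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForSeq2SeqLM.from_pretrained(repo_id, trust_remote_code=True)
model.eval()

# IndicTrans2 expects a "src_lang tgt_lang" tag prefix on each input sentence
# (normally added by IndicProcessor); it is written by hand here for brevity.
text = "eng_Latn mar_Deva This is a test sentence."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    generated = model.generate(**inputs, max_length=256, num_beams=5)

print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```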

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (mirrored in the `Seq2SeqTrainingArguments` sketch below):

- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: adamw_torch (AdamW) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 10
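
For reproducibility, here is a sketch of `Seq2SeqTrainingArguments` mirroring the list above; `output_dir` and `predict_with_generate` are assumptions not stated on the card.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="iitb_punct_orig_finetuned_eng_Ltn_to_mar_Deva",  # assumed
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch",         # AdamW; betas=(0.9, 0.999) and eps=1e-8 are the defaults
    lr_scheduler_type="linear",
    num_train_epochs=10,
    predict_with_generate=True,  # assumed: required to report Bleu/Gen Len during eval
)
```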

### Training results

| Training Loss | Epoch  | Step   | Validation Loss | Bleu   | Gen Len |
|:-------------:|:------:|:------:|:---------------:|:------:|:-------:|
| 0.5705        | 0.3373 | 4000   | 0.5325          | 7.2659 | 20.7523 |
| 0.536         | 0.6746 | 8000   | 0.4951          | 7.9775 | 20.8722 |
| 0.4615        | 1.0119 | 12000  | 0.4757          | 8.1546 | 20.8717 |
| 0.4643        | 1.3492 | 16000  | 0.4606          | 8.4812 | 20.8716 |
| 0.4545        | 1.6865 | 20000  | 0.4496          | 8.6764 | 20.8743 |
| 0.4274        | 2.0238 | 24000  | 0.4421          | 8.7579 | 20.8725 |
| 0.4254        | 2.3611 | 28000  | 0.4341          | 8.937  | 20.8695 |
| 0.4089        | 2.6984 | 32000  | 0.4300          | 8.9973 | 20.8713 |
| 0.3813        | 3.0357 | 36000  | 0.4264          | 9.1282 | 20.8735 |
| 0.3794        | 3.3730 | 40000  | 0.4221          | 9.1568 | 20.8731 |
| 0.3887        | 3.7103 | 44000  | 0.4173          | 9.2067 | 20.8692 |
| 0.3416        | 4.0476 | 48000  | 0.4169          | 9.3934 | 20.8712 |
| 0.3581        | 4.3849 | 52000  | 0.4131          | 9.4104 | 20.8683 |
| 0.3596        | 4.7222 | 56000  | 0.4099          | 9.3756 | 20.8716 |
| 0.3244        | 5.0594 | 60000  | 0.4116          | 9.4521 | 20.872  |
| 0.3366        | 5.3967 | 64000  | 0.4085          | 9.4955 | 20.8652 |
| 0.3489        | 5.7340 | 68000  | 0.4056          | 9.4947 | 20.8741 |
| 0.3235        | 6.0713 | 72000  | 0.4075          | 9.4984 | 20.8682 |
| 0.3347        | 6.4086 | 76000  | 0.4054          | 9.5506 | 20.8708 |
| 0.328         | 6.7459 | 80000  | 0.4042          | 9.6531 | 20.8686 |
| 0.3169        | 7.0832 | 84000  | 0.4059          | 9.6254 | 20.8692 |
| 0.3137        | 7.4205 | 88000  | 0.4039          | 9.6441 | 20.8683 |
| 0.3063        | 7.7578 | 92000  | 0.4023          | 9.6607 | 20.8689 |
| 0.2955        | 8.0951 | 96000  | 0.4031          | 9.6606 | 20.8692 |
| 0.3082        | 8.4324 | 100000 | 0.4026          | 9.6643 | 20.8682 |
| 0.3133        | 8.7697 | 104000 | 0.4014          | 9.6642 | 20.8692 |
| 0.2981        | 9.1070 | 108000 | 0.4028          | 9.6811 | 20.8696 |
| 0.2857        | 9.4443 | 112000 | 0.4011          | 9.6914 | 20.8695 |
| 0.2868        | 9.7816 | 116000 | 0.4013          | 9.6977 | 20.8692 |
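
The Bleu and Gen Len columns are produced by the Trainer's evaluation loop. The exact metric code is not on the card, but a typical `compute_metrics` hook for translation runs of this kind looks like the sketch below (sacrebleu via the `evaluate` library; `tokenizer` is assumed to be in scope).

```python
import numpy as np
import evaluate

metric = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    if isinstance(preds, tuple):
        preds = preds[0]
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    # Replace label padding (-100) so the labels can be decoded.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = metric.compute(predictions=decoded_preds,
                            references=[[label] for label in decoded_labels])
    # Gen Len: mean number of non-padding tokens in the generated sequences.
    gen_lens = [np.count_nonzero(pred != tokenizer.pad_token_id) for pred in preds]
    return {"bleu": result["score"], "gen_len": float(np.mean(gen_lens))}
```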

### Framework versions

- Transformers 4.53.2
- Pytorch 2.9.0+cu128
- Datasets 2.21.0
- Tokenizers 0.21.4