ht-stmini-cls-v7_pretrain_tdso-m0drp0.5trp0.5-cssl-msm-bml
This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 1.1442
- Loss Spp: 0.0341
- Loss Gtsp: 0.0711
- Loss Cssl: 0.5020
- Loss Msm: 0.5369
- Macro F1 Gtsp: 0.5013
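
The card does not say how the overall loss is assembled from its parts. The reported numbers are, however, consistent with an unweighted sum of the four component losses (Spp, Gtsp, Cssl, Msm). The following sketch only checks that arithmetic on the values listed above; the unweighted-sum interpretation is an assumption drawn from the numbers, not something the card documents.

```python
# Evaluation-set values copied from the results list above.
reported_total = 1.1442
components = {
    "spp": 0.0341,
    "gtsp": 0.0711,
    "cssl": 0.5020,
    "msm": 0.5369,
}

# Assumption: the total is a plain, unweighted sum of the components.
component_sum = sum(components.values())
gap = abs(reported_total - component_sum)

# Every figure is rounded to 4 decimals, so allow a small tolerance.
is_consistent = gap < 5e-4

print(f"sum of components = {component_sum:.4f}, gap = {gap:.4f}")
print("consistent with an unweighted sum:", is_consistent)
```

The same check can be applied to any row of the training results table by substituting that row's Validation Loss and component columns.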
 
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 4
- seed: 42
- optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 15035
- training_steps: 300701
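
To make the schedule concrete, here is a minimal sketch of the learning rate implied by these settings. It assumes the `linear` scheduler behaves as the standard Transformers linear schedule with warmup does: the rate rises linearly from 0 to `learning_rate` over `lr_scheduler_warmup_steps`, then decays linearly to 0 at `training_steps`. The function name `linear_schedule_lr` is hypothetical and is not part of any library.

```python
def linear_schedule_lr(step, base_lr=2e-05, warmup_steps=15035, total_steps=300701):
    """Learning rate at `step` under linear warmup followed by linear decay.

    Defaults are the hyperparameters listed above. This is an illustrative
    re-implementation, not the code that was used for training.
    """
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: scale linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Inspect the rate at a few steps, including the last step logged
# in the training results table (144000).
for s in (0, 1000, 15035, 144000, 300701):
    print(s, linear_schedule_lr(s))
```

Under this assumption the peak rate of 2e-05 is reached at step 15035, and at the last logged step (144000) the rate is still a little over half of its peak.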
 
Training results
| Training Loss | Epoch | Step | Validation Loss | Loss Spp | Loss Gtsp | Loss Cssl | Loss Msm | Macro F1 Gtsp | 
|---|---|---|---|---|---|---|---|---|
| 25.4786 | 0.0033 | 1000 | 12.5351 | 0.3012 | 0.2506 | 1.9911 | 9.9922 | 0.4111 | 
| 16.2585 | 0.0067 | 2000 | 7.5398 | 0.2786 | 0.2727 | 0.9490 | 6.0395 | 0.3933 | 
| 6.548 | 0.0100 | 3000 | 4.3944 | 0.2436 | 0.0892 | 0.7696 | 3.2920 | 0.4165 | 
| 6.396 | 0.0133 | 4000 | 3.1928 | 0.2099 | 0.0864 | 0.6996 | 2.1969 | 0.4177 | 
| 6.9894 | 0.0166 | 5000 | 2.6320 | 0.1896 | 0.0858 | 0.6529 | 1.7037 | 0.4170 | 
| 6.6321 | 0.0200 | 6000 | 2.1993 | 0.1767 | 0.0865 | 0.6119 | 1.3243 | 0.4172 | 
| 2.4311 | 0.0233 | 7000 | 1.6828 | 0.1419 | 0.0862 | 0.6087 | 0.8461 | 0.4173 | 
| 1.5776 | 0.0266 | 8000 | 1.4201 | 0.1198 | 0.0857 | 0.5702 | 0.6444 | 0.4173 | 
| 1.4376 | 0.0299 | 9000 | 1.4178 | 0.1157 | 0.0822 | 0.5473 | 0.6726 | 0.4131 | 
| 1.3884 | 0.0333 | 10000 | 1.3722 | 0.1060 | 0.0838 | 0.5529 | 0.6294 | 0.4175 | 
| 1.3749 | 0.0366 | 11000 | 1.3748 | 0.1034 | 0.0841 | 0.5443 | 0.6429 | 0.4171 | 
| 1.3883 | 0.0399 | 12000 | 1.3619 | 0.1004 | 0.0852 | 0.5378 | 0.6386 | 0.4174 | 
| 1.3726 | 0.0432 | 13000 | 1.3473 | 0.0961 | 0.0855 | 0.5328 | 0.6329 | 0.4183 | 
| 1.3645 | 0.0466 | 14000 | 1.3459 | 0.0876 | 0.0853 | 0.5347 | 0.6383 | 0.4170 | 
| 1.356 | 0.0499 | 15000 | 1.3317 | 0.0864 | 0.0818 | 0.5435 | 0.6200 | 0.4174 | 
| 1.3474 | 0.0532 | 16000 | 1.3218 | 0.0775 | 0.0831 | 0.5325 | 0.6287 | 0.4171 | 
| 1.3354 | 0.0565 | 17000 | 1.3153 | 0.0739 | 0.0828 | 0.5345 | 0.6242 | 0.4189 | 
| 1.3323 | 0.0599 | 18000 | 1.3057 | 0.0687 | 0.0824 | 0.5321 | 0.6226 | 0.4193 | 
| 1.3236 | 0.0632 | 19000 | 1.2935 | 0.0662 | 0.0816 | 0.5302 | 0.6154 | 0.4209 | 
| 1.3239 | 0.0665 | 20000 | 1.2841 | 0.0628 | 0.0805 | 0.5313 | 0.6095 | 0.4230 | 
| 1.3079 | 0.0698 | 21000 | 1.2861 | 0.0603 | 0.0795 | 0.5268 | 0.6196 | 0.4177 | 
| 1.3053 | 0.0732 | 22000 | 1.2868 | 0.0596 | 0.0815 | 0.5288 | 0.6169 | 0.4228 | 
| 1.3089 | 0.0765 | 23000 | 1.2811 | 0.0583 | 0.0815 | 0.5284 | 0.6130 | 0.4234 | 
| 1.2967 | 0.0798 | 24000 | 1.2688 | 0.0545 | 0.0829 | 0.5260 | 0.6054 | 0.4231 | 
| 1.2816 | 0.0831 | 25000 | 1.2697 | 0.0563 | 0.0810 | 0.5274 | 0.6050 | 0.4241 | 
| 1.29 | 0.0865 | 26000 | 1.2550 | 0.0515 | 0.0808 | 0.5244 | 0.5983 | 0.4251 | 
| 1.2875 | 0.0898 | 27000 | 1.2613 | 0.0510 | 0.0808 | 0.5254 | 0.6041 | 0.4254 | 
| 1.2766 | 0.0931 | 28000 | 1.2481 | 0.0512 | 0.0793 | 0.5250 | 0.5926 | 0.4255 | 
| 1.2655 | 0.0964 | 29000 | 1.2540 | 0.0513 | 0.0803 | 0.5241 | 0.5983 | 0.4283 | 
| 1.2849 | 0.0998 | 30000 | 1.2544 | 0.0488 | 0.0801 | 0.5262 | 0.5992 | 0.4274 | 
| 1.2649 | 0.1031 | 31000 | 1.2494 | 0.0514 | 0.0790 | 0.5229 | 0.5961 | 0.4288 | 
| 1.2474 | 0.1064 | 32000 | 1.2449 | 0.0486 | 0.0791 | 0.5219 | 0.5953 | 0.4301 | 
| 1.2429 | 0.1097 | 33000 | 1.2436 | 0.0500 | 0.0791 | 0.5201 | 0.5945 | 0.4348 | 
| 1.265 | 0.1131 | 34000 | 1.2371 | 0.0467 | 0.0791 | 0.5170 | 0.5943 | 0.4362 | 
| 1.2647 | 0.1164 | 35000 | 1.2322 | 0.0467 | 0.0784 | 0.5201 | 0.5870 | 0.4381 | 
| 1.2627 | 0.1197 | 36000 | 1.2311 | 0.0445 | 0.0777 | 0.5188 | 0.5902 | 0.4383 | 
| 1.2442 | 0.1230 | 37000 | 1.2388 | 0.0494 | 0.0799 | 0.5151 | 0.5944 | 0.4378 | 
| 1.2586 | 0.1264 | 38000 | 1.2308 | 0.0441 | 0.0786 | 0.5198 | 0.5884 | 0.4395 | 
| 1.261 | 0.1297 | 39000 | 1.2256 | 0.0425 | 0.0798 | 0.5183 | 0.5851 | 0.4434 | 
| 1.2297 | 0.1330 | 40000 | 1.2285 | 0.0466 | 0.0781 | 0.5168 | 0.5870 | 0.4404 | 
| 1.2816 | 0.1363 | 41000 | 1.2261 | 0.0460 | 0.0778 | 0.5176 | 0.5847 | 0.4502 | 
| 1.254 | 0.1397 | 42000 | 1.2206 | 0.0429 | 0.0777 | 0.5176 | 0.5825 | 0.4492 | 
| 1.2273 | 0.1430 | 43000 | 1.2280 | 0.0455 | 0.0772 | 0.5200 | 0.5854 | 0.4481 | 
| 1.2517 | 0.1463 | 44000 | 1.2287 | 0.0433 | 0.0746 | 0.5146 | 0.5961 | 0.4414 | 
| 1.2287 | 0.1497 | 45000 | 1.2288 | 0.0427 | 0.0768 | 0.5180 | 0.5912 | 0.4496 | 
| 1.2675 | 0.1530 | 46000 | 1.2199 | 0.0422 | 0.0759 | 0.5160 | 0.5857 | 0.4551 | 
| 1.2269 | 0.1563 | 47000 | 1.2413 | 0.0454 | 0.0753 | 0.5138 | 0.6067 | 0.4392 | 
| 1.2495 | 0.1596 | 48000 | 1.2272 | 0.0420 | 0.0753 | 0.5149 | 0.5950 | 0.4432 | 
| 1.276 | 0.1630 | 49000 | 1.2165 | 0.0409 | 0.0762 | 0.5144 | 0.5851 | 0.4568 | 
| 1.2458 | 0.1663 | 50000 | 1.2283 | 0.0420 | 0.0777 | 0.5197 | 0.5889 | 0.4496 | 
| 1.2219 | 0.1696 | 51000 | 1.2188 | 0.0414 | 0.0762 | 0.5134 | 0.5879 | 0.4512 | 
| 1.2486 | 0.1729 | 52000 | 1.2167 | 0.0404 | 0.0770 | 0.5140 | 0.5854 | 0.4570 | 
| 1.2439 | 0.1763 | 53000 | 1.2208 | 0.0414 | 0.0772 | 0.5136 | 0.5886 | 0.4578 | 
| 1.2215 | 0.1796 | 54000 | 1.2123 | 0.0403 | 0.0759 | 0.5140 | 0.5821 | 0.4538 | 
| 1.2476 | 0.1829 | 55000 | 1.2084 | 0.0388 | 0.0775 | 0.5144 | 0.5776 | 0.4613 | 
| 1.2578 | 0.1862 | 56000 | 1.2052 | 0.0388 | 0.0763 | 0.5111 | 0.5789 | 0.4519 | 
| 1.2549 | 0.1896 | 57000 | 1.2063 | 0.0398 | 0.0766 | 0.5144 | 0.5755 | 0.4582 | 
| 1.2657 | 0.1929 | 58000 | 1.2152 | 0.0398 | 0.0721 | 0.5121 | 0.5912 | 0.4618 | 
| 1.2247 | 0.1962 | 59000 | 1.2111 | 0.0414 | 0.0761 | 0.5106 | 0.5831 | 0.4616 | 
| 1.2271 | 0.1995 | 60000 | 1.1984 | 0.0383 | 0.0763 | 0.5129 | 0.5709 | 0.4661 | 
| 1.2595 | 0.2029 | 61000 | 1.1988 | 0.0389 | 0.0753 | 0.5141 | 0.5705 | 0.4718 | 
| 1.2574 | 0.2062 | 62000 | 1.2008 | 0.0381 | 0.0755 | 0.5135 | 0.5737 | 0.4732 | 
| 1.2366 | 0.2095 | 63000 | 1.1942 | 0.0378 | 0.0750 | 0.5125 | 0.5689 | 0.4654 | 
| 1.2101 | 0.2128 | 64000 | 1.2025 | 0.0402 | 0.0759 | 0.5090 | 0.5775 | 0.4646 | 
| 1.2363 | 0.2162 | 65000 | 1.2041 | 0.0385 | 0.0751 | 0.5136 | 0.5770 | 0.4678 | 
| 1.2394 | 0.2195 | 66000 | 1.1975 | 0.0376 | 0.0762 | 0.5091 | 0.5747 | 0.4748 | 
| 1.2421 | 0.2228 | 67000 | 1.1977 | 0.0376 | 0.0759 | 0.5121 | 0.5722 | 0.4701 | 
| 1.2047 | 0.2261 | 68000 | 1.2016 | 0.0396 | 0.0750 | 0.5125 | 0.5746 | 0.4687 | 
| 1.2493 | 0.2295 | 69000 | 1.2044 | 0.0388 | 0.0742 | 0.5142 | 0.5773 | 0.4706 | 
| 1.2226 | 0.2328 | 70000 | 1.1959 | 0.0388 | 0.0748 | 0.5099 | 0.5724 | 0.4759 | 
| 1.2574 | 0.2361 | 71000 | 1.2055 | 0.0388 | 0.0726 | 0.5098 | 0.5842 | 0.4706 | 
| 1.2271 | 0.2394 | 72000 | 1.1909 | 0.0368 | 0.0757 | 0.5099 | 0.5685 | 0.4806 | 
| 1.231 | 0.2428 | 73000 | 1.2059 | 0.0378 | 0.0726 | 0.5100 | 0.5854 | 0.4640 | 
| 1.2302 | 0.2461 | 74000 | 1.1877 | 0.0384 | 0.0750 | 0.5089 | 0.5654 | 0.4732 | 
| 1.2576 | 0.2494 | 75000 | 1.1868 | 0.0372 | 0.0748 | 0.5086 | 0.5662 | 0.4717 | 
| 1.2041 | 0.2527 | 76000 | 1.1921 | 0.0367 | 0.0745 | 0.5072 | 0.5736 | 0.4795 | 
| 1.2497 | 0.2561 | 77000 | 1.1877 | 0.0380 | 0.0756 | 0.5101 | 0.5641 | 0.4788 | 
| 1.1953 | 0.2594 | 78000 | 1.1923 | 0.0400 | 0.0742 | 0.5120 | 0.5662 | 0.4768 | 
| 1.2234 | 0.2627 | 79000 | 1.1839 | 0.0365 | 0.0746 | 0.5072 | 0.5656 | 0.4759 | 
| 1.2227 | 0.2660 | 80000 | 1.1798 | 0.0365 | 0.0741 | 0.5091 | 0.5602 | 0.4813 | 
| 1.1974 | 0.2694 | 81000 | 1.1897 | 0.0380 | 0.0750 | 0.5087 | 0.5681 | 0.4725 | 
| 1.2508 | 0.2727 | 82000 | 1.1879 | 0.0364 | 0.0736 | 0.5121 | 0.5658 | 0.4755 | 
| 1.1952 | 0.2760 | 83000 | 1.1952 | 0.0408 | 0.0722 | 0.5053 | 0.5769 | 0.4734 | 
| 1.2178 | 0.2793 | 84000 | 1.1968 | 0.0369 | 0.0710 | 0.5079 | 0.5810 | 0.4752 | 
| 1.222 | 0.2827 | 85000 | 1.1815 | 0.0366 | 0.0746 | 0.5072 | 0.5630 | 0.4809 | 
| 1.2114 | 0.2860 | 86000 | 1.1835 | 0.0375 | 0.0755 | 0.5093 | 0.5613 | 0.4751 | 
| 1.2208 | 0.2893 | 87000 | 1.1765 | 0.0365 | 0.0744 | 0.5077 | 0.5579 | 0.4822 | 
| 1.1858 | 0.2926 | 88000 | 1.1774 | 0.0378 | 0.0735 | 0.5077 | 0.5584 | 0.4801 | 
| 1.2516 | 0.2960 | 89000 | 1.1743 | 0.0356 | 0.0741 | 0.5082 | 0.5564 | 0.4891 | 
| 1.2161 | 0.2993 | 90000 | 1.1685 | 0.0352 | 0.0746 | 0.5073 | 0.5515 | 0.4796 | 
| 1.1939 | 0.3026 | 91000 | 1.1786 | 0.0377 | 0.0738 | 0.5090 | 0.5582 | 0.4831 | 
| 1.2163 | 0.3060 | 92000 | 1.1762 | 0.0358 | 0.0736 | 0.5091 | 0.5578 | 0.4812 | 
| 1.1791 | 0.3093 | 93000 | 1.1735 | 0.0369 | 0.0736 | 0.5051 | 0.5578 | 0.4822 | 
| 1.1915 | 0.3126 | 94000 | 1.1766 | 0.0366 | 0.0752 | 0.5053 | 0.5596 | 0.4826 | 
| 1.1816 | 0.3159 | 95000 | 1.1659 | 0.0378 | 0.0725 | 0.5059 | 0.5496 | 0.4894 | 
| 1.2397 | 0.3193 | 96000 | 1.1682 | 0.0344 | 0.0743 | 0.5067 | 0.5527 | 0.4917 | 
| 1.2109 | 0.3226 | 97000 | 1.1723 | 0.0382 | 0.0721 | 0.5067 | 0.5552 | 0.4835 | 
| 1.2132 | 0.3259 | 98000 | 1.1627 | 0.0353 | 0.0748 | 0.5033 | 0.5493 | 0.4915 | 
| 1.2118 | 0.3292 | 99000 | 1.1648 | 0.0339 | 0.0741 | 0.5035 | 0.5532 | 0.4870 | 
| 1.2343 | 0.3326 | 100000 | 1.1622 | 0.0356 | 0.0736 | 0.5046 | 0.5484 | 0.4832 | 
| 1.2079 | 0.3359 | 101000 | 1.1619 | 0.0348 | 0.0734 | 0.5046 | 0.5491 | 0.4837 | 
| 1.212 | 0.3392 | 102000 | 1.1693 | 0.0348 | 0.0733 | 0.5053 | 0.5559 | 0.4903 | 
| 1.2083 | 0.3425 | 103000 | 1.1613 | 0.0337 | 0.0730 | 0.5079 | 0.5467 | 0.4920 | 
| 1.2084 | 0.3459 | 104000 | 1.1569 | 0.0350 | 0.0732 | 0.5046 | 0.5440 | 0.4943 | 
| 1.2386 | 0.3492 | 105000 | 1.1588 | 0.0342 | 0.0733 | 0.5061 | 0.5453 | 0.4927 | 
| 1.1985 | 0.3525 | 106000 | 1.1661 | 0.0343 | 0.0725 | 0.5060 | 0.5534 | 0.4927 | 
| 1.2036 | 0.3558 | 107000 | 1.1727 | 0.0359 | 0.0725 | 0.5116 | 0.5528 | 0.4932 | 
| 1.2088 | 0.3592 | 108000 | 1.1675 | 0.0368 | 0.0734 | 0.5083 | 0.5491 | 0.4857 | 
| 1.2069 | 0.3625 | 109000 | 1.1623 | 0.0359 | 0.0734 | 0.5070 | 0.5460 | 0.4963 | 
| 1.2406 | 0.3658 | 110000 | 1.1562 | 0.0346 | 0.0725 | 0.5053 | 0.5438 | 0.4928 | 
| 1.1732 | 0.3691 | 111000 | 1.1670 | 0.0357 | 0.0718 | 0.5074 | 0.5521 | 0.4881 | 
| 1.203 | 0.3725 | 112000 | 1.1575 | 0.0356 | 0.0740 | 0.5047 | 0.5432 | 0.4890 | 
| 1.238 | 0.3758 | 113000 | 1.1563 | 0.0339 | 0.0741 | 0.5052 | 0.5432 | 0.4972 | 
| 1.2125 | 0.3791 | 114000 | 1.1582 | 0.0364 | 0.0734 | 0.5039 | 0.5446 | 0.4941 | 
| 1.1673 | 0.3824 | 115000 | 1.1631 | 0.0358 | 0.0723 | 0.5066 | 0.5484 | 0.4907 | 
| 1.203 | 0.3858 | 116000 | 1.1551 | 0.0343 | 0.0722 | 0.5043 | 0.5444 | 0.5005 | 
| 1.2063 | 0.3891 | 117000 | 1.1587 | 0.0352 | 0.0717 | 0.5050 | 0.5467 | 0.4924 | 
| 1.1914 | 1.0007 | 118000 | 1.1559 | 0.0348 | 0.0724 | 0.5074 | 0.5413 | 0.4922 | 
| 1.2322 | 1.0041 | 119000 | 1.1460 | 0.0336 | 0.0743 | 0.5036 | 0.5344 | 0.4974 | 
| 1.1952 | 1.0074 | 120000 | 1.1578 | 0.0340 | 0.0729 | 0.5056 | 0.5453 | 0.4905 | 
| 1.2245 | 1.0107 | 121000 | 1.1480 | 0.0334 | 0.0735 | 0.5037 | 0.5374 | 0.4979 | 
| 1.1746 | 1.0140 | 122000 | 1.1528 | 0.0344 | 0.0731 | 0.5029 | 0.5423 | 0.4977 | 
| 1.1634 | 1.0174 | 123000 | 1.1540 | 0.0353 | 0.0721 | 0.5045 | 0.5421 | 0.4944 | 
| 1.2139 | 1.0207 | 124000 | 1.1437 | 0.0336 | 0.0736 | 0.5010 | 0.5354 | 0.5006 | 
| 1.2238 | 1.0240 | 125000 | 1.1559 | 0.0357 | 0.0698 | 0.5005 | 0.5500 | 0.4931 | 
| 1.2144 | 1.0273 | 126000 | 1.1514 | 0.0340 | 0.0726 | 0.5020 | 0.5427 | 0.4981 | 
| 1.1664 | 1.0307 | 127000 | 1.1574 | 0.0351 | 0.0717 | 0.5055 | 0.5451 | 0.4931 | 
| 1.1967 | 1.0340 | 128000 | 1.1557 | 0.0350 | 0.0716 | 0.5040 | 0.5450 | 0.4994 | 
| 1.1893 | 1.0373 | 129000 | 1.1467 | 0.0340 | 0.0722 | 0.5032 | 0.5373 | 0.5038 | 
| 1.2257 | 1.0407 | 130000 | 1.1456 | 0.0324 | 0.0729 | 0.5015 | 0.5387 | 0.4996 | 
| 1.1633 | 1.0440 | 131000 | 1.1502 | 0.0349 | 0.0717 | 0.5048 | 0.5388 | 0.4922 | 
| 1.1643 | 1.0473 | 132000 | 1.1547 | 0.0338 | 0.0722 | 0.5042 | 0.5445 | 0.5009 | 
| 1.2162 | 1.0506 | 133000 | 1.1483 | 0.0358 | 0.0718 | 0.5046 | 0.5361 | 0.5024 | 
| 1.1982 | 1.0540 | 134000 | 1.1410 | 0.0331 | 0.0709 | 0.5026 | 0.5345 | 0.5033 | 
| 1.1829 | 1.0573 | 135000 | 1.1500 | 0.0341 | 0.0721 | 0.5028 | 0.5411 | 0.4970 | 
| 1.1609 | 1.0606 | 136000 | 1.1522 | 0.0349 | 0.0722 | 0.5032 | 0.5419 | 0.5012 | 
| 1.2257 | 1.0639 | 137000 | 1.1443 | 0.0332 | 0.0720 | 0.5045 | 0.5346 | 0.5081 | 
| 1.189 | 1.0673 | 138000 | 1.1445 | 0.0328 | 0.0721 | 0.5022 | 0.5374 | 0.5018 | 
| 1.1563 | 1.0706 | 139000 | 1.1524 | 0.0344 | 0.0717 | 0.5025 | 0.5438 | 0.4982 | 
| 1.19 | 1.0739 | 140000 | 1.1470 | 0.0330 | 0.0747 | 0.5037 | 0.5356 | 0.4990 | 
| 1.1948 | 1.0772 | 141000 | 1.1510 | 0.0340 | 0.0709 | 0.5044 | 0.5416 | 0.4939 | 
| 1.1619 | 1.0806 | 142000 | 1.1476 | 0.0337 | 0.0719 | 0.5021 | 0.5398 | 0.4968 | 
| 1.1613 | 1.0839 | 143000 | 1.1465 | 0.0353 | 0.0726 | 0.5044 | 0.5343 | 0.5017 | 
| 1.1925 | 1.0872 | 144000 | 1.1585 | 0.0342 | 0.0700 | 0.5030 | 0.5513 | 0.4999 | 
Framework versions
- Transformers 4.46.0
- Pytorch 2.3.1+cu121
- Datasets 2.20.0
- Tokenizers 0.20.1
 