Model save

Files changed (5)

README.md +79 -0
intent_report_test.txt +75 -0
model.safetensors +1 -1
model_predict_test.csv +0 -0
slot_report_test.txt +59 -0

README.md ADDED

	@@ -0,0 +1,79 @@

+---
+library_name: transformers
+license: apache-2.0
+base_model: answerdotai/ModernBERT-large
+tags:
+- generated_from_trainer
+model-index:
+- name: ModernBERT-large_massive_modernbert_large_crf_v1
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# ModernBERT-large_massive_modernbert_large_crf_v1
+This model is a fine-tuned version of [answerdotai/ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 15.5718
+- Slot P: 0.5398
+- Slot R: 0.6408
+- Slot F1: 0.5860
+- Slot Exact Match: 0.6001
+- Intent Acc: 0.7831
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 5e-05
+- train_batch_size: 128
+- eval_batch_size: 128
+- seed: 42
+- gradient_accumulation_steps: 2
+- total_train_batch_size: 256
+- optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
+- lr_scheduler_type: cosine
+- lr_scheduler_warmup_ratio: 0.06
+- num_epochs: 30
+- mixed_precision_training: Native AMP
+### Training results
+| Training Loss | Epoch | Step | Validation Loss | Slot P | Slot R | Slot F1 | Slot Exact Match | Intent Acc |
+|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:-------:|:----------------:|:----------:|
+| No log        | 1.0   | 45   | 43.3614         | 0.0    | 0.0    | 0.0     | 0.3178           | 0.0821     |
+| 160.2669      | 2.0   | 90   | 27.2292         | 0.3143 | 0.2269 | 0.2635  | 0.3586           | 0.2548     |
+| 66.654        | 3.0   | 135  | 19.2474         | 0.4379 | 0.4    | 0.4181  | 0.4481           | 0.4575     |
+| 38.629        | 4.0   | 180  | 15.3625         | 0.4023 | 0.5408 | 0.4614  | 0.4801           | 0.5903     |
+| 23.3498       | 5.0   | 225  | 12.4194         | 0.4446 | 0.5706 | 0.4998  | 0.5411           | 0.6695     |
+| 12.7922       | 6.0   | 270  | 12.3227         | 0.5013 | 0.5980 | 0.5454  | 0.5691           | 0.6990     |
+| 7.8613        | 7.0   | 315  | 12.8060         | 0.4926 | 0.6    | 0.5410  | 0.5642           | 0.7324     |
+| 5.4037        | 8.0   | 360  | 12.9247         | 0.5086 | 0.6294 | 0.5626  | 0.5809           | 0.7388     |
+| 3.6892        | 9.0   | 405  | 13.9871         | 0.5260 | 0.6343 | 0.5751  | 0.5986           | 0.7605     |
+| 2.6797        | 10.0  | 450  | 14.0965         | 0.5562 | 0.6204 | 0.5865  | 0.6011           | 0.7742     |
+| 2.6797        | 11.0  | 495  | 13.8520         | 0.5105 | 0.6398 | 0.5679  | 0.5775           | 0.7698     |
+| 2.0031        | 12.0  | 540  | 15.0858         | 0.5491 | 0.6289 | 0.5863  | 0.6080           | 0.7698     |
+| 1.3894        | 13.0  | 585  | 15.5718         | 0.5398 | 0.6408 | 0.5860  | 0.6001           | 0.7831     |
+### Framework versions
+- Transformers 4.55.0
+- Pytorch 2.7.0+cu126
+- Datasets 3.6.0
+- Tokenizers 0.21.4
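Reviewer note on the README above: the headline metrics are taken from the last logged row (epoch 13 of the configured 30), and the table shows that validation loss and the task metrics peak at different epochs. The sketch below copies a few columns from the training-results table to make that visible and re-derives the reported Slot F1 from Slot P and Slot R. The names `ROWS` and `f1_score` are illustrative and do not come from the training code.

```python
# (epoch, validation_loss, slot_f1, intent_acc), copied from the training-results table
ROWS = [
    (1, 43.3614, 0.0, 0.0821),
    (2, 27.2292, 0.2635, 0.2548),
    (3, 19.2474, 0.4181, 0.4575),
    (4, 15.3625, 0.4614, 0.5903),
    (5, 12.4194, 0.4998, 0.6695),
    (6, 12.3227, 0.5454, 0.6990),
    (7, 12.8060, 0.5410, 0.7324),
    (8, 12.9247, 0.5626, 0.7388),
    (9, 13.9871, 0.5751, 0.7605),
    (10, 14.0965, 0.5865, 0.7742),
    (11, 13.8520, 0.5679, 0.7698),
    (12, 15.0858, 0.5863, 0.7698),
    (13, 15.5718, 0.5860, 0.7831),
]


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Epoch at which each criterion is best.
best_loss_epoch = min(ROWS, key=lambda row: row[1])[0]
best_f1_epoch = max(ROWS, key=lambda row: row[2])[0]
best_intent_epoch = max(ROWS, key=lambda row: row[3])[0]

# Re-derive the headline Slot F1 from the headline Slot P and Slot R.
final_f1 = f1_score(0.5398, 0.6408)

print(f"lowest validation loss: epoch {best_loss_epoch}")
print(f"highest slot F1:        epoch {best_f1_epoch}")
print(f"highest intent acc:     epoch {best_intent_epoch}")
print(f"slot F1 from P and R:   {final_f1:.4f}")
```

Validation loss bottoms out at epoch 6 and rises afterwards while slot F1 and intent accuracy keep improving, so which checkpoint counts as "best" depends on the selection criterion. The card does not state which criterion, if any, was used.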

intent_report_test.txt ADDED

	@@ -0,0 +1,75 @@

+              precision    recall  f1-score   support
+           0       0.88      0.88      0.88        88
+           1       0.81      0.81      0.81        36
+           2       0.97      0.86      0.91        35
+           3       0.82      0.77      0.79        35
+           4       0.96      0.85      0.90        26
+           5       0.00      0.00      0.00         1
+           6       0.62      0.60      0.61        43
+           7       0.00      0.00      0.00         4
+           8       1.00      0.78      0.88        18
+           9       0.83      0.74      0.78        72
+          10       0.94      0.82      0.88        39
+          11       0.68      1.00      0.81        15
+          12       0.39      0.56      0.46       169
+          13       0.91      0.89      0.90       156
+          14       0.48      0.77      0.59        13
+          15       0.62      0.67      0.64        12
+          16       0.68      0.77      0.72        22
+          17       0.73      0.62      0.67        26
+          18       0.59      0.89      0.71        27
+          19       0.78      0.68      0.72        31
+          20       0.78      0.71      0.74        41
+          21       0.82      0.82      0.82        39
+          22       0.73      0.79      0.76       124
+          23       0.87      0.79      0.83        34
+          24       1.00      0.80      0.89        10
+          25       0.88      0.79      0.83        19
+          26       0.94      0.77      0.85        57
+          27       0.80      0.64      0.71        25
+          28       0.00      0.00      0.00         6
+          29       1.00      0.17      0.29         6
+          30       0.76      0.91      0.83        67
+          31       0.82      0.43      0.56        21
+          32       0.66      0.65      0.65       126
+          33       0.93      0.89      0.91       114
+          34       0.95      0.77      0.85        26
+          35       0.82      0.82      0.82        11
+          36       0.82      0.78      0.80        72
+          37       0.00      0.00      0.00         0
+          38       0.82      0.60      0.69        15
+          39       0.88      0.60      0.71        25
+          40       1.00      0.84      0.91        43
+          41       0.00      0.00      0.00         3
+          42       0.81      0.59      0.68        51
+          43       0.60      0.50      0.55        36
+          44       0.96      0.84      0.90       119
+          45       0.79      0.88      0.83       176
+          46       0.60      0.84      0.70        32
+          47       0.91      0.78      0.84        81
+          48       0.87      0.80      0.84        41
+          49       0.68      0.56      0.61       141
+          50       0.75      0.88      0.81       209
+          51       0.77      0.77      0.77        35
+          52       0.83      0.90      0.86        21
+          53       0.83      0.85      0.84        52
+          54       0.76      0.96      0.85        23
+          55       0.65      0.55      0.59        20
+          56       1.00      0.94      0.97        36
+          57       0.77      0.77      0.77        35
+          58       0.81      0.87      0.84        63
+          59       0.84      0.71      0.77        51
+    accuracy                           0.77      2974
+   macro avg       0.74      0.69      0.70      2974
+weighted avg       0.78      0.77      0.77      2974
+Confusion matrix:
+[[77  0  0 ...  0  0  0]
+ [ 0 29  0 ...  0  0  0]
+ [ 0  0 30 ...  0  0  0]
+ ...
+ [ 0  0  0 ... 27  0  0]
+ [ 0  0  0 ...  0 55  0]
+ [ 0  0  1 ...  0  0 36]]
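Reviewer note on the intent report above: the macro-average F1 (0.70) sits below the weighted-average F1 (0.77) because the macro average gives every class the same weight, so the classes with an F1 of 0.00 (5, 7, 28, 37 and 41, of which class 37 has no test examples at all) pull it down regardless of their tiny support. The sketch below reproduces that effect on five rows copied from the report. It is not a recomputation of the full 60-class averages, and the helper names are illustrative.

```python
# (class_id, f1, support) for five rows copied from the intent report
SAMPLE_ROWS = [
    (5, 0.00, 1),
    (12, 0.46, 169),
    (37, 0.00, 0),
    (50, 0.81, 209),
    (56, 0.97, 36),
]


def macro_average(rows):
    """Unweighted mean of the per-class scores."""
    return sum(score for _, score, _ in rows) / len(rows)


def weighted_average(rows):
    """Mean of the per-class scores, weighted by class support."""
    total_support = sum(support for _, _, support in rows)
    if total_support == 0:
        return 0.0
    return sum(score * support for _, score, support in rows) / total_support


macro = macro_average(SAMPLE_ROWS)
weighted = weighted_average(SAMPLE_ROWS)
print(f"macro F1 on the sample:    {macro:.3f}")
print(f"weighted F1 on the sample: {weighted:.3f}")
```

On this sample the two zero-F1 rows account for two fifths of the macro average but only one example out of 415 in the weighted average, which is the same mechanism at work in the full report.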

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:ff2ceb23a501416e10e7b17241b73b8854267231d2974775cc0f70d763224b1e
+oid sha256:5405b4ac523b1a3fdfb6883905472c126086750b196ed444289ea8103ffcb920
 size 1579896944
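Reviewer note on the pointer change above: only the `oid` differs, and the size stays at 1579896944 bytes, which is what one would expect when weights are overwritten without changing the architecture. A downloaded `model.safetensors` can be checked against the new pointer with a minimal sketch like the one below. The function names `parse_lfs_pointer` and `matches_pointer` are illustrative helpers, not part of Git LFS or any Hugging Face library.

```python
import hashlib

# New pointer contents, copied from the diff above.
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:5405b4ac523b1a3fdfb6883905472c126086750b196ed444289ea8103ffcb920
size 1579896944
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and digest."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as handle:
        # Read in chunks so a multi-gigabyte file is never held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(NEW_POINTER)
print(pointer["algo"], pointer["size"])
```

Calling `matches_pointer("model.safetensors", pointer)` on a local copy would then confirm that the file corresponds to this commit rather than the previous one.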

model_predict_test.csv ADDED

The diff for this file is too large to render.

slot_report_test.txt ADDED

	@@ -0,0 +1,59 @@

+                      precision    recall  f1-score   support
+          alarm_type       0.00      0.00      0.00         2
+            app_name       0.18      0.40      0.25         5
+         artist_name       0.49      0.64      0.55        61
+    audiobook_author       1.00      0.40      0.57         5
+      audiobook_name       0.65      0.57      0.60        23
+       business_name       0.56      0.64      0.60        92
+       business_type       0.27      0.35      0.31        31
+       change_amount       0.25      0.22      0.24         9
+         coffee_type       0.20      0.25      0.22         4
+          color_type       0.55      0.62      0.58        26
+        cooking_type       0.00      0.00      0.00         8
+       currency_name       0.62      0.74      0.67        50
+                date       0.71      0.85      0.78       415
+     definition_word       0.53      0.49      0.51        51
+         device_type       0.66      0.68      0.67        57
+          drink_type       0.00      0.00      0.00         1
+       email_address       0.60      0.67      0.63         9
+        email_folder       0.43      0.60      0.50         5
+          event_name       0.50      0.59      0.54       260
+           food_type       0.33      0.51      0.40        72
+           game_name       0.68      0.50      0.58        26
+   general_frequency       0.56      0.50      0.53        20
+         house_place       0.63      0.67      0.65        58
+          ingredient       0.00      0.00      0.00         6
+           joke_type       0.25      0.18      0.21        11
+           list_name       0.54      0.62      0.58        61
+           meal_type       0.39      0.72      0.51        18
+          media_type       0.64      0.80      0.71       128
+          movie_name       0.00      0.00      0.00         2
+          movie_type       0.00      0.00      0.00         3
+         music_album       0.00      0.00      0.00         1
+    music_descriptor       0.00      0.00      0.00         7
+         music_genre       0.67      0.76      0.71        50
+          news_topic       0.36      0.48      0.41        52
+          order_type       0.61      0.85      0.71        20
+              person       0.56      0.62      0.59       216
+       personal_info       0.31      0.57      0.40        14
+          place_name       0.63      0.59      0.61       281
+      player_setting       0.34      0.40      0.37        40
+       playlist_name       0.19      0.20      0.19        15
+  podcast_descriptor       0.37      0.29      0.33        24
+        podcast_name       0.25      0.35      0.29        17
+          radio_name       0.30      0.36      0.33        33
+            relation       0.36      0.44      0.40        59
+           song_name       0.15      0.21      0.18        39
+                time       0.54      0.61      0.58       191
+           time_zone       0.56      0.38      0.45        13
+           timeofday       0.67      0.67      0.67        60
+    transport_agency       0.80      0.89      0.84         9
+transport_descriptor       0.00      0.00      0.00         2
+      transport_name       0.50      0.25      0.33         4
+      transport_type       0.73      0.80      0.76        65
+  weather_descriptor       0.55      0.52      0.54        82
+           micro avg       0.55      0.62      0.58      2813
+           macro avg       0.41      0.44      0.42      2813
+        weighted avg       0.55      0.62      0.58      2813
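Reviewer note on the slot report above: the layout (per-type rows followed by micro, macro and weighted averages, with no accuracy row) suggests span-level scoring in the style of seqeval, where a predicted slot counts as correct only if both its type and its exact boundaries match the gold span. That is an assumption, since the evaluation script is not part of this commit. The sketch below shows the idea on BIO tags. The helper names are illustrative, and stray `I-` tags that do not continue a span are simply ignored, which is a simplification.

```python
def bio_spans(tags):
    """Return the set of (slot_type, start, end) spans in one BIO tag sequence."""
    spans = set()
    start, label = None, None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues_span = tag.startswith("I-") and label == tag[2:]
        if continues_span:
            continue
        if label is not None:
            spans.add((label, start, i))
        if tag.startswith("B-"):
            start, label = i, tag[2:]
        else:
            start, label = None, None
    return spans


def span_prf(gold_sequences, pred_sequences):
    """Micro-averaged span-level precision, recall and F1 over a corpus."""
    gold, pred = set(), set()
    for idx, (gold_tags, pred_tags) in enumerate(zip(gold_sequences, pred_sequences)):
        gold |= {(idx,) + span for span in bio_spans(gold_tags)}
        pred |= {(idx,) + span for span in bio_spans(pred_tags)}
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# One utterance: the predicted "date" span is one token too short, so it does not count.
gold = [["O", "B-date", "I-date", "O", "B-time"]]
pred = [["O", "B-date", "O", "O", "B-time"]]
precision, recall, f1 = span_prf(gold, pred)
print(precision, recall, f1)
```

Under this kind of scoring a boundary error is penalised twice, once as a false positive and once as a false negative, which may partly explain why multi-token slot types such as `song_name` and `playlist_name` score so low in the report.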