tilyupo/t5-small-trivia-gpu-ca2q

text2text-generation

generated_from_keras_callback

tilyupo committed on Aug 7, 2023

Commit 3064deb · 1 Parent(s): b0c5dd8

batch_size=4

Files changed (2)

README.md +8 -20
tf_model.h5 +1 -1

README.md CHANGED

@@ -15,20 +15,9 @@ probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [google/flan-t5-small](https://huggingface.co/google/flan-t5-small) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Train Loss: 1.0929
-- Validation Loss: 1.4052
-- Epoch: 4
-<pre>{'eval_loss': 1.400876522064209,
- 'eval_bleu': 17.847721241494337,
- 'eval_rouge1': 54.52,
- 'eval_rouge2': 31.47,
- 'eval_rougeL': 47.68,
- 'eval_rougeLsum': 47.66,
- 'eval_exact': 0.021183558449130307,
- 'eval_runtime': 239.6854,
- 'eval_samples_per_second': 42.935,
- 'eval_steps_per_second': 1.343}</pre>
 ## Model description
@@ -47,18 +36,17 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- optimizer: {'name': 'Adafactor', 'weight_decay': None, 'clipnorm': None, 'global_clipnorm': None, 'clipvalue': None, 'use_ema': False, 'ema_momentum': 0.99, 'ema_overwrite_frequency': None, 'jit_compile': True, 'is_legacy_optimizer': False, 'learning_rate': 0.0002, 'beta_2_decay': -0.8, 'epsilon_1': 1e-30, 'epsilon_2': 0.001, 'clip_threshold': 1.0, 'relative_step': False}
 - training_precision: mixed_bfloat16
 ### Training results
 | Train Loss | Validation Loss | Epoch |
 |:----------:|:---------------:|:-----:|
-| 1.7107     | 1.4525          | 0     |
-| 1.4445     | 1.4003          | 1     |
-| 1.2991     | 1.3924          | 2     |
-| 1.1877     | 1.3867          | 3     |
-| 1.0929     | 1.4052          | 4     |
 ### Framework versions

 This model is a fine-tuned version of [google/flan-t5-small](https://huggingface.co/google/flan-t5-small) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Train Loss: 1.2675
+- Validation Loss: 1.3898
+- Epoch: 3
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- optimizer: {'name': 'Adafactor', 'weight_decay': None, 'clipnorm': None, 'global_clipnorm': None, 'clipvalue': None, 'use_ema': False, 'ema_momentum': 0.99, 'ema_overwrite_frequency': None, 'jit_compile': True, 'is_legacy_optimizer': False, 'learning_rate': 0.00014285714, 'beta_2_decay': -0.8, 'epsilon_1': 1e-30, 'epsilon_2': 0.001, 'clip_threshold': 1.0, 'relative_step': False}
 - training_precision: mixed_bfloat16
 ### Training results
 | Train Loss | Validation Loss | Epoch |
 |:----------:|:---------------:|:-----:|
+| 1.7429     | 1.4649          | 0     |
+| 1.4976     | 1.4196          | 1     |
+| 1.3663     | 1.3913          | 2     |
+| 1.2675     | 1.3898          | 3     |
 ### Framework versions
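
The removed and added `optimizer` lines in this README diff are long enough that the change is hard to spot by eye. As a minimal sketch, the two dictionaries (copied from the `-` and `+` lines) can be parsed with `ast.literal_eval` and compared key by key. The helper name `changed_keys` is hypothetical and not part of the repository.

```python
import ast

# Optimizer config from the removed (-) README line, copied verbatim.
OLD = (
    "{'name': 'Adafactor', 'weight_decay': None, 'clipnorm': None, "
    "'global_clipnorm': None, 'clipvalue': None, 'use_ema': False, "
    "'ema_momentum': 0.99, 'ema_overwrite_frequency': None, "
    "'jit_compile': True, 'is_legacy_optimizer': False, "
    "'learning_rate': 0.0002, 'beta_2_decay': -0.8, 'epsilon_1': 1e-30, "
    "'epsilon_2': 0.001, 'clip_threshold': 1.0, 'relative_step': False}"
)

# Optimizer config from the added (+) README line, copied verbatim.
NEW = (
    "{'name': 'Adafactor', 'weight_decay': None, 'clipnorm': None, "
    "'global_clipnorm': None, 'clipvalue': None, 'use_ema': False, "
    "'ema_momentum': 0.99, 'ema_overwrite_frequency': None, "
    "'jit_compile': True, 'is_legacy_optimizer': False, "
    "'learning_rate': 0.00014285714, 'beta_2_decay': -0.8, 'epsilon_1': 1e-30, "
    "'epsilon_2': 0.001, 'clip_threshold': 1.0, 'relative_step': False}"
)


def changed_keys(old_repr, new_repr):
    """Hypothetical helper: return {key: (old, new)} for every differing key."""
    old = ast.literal_eval(old_repr)
    new = ast.literal_eval(new_repr)
    return {
        key: (old.get(key), new.get(key))
        for key in sorted(old.keys() | new.keys())
        if old.get(key) != new.get(key)
    }


diff = changed_keys(OLD, NEW)
for key, (before, after) in diff.items():
    print(f"{key}: {before} -> {after}")
```

Under this comparison the only differing key is `learning_rate`; every other Adafactor setting is identical between the two commits.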

tf_model.h5 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:36abc1798475ecfc45790aaa0dd9174bf249afc6c39f8d24ac7c2d4de22bd087
 size 439831352

 version https://git-lfs.github.com/spec/v1
+oid sha256:0f1a28ef8ecc81b3f0ea1a957a5764ce3afef588c411c6026e86bc8c4fcdd477
 size 439831352
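
The `tf_model.h5` entry shown above is a Git LFS pointer: three lines giving the spec version, the SHA-256 `oid` of the actual file, and its size in bytes. In this commit the size stays at 439831352 and only the `oid` changes, which is consistent with the weights being retrained while the architecture stays the same. A minimal sketch of checking a downloaded file against such a pointer follows; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and any local file path passed in is an assumption.

```python
import hashlib
import os

# New pointer contents, copied from the added (+) side of the diff.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:0f1a28ef8ecc81b3f0ea1a957a5764ce3afef588c411c6026e86bc8c4fcdd477
size 439831352
"""


def parse_lfs_pointer(text):
    """Hypothetical helper: split a pointer file into its fields."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Hypothetical helper: True if the file at `path` has the pointer's size and hash."""
    # Cheap check first: a size mismatch means the hash cannot match.
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        # Stream in chunks so a ~440 MB file is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])
```

With a local copy of the weights, `matches_pointer("tf_model.h5", pointer)` would report whether the download corresponds to this commit rather than to its parent.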