End of training

Files changed (5)

README.md CHANGED

@@ -1,17 +1,19 @@
 ---
 base_model: google/gemma-2-9b-it
 library_name: transformers
 model_name: gemma-sft-bayesian-lr2.0e-06_assistant_only
 tags:
 - generated_from_trainer
 - sft
 - trl
 licence: license
 ---
 # Model Card for gemma-sft-bayesian-lr2.0e-06_assistant_only
-This model is a fine-tuned version of [google/gemma-2-9b-it](https://huggingface.co/google/gemma-2-9b-it).
 It has been trained using [TRL](https://github.com/huggingface/trl).
 ## Quick start

 ---
 base_model: google/gemma-2-9b-it
+datasets: Gabe-Thomp/gemma-bayesian-training
 library_name: transformers
 model_name: gemma-sft-bayesian-lr2.0e-06_assistant_only
 tags:
 - generated_from_trainer
 - sft
 - trl
+- alignment-handbook
 licence: license
 ---
 # Model Card for gemma-sft-bayesian-lr2.0e-06_assistant_only
+This model is a fine-tuned version of [google/gemma-2-9b-it](https://huggingface.co/google/gemma-2-9b-it) on the [Gabe-Thomp/gemma-bayesian-training](https://huggingface.co/datasets/Gabe-Thomp/gemma-bayesian-training) dataset.
 It has been trained using [TRL](https://github.com/huggingface/trl).
 ## Quick start

all_results.json CHANGED

@@ -1,4 +1,9 @@
 {
     "total_flos": 42494027431936.0,
     "train_loss": 0.12046583356164026,
     "train_runtime": 4906.2287,

 {
+    "eval_loss": 0.10729417949914932,
+    "eval_runtime": 18.0449,
+    "eval_samples": 240,
+    "eval_samples_per_second": 13.3,
+    "eval_steps_per_second": 0.831,
     "total_flos": 42494027431936.0,
     "train_loss": 0.12046583356164026,
     "train_runtime": 4906.2287,

config.json CHANGED

@@ -72,6 +72,6 @@
   "sliding_window_size": 4096,
   "torch_dtype": "bfloat16",
   "transformers_version": "4.54.0",
-  "use_cache": false,
   "vocab_size": 256000
 }

   "sliding_window_size": 4096,
   "torch_dtype": "bfloat16",
   "transformers_version": "4.54.0",
+  "use_cache": true,
   "vocab_size": 256000
 }
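
Review note: the two sides of this hunk differ only in `use_cache`. A minimal Python sketch of that comparison is below. The dictionaries hold only the five keys visible in this hunk (the rest of config.json is not shown here), and the helper name `changed_keys` is hypothetical. The diff does not state why the flag was flipped. A common pattern is to disable the KV cache during training and re-enable it for inference once training ends, but that reading is an assumption.

```python
# Keys and values copied from the visible hunk only; the full config has more.
before = {
    "sliding_window_size": 4096,
    "torch_dtype": "bfloat16",
    "transformers_version": "4.54.0",
    "use_cache": False,
    "vocab_size": 256000,
}
after = {**before, "use_cache": True}


def changed_keys(old: dict, new: dict) -> dict:
    """Return {key: (old_value, new_value)} for every key whose value differs."""
    all_keys = sorted(old.keys() | new.keys())
    return {k: (old.get(k), new.get(k)) for k in all_keys if old.get(k) != new.get(k)}


diff = changed_keys(before, after)
print(diff)
```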

eval_results.json ADDED

+{
+    "eval_loss": 0.10729417949914932,
+    "eval_runtime": 18.0449,
+    "eval_samples": 240,
+    "eval_samples_per_second": 13.3,
+    "eval_steps_per_second": 0.831
+}
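
Review note: the added metrics can be cross-checked against each other. The sketch below recomputes throughput from `eval_samples` and `eval_runtime`, and estimates the number of evaluation steps from `eval_steps_per_second`. The perplexity line assumes `eval_loss` is a mean per-token cross-entropy in nats, which is typical for this kind of trainer output but is not stated in the diff. The variable names are illustrative.

```python
import json
import math

# Values copied verbatim from eval_results.json in this commit.
eval_results = json.loads("""
{
    "eval_loss": 0.10729417949914932,
    "eval_runtime": 18.0449,
    "eval_samples": 240,
    "eval_samples_per_second": 13.3,
    "eval_steps_per_second": 0.831
}
""")

# Throughput recomputed from the raw counts.
samples_per_second = eval_results["eval_samples"] / eval_results["eval_runtime"]

# Approximate number of evaluation steps implied by the reported step rate.
approx_steps = eval_results["eval_steps_per_second"] * eval_results["eval_runtime"]

# Assumption: eval_loss is mean token-level cross-entropy in nats.
perplexity = math.exp(eval_results["eval_loss"])

print(f"samples/s: {samples_per_second:.3f}")
print(f"approx. eval steps: {approx_steps:.2f}")
print(f"perplexity (under the stated assumption): {perplexity:.4f}")
```

The recomputed throughput rounds to the reported 13.3 samples per second, and the implied step count is close to 15, which would correspond to about 16 samples per evaluation step across all devices.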

runs/Jul27_01-16-54_bobu-l40s-1.csail.mit.edu/events.out.tfevents.1753598605.bobu-l40s-1.csail.mit.edu.2571914.1 ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:d44f713bd22a0f52cd6e11396b8f46e3355978137425e2aec6ab91a221424675
+size 476
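
Review note: the three added lines are a Git LFS pointer, not the TensorBoard event file itself. The pointer records the spec version, the SHA-256 of the real object, and its size in bytes. A minimal sketch of reading such a pointer is below; `parse_lfs_pointer` is a hypothetical helper written for this note, not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file made of 'key value' lines."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form '<hash algorithm>:<hex digest>'.
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "oid_algorithm": algorithm,
        "oid_digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer content copied from the diff, without the leading '+' markers.
pointer = parse_lfs_pointer("""\
version https://git-lfs.github.com/spec/v1
oid sha256:d44f713bd22a0f52cd6e11396b8f46e3355978137425e2aec6ab91a221424675
size 476
""")
print(pointer)
```

The parsed size of 476 bytes refers to the event file stored in LFS, so this particular log is very small.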