train_apps_42_1767887025

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the apps dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6885
  • Num Input Tokens Seen: 699214000
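
Below is a minimal usage sketch for loading the PEFT adapter in this repository on top of the base model and generating from it. It assumes access to the gated meta-llama checkpoint, a GPU with enough memory for the 8B model, and an illustrative prompt; the card does not document a specific prompt format.

```python
# Minimal sketch: load the PEFT adapter on top of the base model and generate.
# Assumes access to the gated meta-llama checkpoint and a GPU large enough for the 8B model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_apps_42_1767887025"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_id)

# Illustrative prompt only; the card does not specify an evaluation prompt format.
messages = [{"role": "user", "content": "Write a Python function that returns the n-th Fibonacci number."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```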

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after this list):

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10
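
For reference, the listed values map onto `transformers` `TrainingArguments` roughly as in the sketch below. The `output_dir` is an assumption, and the card does not include the PEFT adapter configuration or the data preprocessing, so this is not the full training script.

```python
# Sketch of the hyperparameters above expressed as TrainingArguments.
# output_dir is assumed; the PEFT adapter config and dataset handling are not shown
# because they are not documented in this card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_apps_42_1767887025",  # assumption
    learning_rate=5e-05,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=10,
)
```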

Training results

| Training Loss | Epoch | Step   | Validation Loss | Input Tokens Seen |
|--------------:|------:|-------:|----------------:|------------------:|
| 0.7008        | 0.5   | 26377  | 0.7352          | 34989472          |
| 0.4548        | 1.0   | 52754  | 0.7174          | 69841376          |
| 0.6481        | 1.5   | 79131  | 0.7091          | 104822960         |
| 0.644         | 2.0   | 105508 | 0.7044          | 139857248         |
| 0.7858        | 2.5   | 131885 | 0.7012          | 174789008         |
| 1.0128        | 3.0   | 158262 | 0.6984          | 209796496         |
| 0.7108        | 3.5   | 184639 | 0.6964          | 244685600         |
| 0.4746        | 4.0   | 211016 | 0.6950          | 279729072         |
| 0.6035        | 4.5   | 237393 | 0.6939          | 314692480         |
| 0.6679        | 5.0   | 263770 | 0.6926          | 349648112         |
| 0.627         | 5.5   | 290147 | 0.6914          | 384562352         |
| 0.562         | 6.0   | 316524 | 0.6908          | 419562144         |
| 0.7391        | 6.5   | 342901 | 0.6899          | 454578784         |
| 0.5248        | 7.0   | 369278 | 0.6896          | 489466512         |
| 0.6157        | 7.5   | 395655 | 0.6891          | 524324080         |
| 0.4738        | 8.0   | 422032 | 0.6889          | 559340160         |
| 0.6555        | 8.5   | 448409 | 0.6887          | 594298416         |
| 0.7889        | 9.0   | 474786 | 0.6885          | 629277648         |
| 0.7381        | 9.5   | 501163 | 0.6885          | 664409168         |
| 0.7525        | 10.0  | 527540 | 0.6885          | 699214000         |

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • PyTorch 2.9.1+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
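
If results differ when reusing this adapter, a quick way to rule out environment drift is to compare installed versions against the list above; the snippet below is a small, optional check.

```python
# Compare installed library versions against those listed in this card.
import datasets, peft, tokenizers, torch, transformers

expected = {
    "peft": "0.17.1",
    "transformers": "4.51.3",
    "torch": "2.9.1+cu128",
    "datasets": "4.0.0",
    "tokenizers": "0.21.4",
}
installed = {
    "peft": peft.__version__,
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name in expected:
    status = "OK" if installed[name] == expected[name] else "differs"
    print(f"{name}: installed {installed[name]}, card lists {expected[name]} ({status})")
```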