05/18/2024 07:21:54 - INFO - transformers.tokenization_utils_base - loading file tokenizer.model from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/tokenizer.model
05/18/2024 07:21:54 - INFO - transformers.tokenization_utils_base - loading file tokenizer.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/tokenizer.json
05/18/2024 07:21:54 - INFO - transformers.tokenization_utils_base - loading file added_tokens.json from cache at None
05/18/2024 07:21:54 - INFO - transformers.tokenization_utils_base - loading file special_tokens_map.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/special_tokens_map.json
05/18/2024 07:21:54 - INFO - transformers.tokenization_utils_base - loading file tokenizer_config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/tokenizer_config.json
05/18/2024 07:21:54 - INFO - llamafactory.data.template - Add pad token: </s>
05/18/2024 07:21:54 - INFO - llamafactory.data.loader - Loading dataset svjack/generated_chat_0_4M_sharegpt_system_human_gpt...
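Note: the dataset named above is a ShareGPT-format corpus on the Hub. A minimal sketch for inspecting it directly with the datasets library follows; LLaMA-Factory itself loads it through its own dataset registry, and the record schema is an assumption, not something printed in this log.

# Minimal sketch (illustrative only): peek at the ShareGPT-format dataset outside LLaMA-Factory.
from datasets import load_dataset

ds = load_dataset("svjack/generated_chat_0_4M_sharegpt_system_human_gpt", split="train")
print(ds)      # row count and column names
print(ds[0])   # one raw record; the exact field layout (e.g. a "conversations" list) is assumed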
05/18/2024 07:22:49 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 07:22:50 - INFO - transformers.configuration_utils - Model config MistralConfig {
"_name_or_path": "mistralai/Mistral-7B-Instruct-v0.2",
"architectures": [
"MistralForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 32768,
"model_type": "mistral",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"rms_norm_eps": 1e-05,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.40.2",
"use_cache": true,
"vocab_size": 32000
}
05/18/2024 07:22:50 - INFO - llamafactory.model.utils.quantization - Quantizing model to 4 bit.
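Note: a 4-bit quantized load in transformers is typically configured through BitsAndBytesConfig, as sketched below. The log only states "4 bit", so the nf4 quant type and double quantization are assumptions; the float16 compute dtype matches the default dtype reported when the model is instantiated below.

# Hedged sketch of a 4-bit quantized load; nf4/double-quant settings are assumptions.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2",
    quantization_config=bnb_config,
    torch_dtype=torch.float16,
)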
05/18/2024 07:22:50 - INFO - transformers.modeling_utils - loading weights file model.safetensors from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/model.safetensors.index.json
05/18/2024 07:22:50 - INFO - transformers.modeling_utils - Instantiating MistralForCausalLM model under default dtype torch.float16.
05/18/2024 07:22:50 - INFO - transformers.generation.configuration_utils - Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2
}
05/18/2024 07:23:08 - INFO - transformers.modeling_utils - All model checkpoint weights were used when initializing MistralForCausalLM.
05/18/2024 07:23:08 - INFO - transformers.modeling_utils - All the weights of MistralForCausalLM were initialized from the model checkpoint at mistralai/Mistral-7B-Instruct-v0.2.
If your task is similar to the task the model of the checkpoint was trained on, you can already use MistralForCausalLM for predictions without further training.
05/18/2024 07:23:08 - INFO - transformers.generation.configuration_utils - loading configuration file generation_config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/generation_config.json
05/18/2024 07:23:08 - INFO - transformers.generation.configuration_utils - Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2
}
05/18/2024 07:23:09 - INFO - llamafactory.model.utils.checkpointing - Gradient checkpointing enabled.
05/18/2024 07:23:09 - INFO - llamafactory.model.utils.attention - Using torch SDPA for faster training and inference.
05/18/2024 07:23:09 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32.
05/18/2024 07:23:09 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA
05/18/2024 07:23:09 - INFO - llamafactory.model.loader - trainable params: 3407872 || all params: 7245139968 || trainable%: 0.0470
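Note: the trainable-parameter count is consistent with rank-8 LoRA adapters on q_proj and v_proj in all 32 layers. The rank and target modules are an inference (they are not printed in this log), but the arithmetic below reproduces the number exactly from the model config above.

# Back-of-envelope check (assumption: LoRA r=8 on q_proj and v_proj only).
# From the config above: hidden_size=4096, 32 layers, 8 KV heads -> v_proj output dim = 1024.
r = 8
hidden = 4096
kv_dim = 1024                                                # num_key_value_heads * head_dim = 8 * 128
per_layer = r * (hidden + hidden) + r * (hidden + kv_dim)    # q_proj adapter + v_proj adapter
print(per_layer * 32)                                        # 3407872, matching the logged trainable params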
05/18/2024 07:23:09 - WARNING - accelerate.utils.other - Detected kernel version 5.4.0, which is below the recommended minimum of 5.5.0; this can cause the process to hang. It is recommended to upgrade the kernel to the minimum version or higher.
05/18/2024 07:23:09 - INFO - transformers.trainer - Using auto half precision backend
05/18/2024 07:23:09 - INFO - transformers.trainer - ***** Running training *****
05/18/2024 07:23:09 - INFO - transformers.trainer - Num examples = 100,000
05/18/2024 07:23:09 - INFO - transformers.trainer - Num Epochs = 3
05/18/2024 07:23:09 - INFO - transformers.trainer - Instantaneous batch size per device = 1
05/18/2024 07:23:09 - INFO - transformers.trainer - Total train batch size (w. parallel, distributed & accumulation) = 8
05/18/2024 07:23:09 - INFO - transformers.trainer - Gradient Accumulation steps = 8
05/18/2024 07:23:09 - INFO - transformers.trainer - Total optimization steps = 37,500
05/18/2024 07:23:09 - INFO - transformers.trainer - Number of trainable parameters = 3,407,872
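Note: the trainer numbers above fit together as follows; this is plain arithmetic on the logged values.

# Effective batch size and step count derived from the logged trainer settings.
num_examples = 100_000
per_device_batch = 1
grad_accum = 8
num_epochs = 3
effective_batch = per_device_batch * grad_accum      # 8  (total train batch size)
steps_per_epoch = num_examples // effective_batch    # 12,500 optimizer steps per epoch
total_steps = steps_per_epoch * num_epochs           # 37,500 total optimization steps
print(effective_batch, steps_per_epoch, total_steps)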
05/18/2024 07:24:07 - INFO - llamafactory.extras.callbacks - {'loss': 2.4535, 'learning_rate': 5.0000e-05, 'epoch': 0.00}
05/18/2024 07:25:02 - INFO - llamafactory.extras.callbacks - {'loss': 1.9042, 'learning_rate': 5.0000e-05, 'epoch': 0.00}
05/18/2024 07:25:58 - INFO - llamafactory.extras.callbacks - {'loss': 1.6398, 'learning_rate': 5.0000e-05, 'epoch': 0.00}
05/18/2024 07:26:55 - INFO - llamafactory.extras.callbacks - {'loss': 1.3978, 'learning_rate': 5.0000e-05, 'epoch': 0.00}
05/18/2024 07:27:51 - INFO - llamafactory.extras.callbacks - {'loss': 1.3416, 'learning_rate': 5.0000e-05, 'epoch': 0.00}
05/18/2024 07:28:48 - INFO - llamafactory.extras.callbacks - {'loss': 1.4384, 'learning_rate': 5.0000e-05, 'epoch': 0.00}
05/18/2024 07:29:45 - INFO - llamafactory.extras.callbacks - {'loss': 1.3529, 'learning_rate': 5.0000e-05, 'epoch': 0.00}
05/18/2024 07:30:41 - INFO - llamafactory.extras.callbacks - {'loss': 1.3142, 'learning_rate': 5.0000e-05, 'epoch': 0.00}
05/18/2024 07:31:35 - INFO - llamafactory.extras.callbacks - {'loss': 1.3162, 'learning_rate': 5.0000e-05, 'epoch': 0.00}
05/18/2024 07:32:33 - INFO - llamafactory.extras.callbacks - {'loss': 1.3443, 'learning_rate': 5.0000e-05, 'epoch': 0.00}
05/18/2024 07:33:29 - INFO - llamafactory.extras.callbacks - {'loss': 1.2582, 'learning_rate': 5.0000e-05, 'epoch': 0.00}
05/18/2024 07:34:27 - INFO - llamafactory.extras.callbacks - {'loss': 1.3449, 'learning_rate': 5.0000e-05, 'epoch': 0.00}
05/18/2024 07:35:25 - INFO - llamafactory.extras.callbacks - {'loss': 1.2836, 'learning_rate': 5.0000e-05, 'epoch': 0.01}
05/18/2024 07:36:22 - INFO - llamafactory.extras.callbacks - {'loss': 1.2437, 'learning_rate': 5.0000e-05, 'epoch': 0.01}
05/18/2024 07:37:19 - INFO - llamafactory.extras.callbacks - {'loss': 1.2318, 'learning_rate': 5.0000e-05, 'epoch': 0.01}
05/18/2024 07:38:17 - INFO - llamafactory.extras.callbacks - {'loss': 1.3253, 'learning_rate': 4.9999e-05, 'epoch': 0.01}
05/18/2024 07:39:14 - INFO - llamafactory.extras.callbacks - {'loss': 1.3195, 'learning_rate': 4.9999e-05, 'epoch': 0.01}
05/18/2024 07:40:13 - INFO - llamafactory.extras.callbacks - {'loss': 1.2973, 'learning_rate': 4.9999e-05, 'epoch': 0.01}
05/18/2024 07:41:08 - INFO - llamafactory.extras.callbacks - {'loss': 1.2883, 'learning_rate': 4.9999e-05, 'epoch': 0.01}
05/18/2024 07:42:06 - INFO - llamafactory.extras.callbacks - {'loss': 1.3460, 'learning_rate': 4.9999e-05, 'epoch': 0.01}
05/18/2024 07:42:06 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-100
05/18/2024 07:42:07 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 07:42:07 - INFO - transformers.configuration_utils - Model config MistralConfig {
"architectures": [
"MistralForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 32768,
"model_type": "mistral",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"rms_norm_eps": 1e-05,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.40.2",
"use_cache": true,
"vocab_size": 32000
}
05/18/2024 07:42:07 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-100/tokenizer_config.json
05/18/2024 07:42:07 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-100/special_tokens_map.json
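Note: each checkpoint directory holds the LoRA adapter plus the tokenizer files saved above. A minimal sketch for reloading one with PEFT follows; the adapter file layout (adapter_config.json, adapter_model.safetensors) is the usual PEFT convention, assumed rather than shown in this log.

# Hedged sketch: reload a saved LoRA checkpoint onto the base model for inference.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

ckpt = "saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-100"
base = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")
model = PeftModel.from_pretrained(base, ckpt)    # attaches the LoRA adapter weights to the base model
tokenizer = AutoTokenizer.from_pretrained(ckpt)  # tokenizer config/special tokens were saved with the checkpoint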
05/18/2024 07:43:05 - INFO - llamafactory.extras.callbacks - {'loss': 1.3119, 'learning_rate': 4.9999e-05, 'epoch': 0.01}
05/18/2024 07:44:02 - INFO - llamafactory.extras.callbacks - {'loss': 1.2937, 'learning_rate': 4.9999e-05, 'epoch': 0.01}
05/18/2024 07:44:59 - INFO - llamafactory.extras.callbacks - {'loss': 1.2376, 'learning_rate': 4.9999e-05, 'epoch': 0.01}
05/18/2024 07:45:53 - INFO - llamafactory.extras.callbacks - {'loss': 1.3009, 'learning_rate': 4.9999e-05, 'epoch': 0.01}
05/18/2024 07:46:52 - INFO - llamafactory.extras.callbacks - {'loss': 1.2827, 'learning_rate': 4.9999e-05, 'epoch': 0.01}
05/18/2024 07:47:49 - INFO - llamafactory.extras.callbacks - {'loss': 1.3158, 'learning_rate': 4.9999e-05, 'epoch': 0.01}
05/18/2024 07:48:43 - INFO - llamafactory.extras.callbacks - {'loss': 1.2554, 'learning_rate': 4.9998e-05, 'epoch': 0.01}
05/18/2024 07:49:42 - INFO - llamafactory.extras.callbacks - {'loss': 1.2788, 'learning_rate': 4.9998e-05, 'epoch': 0.01}
05/18/2024 07:50:39 - INFO - llamafactory.extras.callbacks - {'loss': 1.2200, 'learning_rate': 4.9998e-05, 'epoch': 0.01}
05/18/2024 07:51:36 - INFO - llamafactory.extras.callbacks - {'loss': 1.2836, 'learning_rate': 4.9998e-05, 'epoch': 0.01}
05/18/2024 07:52:34 - INFO - llamafactory.extras.callbacks - {'loss': 1.2498, 'learning_rate': 4.9998e-05, 'epoch': 0.01}
05/18/2024 07:53:31 - INFO - llamafactory.extras.callbacks - {'loss': 1.2757, 'learning_rate': 4.9998e-05, 'epoch': 0.01}
05/18/2024 07:54:29 - INFO - llamafactory.extras.callbacks - {'loss': 1.2378, 'learning_rate': 4.9998e-05, 'epoch': 0.01}
05/18/2024 07:55:27 - INFO - llamafactory.extras.callbacks - {'loss': 1.2126, 'learning_rate': 4.9997e-05, 'epoch': 0.01}
05/18/2024 07:56:25 - INFO - llamafactory.extras.callbacks - {'loss': 1.2379, 'learning_rate': 4.9997e-05, 'epoch': 0.01}
05/18/2024 07:57:22 - INFO - llamafactory.extras.callbacks - {'loss': 1.2449, 'learning_rate': 4.9997e-05, 'epoch': 0.01}
05/18/2024 07:58:21 - INFO - llamafactory.extras.callbacks - {'loss': 1.2934, 'learning_rate': 4.9997e-05, 'epoch': 0.01}
05/18/2024 07:59:16 - INFO - llamafactory.extras.callbacks - {'loss': 1.2313, 'learning_rate': 4.9997e-05, 'epoch': 0.02}
05/18/2024 08:00:12 - INFO - llamafactory.extras.callbacks - {'loss': 1.2328, 'learning_rate': 4.9997e-05, 'epoch': 0.02}
05/18/2024 08:01:08 - INFO - llamafactory.extras.callbacks - {'loss': 1.2497, 'learning_rate': 4.9996e-05, 'epoch': 0.02}
05/18/2024 08:01:08 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-200
05/18/2024 08:01:09 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 08:01:09 - INFO - transformers.configuration_utils - Model config MistralConfig {
"architectures": [
"MistralForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 32768,
"model_type": "mistral",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"rms_norm_eps": 1e-05,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.40.2",
"use_cache": true,
"vocab_size": 32000
}
05/18/2024 08:01:09 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-200/tokenizer_config.json
05/18/2024 08:01:09 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-200/special_tokens_map.json
05/18/2024 08:02:07 - INFO - llamafactory.extras.callbacks - {'loss': 1.2877, 'learning_rate': 4.9996e-05, 'epoch': 0.02}
05/18/2024 08:03:06 - INFO - llamafactory.extras.callbacks - {'loss': 1.2884, 'learning_rate': 4.9996e-05, 'epoch': 0.02}
05/18/2024 08:04:03 - INFO - llamafactory.extras.callbacks - {'loss': 1.2742, 'learning_rate': 4.9996e-05, 'epoch': 0.02}
05/18/2024 08:05:02 - INFO - llamafactory.extras.callbacks - {'loss': 1.2408, 'learning_rate': 4.9996e-05, 'epoch': 0.02}
05/18/2024 08:05:57 - INFO - llamafactory.extras.callbacks - {'loss': 1.1997, 'learning_rate': 4.9996e-05, 'epoch': 0.02}
05/18/2024 08:06:52 - INFO - llamafactory.extras.callbacks - {'loss': 1.2414, 'learning_rate': 4.9995e-05, 'epoch': 0.02}
05/18/2024 08:07:51 - INFO - llamafactory.extras.callbacks - {'loss': 1.2848, 'learning_rate': 4.9995e-05, 'epoch': 0.02}
05/18/2024 08:08:48 - INFO - llamafactory.extras.callbacks - {'loss': 1.2626, 'learning_rate': 4.9995e-05, 'epoch': 0.02}
05/18/2024 08:09:45 - INFO - llamafactory.extras.callbacks - {'loss': 1.2582, 'learning_rate': 4.9995e-05, 'epoch': 0.02}
05/18/2024 08:10:41 - INFO - llamafactory.extras.callbacks - {'loss': 1.2702, 'learning_rate': 4.9995e-05, 'epoch': 0.02}
05/18/2024 08:11:36 - INFO - llamafactory.extras.callbacks - {'loss': 1.1929, 'learning_rate': 4.9994e-05, 'epoch': 0.02}
05/18/2024 08:12:33 - INFO - llamafactory.extras.callbacks - {'loss': 1.2231, 'learning_rate': 4.9994e-05, 'epoch': 0.02}
05/18/2024 08:13:29 - INFO - llamafactory.extras.callbacks - {'loss': 1.2621, 'learning_rate': 4.9994e-05, 'epoch': 0.02}
05/18/2024 08:14:25 - INFO - llamafactory.extras.callbacks - {'loss': 1.2124, 'learning_rate': 4.9994e-05, 'epoch': 0.02}
05/18/2024 08:15:24 - INFO - llamafactory.extras.callbacks - {'loss': 1.2780, 'learning_rate': 4.9993e-05, 'epoch': 0.02}
05/18/2024 08:16:19 - INFO - llamafactory.extras.callbacks - {'loss': 1.2608, 'learning_rate': 4.9993e-05, 'epoch': 0.02}
05/18/2024 08:17:14 - INFO - llamafactory.extras.callbacks - {'loss': 1.2849, 'learning_rate': 4.9993e-05, 'epoch': 0.02}
05/18/2024 08:18:09 - INFO - llamafactory.extras.callbacks - {'loss': 1.2251, 'learning_rate': 4.9993e-05, 'epoch': 0.02}
05/18/2024 08:19:04 - INFO - llamafactory.extras.callbacks - {'loss': 1.2721, 'learning_rate': 4.9992e-05, 'epoch': 0.02}
05/18/2024 08:20:00 - INFO - llamafactory.extras.callbacks - {'loss': 1.2494, 'learning_rate': 4.9992e-05, 'epoch': 0.02}
05/18/2024 08:20:00 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-300
05/18/2024 08:20:00 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 08:20:00 - INFO - transformers.configuration_utils - Model config MistralConfig {
"architectures": [
"MistralForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 32768,
"model_type": "mistral",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"rms_norm_eps": 1e-05,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.40.2",
"use_cache": true,
"vocab_size": 32000
}
05/18/2024 08:20:00 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-300/tokenizer_config.json
05/18/2024 08:20:00 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-300/special_tokens_map.json
05/18/2024 08:20:58 - INFO - llamafactory.extras.callbacks - {'loss': 1.2140, 'learning_rate': 4.9992e-05, 'epoch': 0.02}
05/18/2024 08:21:54 - INFO - llamafactory.extras.callbacks - {'loss': 1.2649, 'learning_rate': 4.9992e-05, 'epoch': 0.02}
05/18/2024 08:22:50 - INFO - llamafactory.extras.callbacks - {'loss': 1.1784, 'learning_rate': 4.9991e-05, 'epoch': 0.03}
05/18/2024 08:23:46 - INFO - llamafactory.extras.callbacks - {'loss': 1.2838, 'learning_rate': 4.9991e-05, 'epoch': 0.03}
05/18/2024 08:24:41 - INFO - llamafactory.extras.callbacks - {'loss': 1.3073, 'learning_rate': 4.9991e-05, 'epoch': 0.03}
05/18/2024 08:25:38 - INFO - llamafactory.extras.callbacks - {'loss': 1.2914, 'learning_rate': 4.9990e-05, 'epoch': 0.03}
05/18/2024 08:26:33 - INFO - llamafactory.extras.callbacks - {'loss': 1.3132, 'learning_rate': 4.9990e-05, 'epoch': 0.03}
05/18/2024 08:27:28 - INFO - llamafactory.extras.callbacks - {'loss': 1.1934, 'learning_rate': 4.9990e-05, 'epoch': 0.03}
05/18/2024 08:28:23 - INFO - llamafactory.extras.callbacks - {'loss': 1.2486, 'learning_rate': 4.9990e-05, 'epoch': 0.03}
05/18/2024 08:29:20 - INFO - llamafactory.extras.callbacks - {'loss': 1.2049, 'learning_rate': 4.9989e-05, 'epoch': 0.03}
05/18/2024 08:30:17 - INFO - llamafactory.extras.callbacks - {'loss': 1.2323, 'learning_rate': 4.9989e-05, 'epoch': 0.03}
05/18/2024 08:31:15 - INFO - llamafactory.extras.callbacks - {'loss': 1.2439, 'learning_rate': 4.9989e-05, 'epoch': 0.03}
05/18/2024 08:32:11 - INFO - llamafactory.extras.callbacks - {'loss': 1.2807, 'learning_rate': 4.9988e-05, 'epoch': 0.03}
05/18/2024 08:33:10 - INFO - llamafactory.extras.callbacks - {'loss': 1.2605, 'learning_rate': 4.9988e-05, 'epoch': 0.03}
05/18/2024 08:34:07 - INFO - llamafactory.extras.callbacks - {'loss': 1.2185, 'learning_rate': 4.9988e-05, 'epoch': 0.03}
05/18/2024 08:35:03 - INFO - llamafactory.extras.callbacks - {'loss': 1.2511, 'learning_rate': 4.9987e-05, 'epoch': 0.03}
05/18/2024 08:36:00 - INFO - llamafactory.extras.callbacks - {'loss': 1.2266, 'learning_rate': 4.9987e-05, 'epoch': 0.03}
05/18/2024 08:36:56 - INFO - llamafactory.extras.callbacks - {'loss': 1.2363, 'learning_rate': 4.9987e-05, 'epoch': 0.03}
05/18/2024 08:37:51 - INFO - llamafactory.extras.callbacks - {'loss': 1.2034, 'learning_rate': 4.9986e-05, 'epoch': 0.03}
05/18/2024 08:38:49 - INFO - llamafactory.extras.callbacks - {'loss': 1.2578, 'learning_rate': 4.9986e-05, 'epoch': 0.03}
05/18/2024 08:38:49 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-400
05/18/2024 08:38:50 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 08:38:50 - INFO - transformers.configuration_utils - Model config MistralConfig {
"architectures": [
"MistralForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 32768,
"model_type": "mistral",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"rms_norm_eps": 1e-05,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.40.2",
"use_cache": true,
"vocab_size": 32000
}
05/18/2024 08:38:50 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-400/tokenizer_config.json
05/18/2024 08:38:50 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-400/special_tokens_map.json
05/18/2024 08:39:46 - INFO - llamafactory.extras.callbacks - {'loss': 1.2506, 'learning_rate': 4.9986e-05, 'epoch': 0.03}
05/18/2024 08:40:44 - INFO - llamafactory.extras.callbacks - {'loss': 1.3237, 'learning_rate': 4.9985e-05, 'epoch': 0.03}
05/18/2024 08:41:42 - INFO - llamafactory.extras.callbacks - {'loss': 1.2538, 'learning_rate': 4.9985e-05, 'epoch': 0.03}
05/18/2024 08:42:37 - INFO - llamafactory.extras.callbacks - {'loss': 1.3154, 'learning_rate': 4.9985e-05, 'epoch': 0.03}
05/18/2024 08:43:32 - INFO - llamafactory.extras.callbacks - {'loss': 1.2145, 'learning_rate': 4.9984e-05, 'epoch': 0.03}
05/18/2024 08:44:28 - INFO - llamafactory.extras.callbacks - {'loss': 1.2926, 'learning_rate': 4.9984e-05, 'epoch': 0.03}
05/18/2024 08:45:25 - INFO - llamafactory.extras.callbacks - {'loss': 1.2439, 'learning_rate': 4.9983e-05, 'epoch': 0.03}
05/18/2024 08:46:20 - INFO - llamafactory.extras.callbacks - {'loss': 1.2516, 'learning_rate': 4.9983e-05, 'epoch': 0.04}
05/18/2024 08:47:15 - INFO - llamafactory.extras.callbacks - {'loss': 1.2549, 'learning_rate': 4.9983e-05, 'epoch': 0.04}
05/18/2024 08:48:14 - INFO - llamafactory.extras.callbacks - {'loss': 1.2780, 'learning_rate': 4.9982e-05, 'epoch': 0.04}
05/18/2024 08:49:11 - INFO - llamafactory.extras.callbacks - {'loss': 1.2406, 'learning_rate': 4.9982e-05, 'epoch': 0.04}
05/18/2024 08:50:08 - INFO - llamafactory.extras.callbacks - {'loss': 1.2514, 'learning_rate': 4.9981e-05, 'epoch': 0.04}
05/18/2024 08:51:05 - INFO - llamafactory.extras.callbacks - {'loss': 1.2641, 'learning_rate': 4.9981e-05, 'epoch': 0.04}
05/18/2024 08:52:02 - INFO - llamafactory.extras.callbacks - {'loss': 1.2743, 'learning_rate': 4.9981e-05, 'epoch': 0.04}
05/18/2024 08:52:55 - INFO - llamafactory.extras.callbacks - {'loss': 1.2765, 'learning_rate': 4.9980e-05, 'epoch': 0.04}
05/18/2024 08:53:52 - INFO - llamafactory.extras.callbacks - {'loss': 1.2237, 'learning_rate': 4.9980e-05, 'epoch': 0.04}
05/18/2024 08:54:50 - INFO - llamafactory.extras.callbacks - {'loss': 1.2165, 'learning_rate': 4.9979e-05, 'epoch': 0.04}
05/18/2024 08:55:45 - INFO - llamafactory.extras.callbacks - {'loss': 1.2629, 'learning_rate': 4.9979e-05, 'epoch': 0.04}
05/18/2024 08:56:42 - INFO - llamafactory.extras.callbacks - {'loss': 1.2974, 'learning_rate': 4.9979e-05, 'epoch': 0.04}
05/18/2024 08:57:40 - INFO - llamafactory.extras.callbacks - {'loss': 1.2658, 'learning_rate': 4.9978e-05, 'epoch': 0.04}
05/18/2024 08:57:40 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-500
05/18/2024 08:57:41 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 08:57:41 - INFO - transformers.configuration_utils - Model config MistralConfig {
"architectures": [
"MistralForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 32768,
"model_type": "mistral",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"rms_norm_eps": 1e-05,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.40.2",
"use_cache": true,
"vocab_size": 32000
}
05/18/2024 08:57:41 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-500/tokenizer_config.json
05/18/2024 08:57:41 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-500/special_tokens_map.json
05/18/2024 08:58:36 - INFO - llamafactory.extras.callbacks - {'loss': 1.2300, 'learning_rate': 4.9978e-05, 'epoch': 0.04}
05/18/2024 08:59:33 - INFO - llamafactory.extras.callbacks - {'loss': 1.2660, 'learning_rate': 4.9977e-05, 'epoch': 0.04}
05/18/2024 09:00:31 - INFO - llamafactory.extras.callbacks - {'loss': 1.2253, 'learning_rate': 4.9977e-05, 'epoch': 0.04}
05/18/2024 09:01:25 - INFO - llamafactory.extras.callbacks - {'loss': 1.2137, 'learning_rate': 4.9976e-05, 'epoch': 0.04}
05/18/2024 09:02:21 - INFO - llamafactory.extras.callbacks - {'loss': 1.2829, 'learning_rate': 4.9976e-05, 'epoch': 0.04}
05/18/2024 09:03:17 - INFO - llamafactory.extras.callbacks - {'loss': 1.2531, 'learning_rate': 4.9975e-05, 'epoch': 0.04}
05/18/2024 09:04:12 - INFO - llamafactory.extras.callbacks - {'loss': 1.2991, 'learning_rate': 4.9975e-05, 'epoch': 0.04}
05/18/2024 09:05:08 - INFO - llamafactory.extras.callbacks - {'loss': 1.3188, 'learning_rate': 4.9974e-05, 'epoch': 0.04}
05/18/2024 09:06:04 - INFO - llamafactory.extras.callbacks - {'loss': 1.2815, 'learning_rate': 4.9974e-05, 'epoch': 0.04}
05/18/2024 09:07:02 - INFO - llamafactory.extras.callbacks - {'loss': 1.2606, 'learning_rate': 4.9973e-05, 'epoch': 0.04}
05/18/2024 09:08:01 - INFO - llamafactory.extras.callbacks - {'loss': 1.2369, 'learning_rate': 4.9973e-05, 'epoch': 0.04}
05/18/2024 09:08:58 - INFO - llamafactory.extras.callbacks - {'loss': 1.2351, 'learning_rate': 4.9972e-05, 'epoch': 0.04}
05/18/2024 09:09:56 - INFO - llamafactory.extras.callbacks - {'loss': 1.2210, 'learning_rate': 4.9972e-05, 'epoch': 0.05}
05/18/2024 09:10:52 - INFO - llamafactory.extras.callbacks - {'loss': 1.1822, 'learning_rate': 4.9972e-05, 'epoch': 0.05}
05/18/2024 09:11:49 - INFO - llamafactory.extras.callbacks - {'loss': 1.2456, 'learning_rate': 4.9971e-05, 'epoch': 0.05}
05/18/2024 09:12:46 - INFO - llamafactory.extras.callbacks - {'loss': 1.2186, 'learning_rate': 4.9970e-05, 'epoch': 0.05}
05/18/2024 09:13:41 - INFO - llamafactory.extras.callbacks - {'loss': 1.2369, 'learning_rate': 4.9970e-05, 'epoch': 0.05}
05/18/2024 09:14:38 - INFO - llamafactory.extras.callbacks - {'loss': 1.2597, 'learning_rate': 4.9969e-05, 'epoch': 0.05}
05/18/2024 09:15:34 - INFO - llamafactory.extras.callbacks - {'loss': 1.1965, 'learning_rate': 4.9969e-05, 'epoch': 0.05}
05/18/2024 09:16:31 - INFO - llamafactory.extras.callbacks - {'loss': 1.2345, 'learning_rate': 4.9968e-05, 'epoch': 0.05}
05/18/2024 09:16:31 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-600
05/18/2024 09:16:31 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 09:16:31 - INFO - transformers.configuration_utils - Model config MistralConfig {
"architectures": [
"MistralForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 32768,
"model_type": "mistral",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"rms_norm_eps": 1e-05,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.40.2",
"use_cache": true,
"vocab_size": 32000
}
05/18/2024 09:16:31 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-600/tokenizer_config.json
05/18/2024 09:16:31 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-600/special_tokens_map.json
05/18/2024 09:17:28 - INFO - llamafactory.extras.callbacks - {'loss': 1.2480, 'learning_rate': 4.9968e-05, 'epoch': 0.05}
05/18/2024 09:18:23 - INFO - llamafactory.extras.callbacks - {'loss': 1.2504, 'learning_rate': 4.9967e-05, 'epoch': 0.05}
05/18/2024 09:19:19 - INFO - llamafactory.extras.callbacks - {'loss': 1.3122, 'learning_rate': 4.9967e-05, 'epoch': 0.05}
05/18/2024 09:20:15 - INFO - llamafactory.extras.callbacks - {'loss': 1.2508, 'learning_rate': 4.9966e-05, 'epoch': 0.05}
05/18/2024 09:21:13 - INFO - llamafactory.extras.callbacks - {'loss': 1.2145, 'learning_rate': 4.9966e-05, 'epoch': 0.05}
05/18/2024 09:22:10 - INFO - llamafactory.extras.callbacks - {'loss': 1.2779, 'learning_rate': 4.9965e-05, 'epoch': 0.05}
05/18/2024 09:23:07 - INFO - llamafactory.extras.callbacks - {'loss': 1.2722, 'learning_rate': 4.9965e-05, 'epoch': 0.05}
05/18/2024 09:24:02 - INFO - llamafactory.extras.callbacks - {'loss': 1.2182, 'learning_rate': 4.9964e-05, 'epoch': 0.05}
05/18/2024 09:24:57 - INFO - llamafactory.extras.callbacks - {'loss': 1.2368, 'learning_rate': 4.9964e-05, 'epoch': 0.05}
05/18/2024 09:25:56 - INFO - llamafactory.extras.callbacks - {'loss': 1.2447, 'learning_rate': 4.9963e-05, 'epoch': 0.05}
05/18/2024 09:26:55 - INFO - llamafactory.extras.callbacks - {'loss': 1.1895, 'learning_rate': 4.9962e-05, 'epoch': 0.05}
05/18/2024 09:27:53 - INFO - llamafactory.extras.callbacks - {'loss': 1.2276, 'learning_rate': 4.9962e-05, 'epoch': 0.05}
05/18/2024 09:28:51 - INFO - llamafactory.extras.callbacks - {'loss': 1.2308, 'learning_rate': 4.9961e-05, 'epoch': 0.05}
05/18/2024 09:29:48 - INFO - llamafactory.extras.callbacks - {'loss': 1.2044, 'learning_rate': 4.9961e-05, 'epoch': 0.05}
05/18/2024 09:30:45 - INFO - llamafactory.extras.callbacks - {'loss': 1.2071, 'learning_rate': 4.9960e-05, 'epoch': 0.05}
05/18/2024 09:31:41 - INFO - llamafactory.extras.callbacks - {'loss': 1.2218, 'learning_rate': 4.9959e-05, 'epoch': 0.05}
05/18/2024 09:32:37 - INFO - llamafactory.extras.callbacks - {'loss': 1.2287, 'learning_rate': 4.9959e-05, 'epoch': 0.05}
05/18/2024 09:33:36 - INFO - llamafactory.extras.callbacks - {'loss': 1.2153, 'learning_rate': 4.9958e-05, 'epoch': 0.06}
05/18/2024 09:34:31 - INFO - llamafactory.extras.callbacks - {'loss': 1.2849, 'learning_rate': 4.9958e-05, 'epoch': 0.06}
05/18/2024 09:35:27 - INFO - llamafactory.extras.callbacks - {'loss': 1.1965, 'learning_rate': 4.9957e-05, 'epoch': 0.06}
05/18/2024 09:35:27 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-700
05/18/2024 09:35:28 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 09:35:28 - INFO - transformers.configuration_utils - Model config MistralConfig {
"architectures": [
"MistralForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 32768,
"model_type": "mistral",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"rms_norm_eps": 1e-05,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.40.2",
"use_cache": true,
"vocab_size": 32000
}
05/18/2024 09:35:28 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-700/tokenizer_config.json
05/18/2024 09:35:28 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-700/special_tokens_map.json
05/18/2024 09:36:25 - INFO - llamafactory.extras.callbacks - {'loss': 1.2489, 'learning_rate': 4.9956e-05, 'epoch': 0.06}
05/18/2024 09:37:20 - INFO - llamafactory.extras.callbacks - {'loss': 1.1833, 'learning_rate': 4.9956e-05, 'epoch': 0.06}
05/18/2024 09:38:15 - INFO - llamafactory.extras.callbacks - {'loss': 1.2699, 'learning_rate': 4.9955e-05, 'epoch': 0.06}
05/18/2024 09:39:11 - INFO - llamafactory.extras.callbacks - {'loss': 1.2582, 'learning_rate': 4.9955e-05, 'epoch': 0.06}
05/18/2024 09:40:06 - INFO - llamafactory.extras.callbacks - {'loss': 1.2256, 'learning_rate': 4.9954e-05, 'epoch': 0.06}
05/18/2024 09:41:03 - INFO - llamafactory.extras.callbacks - {'loss': 1.1909, 'learning_rate': 4.9953e-05, 'epoch': 0.06}
05/18/2024 09:41:58 - INFO - llamafactory.extras.callbacks - {'loss': 1.2023, 'learning_rate': 4.9953e-05, 'epoch': 0.06}
05/18/2024 09:42:51 - INFO - llamafactory.extras.callbacks - {'loss': 1.2066, 'learning_rate': 4.9952e-05, 'epoch': 0.06}
05/18/2024 09:43:48 - INFO - llamafactory.extras.callbacks - {'loss': 1.2038, 'learning_rate': 4.9951e-05, 'epoch': 0.06}
05/18/2024 09:44:45 - INFO - llamafactory.extras.callbacks - {'loss': 1.2509, 'learning_rate': 4.9951e-05, 'epoch': 0.06}
05/18/2024 09:45:43 - INFO - llamafactory.extras.callbacks - {'loss': 1.2952, 'learning_rate': 4.9950e-05, 'epoch': 0.06}
05/18/2024 09:46:40 - INFO - llamafactory.extras.callbacks - {'loss': 1.1556, 'learning_rate': 4.9949e-05, 'epoch': 0.06}
05/18/2024 09:47:38 - INFO - llamafactory.extras.callbacks - {'loss': 1.1890, 'learning_rate': 4.9949e-05, 'epoch': 0.06}
05/18/2024 09:48:34 - INFO - llamafactory.extras.callbacks - {'loss': 1.1876, 'learning_rate': 4.9948e-05, 'epoch': 0.06}
05/18/2024 09:49:31 - INFO - llamafactory.extras.callbacks - {'loss': 1.2369, 'learning_rate': 4.9947e-05, 'epoch': 0.06}
05/18/2024 09:50:29 - INFO - llamafactory.extras.callbacks - {'loss': 1.2140, 'learning_rate': 4.9947e-05, 'epoch': 0.06}
05/18/2024 09:51:23 - INFO - llamafactory.extras.callbacks - {'loss': 1.2118, 'learning_rate': 4.9946e-05, 'epoch': 0.06}
05/18/2024 09:52:19 - INFO - llamafactory.extras.callbacks - {'loss': 1.2230, 'learning_rate': 4.9945e-05, 'epoch': 0.06}
05/18/2024 09:53:18 - INFO - llamafactory.extras.callbacks - {'loss': 1.1949, 'learning_rate': 4.9945e-05, 'epoch': 0.06}
05/18/2024 09:54:17 - INFO - llamafactory.extras.callbacks - {'loss': 1.2275, 'learning_rate': 4.9944e-05, 'epoch': 0.06}
05/18/2024 09:54:17 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-800
05/18/2024 09:54:17 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 09:54:17 - INFO - transformers.configuration_utils - Model config MistralConfig {
"architectures": [
"MistralForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 32768,
"model_type": "mistral",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"rms_norm_eps": 1e-05,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.40.2",
"use_cache": true,
"vocab_size": 32000
}
05/18/2024 09:54:17 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-800/tokenizer_config.json
05/18/2024 09:54:17 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-800/special_tokens_map.json
05/18/2024 09:55:12 - INFO - llamafactory.extras.callbacks - {'loss': 1.1766, 'learning_rate': 4.9943e-05, 'epoch': 0.06}
05/18/2024 09:56:10 - INFO - llamafactory.extras.callbacks - {'loss': 1.2238, 'learning_rate': 4.9942e-05, 'epoch': 0.06}
05/18/2024 09:57:09 - INFO - llamafactory.extras.callbacks - {'loss': 1.2024, 'learning_rate': 4.9942e-05, 'epoch': 0.07}
05/18/2024 09:58:07 - INFO - llamafactory.extras.callbacks - {'loss': 1.2610, 'learning_rate': 4.9941e-05, 'epoch': 0.07}
05/18/2024 09:59:04 - INFO - llamafactory.extras.callbacks - {'loss': 1.2248, 'learning_rate': 4.9940e-05, 'epoch': 0.07}
05/18/2024 09:59:59 - INFO - llamafactory.extras.callbacks - {'loss': 1.2160, 'learning_rate': 4.9940e-05, 'epoch': 0.07}
05/18/2024 10:00:55 - INFO - llamafactory.extras.callbacks - {'loss': 1.1822, 'learning_rate': 4.9939e-05, 'epoch': 0.07}
05/18/2024 10:01:53 - INFO - llamafactory.extras.callbacks - {'loss': 1.2780, 'learning_rate': 4.9938e-05, 'epoch': 0.07}
05/18/2024 10:02:49 - INFO - llamafactory.extras.callbacks - {'loss': 1.2005, 'learning_rate': 4.9937e-05, 'epoch': 0.07}
05/18/2024 10:03:45 - INFO - llamafactory.extras.callbacks - {'loss': 1.1899, 'learning_rate': 4.9937e-05, 'epoch': 0.07}
05/18/2024 10:04:42 - INFO - llamafactory.extras.callbacks - {'loss': 1.2190, 'learning_rate': 4.9936e-05, 'epoch': 0.07}
05/18/2024 10:05:40 - INFO - llamafactory.extras.callbacks - {'loss': 1.2443, 'learning_rate': 4.9935e-05, 'epoch': 0.07}
05/18/2024 10:06:37 - INFO - llamafactory.extras.callbacks - {'loss': 1.2262, 'learning_rate': 4.9934e-05, 'epoch': 0.07}
05/18/2024 10:07:33 - INFO - llamafactory.extras.callbacks - {'loss': 1.2437, 'learning_rate': 4.9934e-05, 'epoch': 0.07}
05/18/2024 10:08:30 - INFO - llamafactory.extras.callbacks - {'loss': 1.2757, 'learning_rate': 4.9933e-05, 'epoch': 0.07}
05/18/2024 10:09:27 - INFO - llamafactory.extras.callbacks - {'loss': 1.1969, 'learning_rate': 4.9932e-05, 'epoch': 0.07}
05/18/2024 10:10:26 - INFO - llamafactory.extras.callbacks - {'loss': 1.2608, 'learning_rate': 4.9931e-05, 'epoch': 0.07}
05/18/2024 10:11:22 - INFO - llamafactory.extras.callbacks - {'loss': 1.2101, 'learning_rate': 4.9931e-05, 'epoch': 0.07}
05/18/2024 10:12:17 - INFO - llamafactory.extras.callbacks - {'loss': 1.2782, 'learning_rate': 4.9930e-05, 'epoch': 0.07}
05/18/2024 10:13:13 - INFO - llamafactory.extras.callbacks - {'loss': 1.1617, 'learning_rate': 4.9929e-05, 'epoch': 0.07}
05/18/2024 10:13:13 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-900
05/18/2024 10:13:14 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 10:13:14 - INFO - transformers.configuration_utils - Model config MistralConfig {
"architectures": [
"MistralForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 32768,
"model_type": "mistral",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"rms_norm_eps": 1e-05,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.40.2",
"use_cache": true,
"vocab_size": 32000
}
05/18/2024 10:13:14 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-900/tokenizer_config.json
05/18/2024 10:13:14 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-900/special_tokens_map.json
05/18/2024 10:14:13 - INFO - llamafactory.extras.callbacks - {'loss': 1.2491, 'learning_rate': 4.9928e-05, 'epoch': 0.07}
05/18/2024 10:15:09 - INFO - llamafactory.extras.callbacks - {'loss': 1.2323, 'learning_rate': 4.9927e-05, 'epoch': 0.07}
05/18/2024 10:16:07 - INFO - llamafactory.extras.callbacks - {'loss': 1.2315, 'learning_rate': 4.9927e-05, 'epoch': 0.07}
05/18/2024 10:17:03 - INFO - llamafactory.extras.callbacks - {'loss': 1.2285, 'learning_rate': 4.9926e-05, 'epoch': 0.07}
05/18/2024 10:17:58 - INFO - llamafactory.extras.callbacks - {'loss': 1.2123, 'learning_rate': 4.9925e-05, 'epoch': 0.07}
05/18/2024 10:18:54 - INFO - llamafactory.extras.callbacks - {'loss': 1.2318, 'learning_rate': 4.9924e-05, 'epoch': 0.07}
05/18/2024 10:19:50 - INFO - llamafactory.extras.callbacks - {'loss': 1.1746, 'learning_rate': 4.9923e-05, 'epoch': 0.07}
05/18/2024 10:20:46 - INFO - llamafactory.extras.callbacks - {'loss': 1.2146, 'learning_rate': 4.9923e-05, 'epoch': 0.08}
05/18/2024 10:21:42 - INFO - llamafactory.extras.callbacks - {'loss': 1.2061, 'learning_rate': 4.9922e-05, 'epoch': 0.08}
05/18/2024 10:22:40 - INFO - llamafactory.extras.callbacks - {'loss': 1.1909, 'learning_rate': 4.9921e-05, 'epoch': 0.08}
05/18/2024 10:23:35 - INFO - llamafactory.extras.callbacks - {'loss': 1.2249, 'learning_rate': 4.9920e-05, 'epoch': 0.08}
05/18/2024 10:24:33 - INFO - llamafactory.extras.callbacks - {'loss': 1.2154, 'learning_rate': 4.9919e-05, 'epoch': 0.08}
05/18/2024 10:25:30 - INFO - llamafactory.extras.callbacks - {'loss': 1.2488, 'learning_rate': 4.9918e-05, 'epoch': 0.08}
05/18/2024 10:26:27 - INFO - llamafactory.extras.callbacks - {'loss': 1.1906, 'learning_rate': 4.9918e-05, 'epoch': 0.08}
05/18/2024 10:27:25 - INFO - llamafactory.extras.callbacks - {'loss': 1.2324, 'learning_rate': 4.9917e-05, 'epoch': 0.08}
05/18/2024 10:28:22 - INFO - llamafactory.extras.callbacks - {'loss': 1.2576, 'learning_rate': 4.9916e-05, 'epoch': 0.08}
05/18/2024 10:29:18 - INFO - llamafactory.extras.callbacks - {'loss': 1.2846, 'learning_rate': 4.9915e-05, 'epoch': 0.08}
05/18/2024 10:30:14 - INFO - llamafactory.extras.callbacks - {'loss': 1.2192, 'learning_rate': 4.9914e-05, 'epoch': 0.08}
05/18/2024 10:31:11 - INFO - llamafactory.extras.callbacks - {'loss': 1.2622, 'learning_rate': 4.9913e-05, 'epoch': 0.08}
05/18/2024 10:32:06 - INFO - llamafactory.extras.callbacks - {'loss': 1.2302, 'learning_rate': 4.9912e-05, 'epoch': 0.08}
05/18/2024 10:32:06 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-1000
05/18/2024 10:32:16 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-1000/tokenizer_config.json
05/18/2024 10:32:16 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-1000/special_tokens_map.json
05/18/2024 10:33:14 - INFO - llamafactory.extras.callbacks - {'loss': 1.2561, 'learning_rate': 4.9911e-05, 'epoch': 0.08}
05/18/2024 10:34:10 - INFO - llamafactory.extras.callbacks - {'loss': 1.2230, 'learning_rate': 4.9911e-05, 'epoch': 0.08}
05/18/2024 10:35:08 - INFO - llamafactory.extras.callbacks - {'loss': 1.2002, 'learning_rate': 4.9910e-05, 'epoch': 0.08}
05/18/2024 10:36:05 - INFO - llamafactory.extras.callbacks - {'loss': 1.2095, 'learning_rate': 4.9909e-05, 'epoch': 0.08}
05/18/2024 10:37:01 - INFO - llamafactory.extras.callbacks - {'loss': 1.2368, 'learning_rate': 4.9908e-05, 'epoch': 0.08}
05/18/2024 10:37:57 - INFO - llamafactory.extras.callbacks - {'loss': 1.2379, 'learning_rate': 4.9907e-05, 'epoch': 0.08}
05/18/2024 10:38:57 - INFO - llamafactory.extras.callbacks - {'loss': 1.1947, 'learning_rate': 4.9906e-05, 'epoch': 0.08}
05/18/2024 10:39:53 - INFO - llamafactory.extras.callbacks - {'loss': 1.2111, 'learning_rate': 4.9905e-05, 'epoch': 0.08}
05/18/2024 10:40:54 - INFO - llamafactory.extras.callbacks - {'loss': 1.2507, 'learning_rate': 4.9904e-05, 'epoch': 0.08}
05/18/2024 10:41:52 - INFO - llamafactory.extras.callbacks - {'loss': 1.2333, 'learning_rate': 4.9903e-05, 'epoch': 0.08}
05/18/2024 10:42:48 - INFO - llamafactory.extras.callbacks - {'loss': 1.2072, 'learning_rate': 4.9902e-05, 'epoch': 0.08}
05/18/2024 10:43:47 - INFO - llamafactory.extras.callbacks - {'loss': 1.1987, 'learning_rate': 4.9901e-05, 'epoch': 0.08}
05/18/2024 10:44:44 - INFO - llamafactory.extras.callbacks - {'loss': 1.2305, 'learning_rate': 4.9901e-05, 'epoch': 0.09}
05/18/2024 10:45:40 - INFO - llamafactory.extras.callbacks - {'loss': 1.1646, 'learning_rate': 4.9900e-05, 'epoch': 0.09}
05/18/2024 10:46:37 - INFO - llamafactory.extras.callbacks - {'loss': 1.1999, 'learning_rate': 4.9899e-05, 'epoch': 0.09}
05/18/2024 10:47:33 - INFO - llamafactory.extras.callbacks - {'loss': 1.2169, 'learning_rate': 4.9898e-05, 'epoch': 0.09}
05/18/2024 10:48:28 - INFO - llamafactory.extras.callbacks - {'loss': 1.2711, 'learning_rate': 4.9897e-05, 'epoch': 0.09}
05/18/2024 10:49:28 - INFO - llamafactory.extras.callbacks - {'loss': 1.1496, 'learning_rate': 4.9896e-05, 'epoch': 0.09}
05/18/2024 10:50:24 - INFO - llamafactory.extras.callbacks - {'loss': 1.2445, 'learning_rate': 4.9895e-05, 'epoch': 0.09}
05/18/2024 10:51:21 - INFO - llamafactory.extras.callbacks - {'loss': 1.1969, 'learning_rate': 4.9894e-05, 'epoch': 0.09}
05/18/2024 10:51:21 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-1100
05/18/2024 10:51:31 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-1100/tokenizer_config.json
05/18/2024 10:51:31 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-1100/special_tokens_map.json
05/18/2024 10:52:25 - INFO - llamafactory.extras.callbacks - {'loss': 1.1862, 'learning_rate': 4.9893e-05, 'epoch': 0.09}
05/18/2024 10:53:24 - INFO - llamafactory.extras.callbacks - {'loss': 1.2265, 'learning_rate': 4.9892e-05, 'epoch': 0.09}
05/18/2024 10:54:21 - INFO - llamafactory.extras.callbacks - {'loss': 1.2077, 'learning_rate': 4.9891e-05, 'epoch': 0.09}
05/18/2024 10:55:19 - INFO - llamafactory.extras.callbacks - {'loss': 1.1972, 'learning_rate': 4.9890e-05, 'epoch': 0.09}
05/18/2024 10:56:16 - INFO - llamafactory.extras.callbacks - {'loss': 1.2433, 'learning_rate': 4.9889e-05, 'epoch': 0.09}
05/18/2024 10:57:15 - INFO - llamafactory.extras.callbacks - {'loss': 1.2143, 'learning_rate': 4.9888e-05, 'epoch': 0.09}
05/18/2024 10:58:12 - INFO - llamafactory.extras.callbacks - {'loss': 1.2113, 'learning_rate': 4.9887e-05, 'epoch': 0.09}
05/18/2024 10:59:09 - INFO - llamafactory.extras.callbacks - {'loss': 1.1745, 'learning_rate': 4.9886e-05, 'epoch': 0.09}
05/18/2024 11:00:05 - INFO - llamafactory.extras.callbacks - {'loss': 1.2708, 'learning_rate': 4.9885e-05, 'epoch': 0.09}
05/18/2024 11:01:02 - INFO - llamafactory.extras.callbacks - {'loss': 1.2398, 'learning_rate': 4.9884e-05, 'epoch': 0.09}
05/18/2024 11:01:59 - INFO - llamafactory.extras.callbacks - {'loss': 1.2493, 'learning_rate': 4.9883e-05, 'epoch': 0.09}
05/18/2024 11:02:54 - INFO - llamafactory.extras.callbacks - {'loss': 1.2151, 'learning_rate': 4.9882e-05, 'epoch': 0.09}
05/18/2024 11:03:53 - INFO - llamafactory.extras.callbacks - {'loss': 1.2635, 'learning_rate': 4.9881e-05, 'epoch': 0.09}
05/18/2024 11:04:50 - INFO - llamafactory.extras.callbacks - {'loss': 1.2130, 'learning_rate': 4.9880e-05, 'epoch': 0.09}
05/18/2024 11:05:45 - INFO - llamafactory.extras.callbacks - {'loss': 1.2490, 'learning_rate': 4.9879e-05, 'epoch': 0.09}
05/18/2024 11:06:44 - INFO - llamafactory.extras.callbacks - {'loss': 1.2450, 'learning_rate': 4.9878e-05, 'epoch': 0.09}
05/18/2024 11:07:42 - INFO - llamafactory.extras.callbacks - {'loss': 1.1596, 'learning_rate': 4.9877e-05, 'epoch': 0.09}
05/18/2024 11:08:40 - INFO - llamafactory.extras.callbacks - {'loss': 1.1932, 'learning_rate': 4.9876e-05, 'epoch': 0.10}
05/18/2024 11:09:35 - INFO - llamafactory.extras.callbacks - {'loss': 1.1660, 'learning_rate': 4.9875e-05, 'epoch': 0.10}
05/18/2024 11:10:34 - INFO - llamafactory.extras.callbacks - {'loss': 1.2109, 'learning_rate': 4.9874e-05, 'epoch': 0.10}
05/18/2024 11:10:34 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-1200
05/18/2024 11:10:44 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-1200/tokenizer_config.json
05/18/2024 11:10:44 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-1200/special_tokens_map.json
05/18/2024 11:11:43 - INFO - llamafactory.extras.callbacks - {'loss': 1.2005, 'learning_rate': 4.9873e-05, 'epoch': 0.10}
05/18/2024 11:12:40 - INFO - llamafactory.extras.callbacks - {'loss': 1.2178, 'learning_rate': 4.9872e-05, 'epoch': 0.10}
05/18/2024 11:13:38 - INFO - llamafactory.extras.callbacks - {'loss': 1.1837, 'learning_rate': 4.9871e-05, 'epoch': 0.10}
05/18/2024 11:14:34 - INFO - llamafactory.extras.callbacks - {'loss': 1.2051, 'learning_rate': 4.9870e-05, 'epoch': 0.10}
05/18/2024 11:15:30 - INFO - llamafactory.extras.callbacks - {'loss': 1.1946, 'learning_rate': 4.9868e-05, 'epoch': 0.10}
05/18/2024 11:16:26 - INFO - llamafactory.extras.callbacks - {'loss': 1.2150, 'learning_rate': 4.9867e-05, 'epoch': 0.10}
05/18/2024 11:17:25 - INFO - llamafactory.extras.callbacks - {'loss': 1.2124, 'learning_rate': 4.9866e-05, 'epoch': 0.10}
05/18/2024 11:18:22 - INFO - llamafactory.extras.callbacks - {'loss': 1.2048, 'learning_rate': 4.9865e-05, 'epoch': 0.10}
05/18/2024 11:19:18 - INFO - llamafactory.extras.callbacks - {'loss': 1.1847, 'learning_rate': 4.9864e-05, 'epoch': 0.10}
05/18/2024 11:20:16 - INFO - llamafactory.extras.callbacks - {'loss': 1.2010, 'learning_rate': 4.9863e-05, 'epoch': 0.10}
05/18/2024 11:21:11 - INFO - llamafactory.extras.callbacks - {'loss': 1.2208, 'learning_rate': 4.9862e-05, 'epoch': 0.10}
05/18/2024 11:22:07 - INFO - llamafactory.extras.callbacks - {'loss': 1.3027, 'learning_rate': 4.9861e-05, 'epoch': 0.10}
05/18/2024 11:23:05 - INFO - llamafactory.extras.callbacks - {'loss': 1.2100, 'learning_rate': 4.9860e-05, 'epoch': 0.10}
05/18/2024 11:23:58 - INFO - llamafactory.extras.callbacks - {'loss': 1.2278, 'learning_rate': 4.9859e-05, 'epoch': 0.10}
05/18/2024 11:24:55 - INFO - llamafactory.extras.callbacks - {'loss': 1.2594, 'learning_rate': 4.9858e-05, 'epoch': 0.10}
05/18/2024 11:25:55 - INFO - llamafactory.extras.callbacks - {'loss': 1.1852, 'learning_rate': 4.9856e-05, 'epoch': 0.10}
05/18/2024 11:26:50 - INFO - llamafactory.extras.callbacks - {'loss': 1.2257, 'learning_rate': 4.9855e-05, 'epoch': 0.10}
05/18/2024 11:27:45 - INFO - llamafactory.extras.callbacks - {'loss': 1.1816, 'learning_rate': 4.9854e-05, 'epoch': 0.10}
05/18/2024 11:28:41 - INFO - llamafactory.extras.callbacks - {'loss': 1.2328, 'learning_rate': 4.9853e-05, 'epoch': 0.10}
05/18/2024 11:29:36 - INFO - llamafactory.extras.callbacks - {'loss': 1.3235, 'learning_rate': 4.9852e-05, 'epoch': 0.10}
05/18/2024 11:29:36 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-1300
05/18/2024 11:29:47 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-1300/tokenizer_config.json
05/18/2024 11:29:47 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-1300/special_tokens_map.json
05/18/2024 11:30:41 - INFO - llamafactory.extras.callbacks - {'loss': 1.2024, 'learning_rate': 4.9851e-05, 'epoch': 0.10}
05/18/2024 11:31:40 - INFO - llamafactory.extras.callbacks - {'loss': 1.2577, 'learning_rate': 4.9850e-05, 'epoch': 0.10}
05/18/2024 11:32:39 - INFO - llamafactory.extras.callbacks - {'loss': 1.1717, 'learning_rate': 4.9848e-05, 'epoch': 0.11}
05/18/2024 11:33:36 - INFO - llamafactory.extras.callbacks - {'loss': 1.2002, 'learning_rate': 4.9847e-05, 'epoch': 0.11}
05/18/2024 11:34:31 - INFO - llamafactory.extras.callbacks - {'loss': 1.2273, 'learning_rate': 4.9846e-05, 'epoch': 0.11}
05/18/2024 11:35:26 - INFO - llamafactory.extras.callbacks - {'loss': 1.1932, 'learning_rate': 4.9845e-05, 'epoch': 0.11}
05/18/2024 11:36:23 - INFO - llamafactory.extras.callbacks - {'loss': 1.2070, 'learning_rate': 4.9844e-05, 'epoch': 0.11}
05/18/2024 11:37:20 - INFO - llamafactory.extras.callbacks - {'loss': 1.1626, 'learning_rate': 4.9843e-05, 'epoch': 0.11}
05/18/2024 11:38:17 - INFO - llamafactory.extras.callbacks - {'loss': 1.2294, 'learning_rate': 4.9841e-05, 'epoch': 0.11}
05/18/2024 11:39:13 - INFO - llamafactory.extras.callbacks - {'loss': 1.2109, 'learning_rate': 4.9840e-05, 'epoch': 0.11}
05/18/2024 11:40:11 - INFO - llamafactory.extras.callbacks - {'loss': 1.2104, 'learning_rate': 4.9839e-05, 'epoch': 0.11}
05/18/2024 11:41:08 - INFO - llamafactory.extras.callbacks - {'loss': 1.1762, 'learning_rate': 4.9838e-05, 'epoch': 0.11}
05/18/2024 11:42:05 - INFO - llamafactory.extras.callbacks - {'loss': 1.2136, 'learning_rate': 4.9837e-05, 'epoch': 0.11}
05/18/2024 11:43:00 - INFO - llamafactory.extras.callbacks - {'loss': 1.2212, 'learning_rate': 4.9836e-05, 'epoch': 0.11}
05/18/2024 11:43:56 - INFO - llamafactory.extras.callbacks - {'loss': 1.1812, 'learning_rate': 4.9834e-05, 'epoch': 0.11}
05/18/2024 11:44:53 - INFO - llamafactory.extras.callbacks - {'loss': 1.2175, 'learning_rate': 4.9833e-05, 'epoch': 0.11}
05/18/2024 11:45:49 - INFO - llamafactory.extras.callbacks - {'loss': 1.2099, 'learning_rate': 4.9832e-05, 'epoch': 0.11}
05/18/2024 11:46:44 - INFO - llamafactory.extras.callbacks - {'loss': 1.2313, 'learning_rate': 4.9831e-05, 'epoch': 0.11}
05/18/2024 11:47:39 - INFO - llamafactory.extras.callbacks - {'loss': 1.2507, 'learning_rate': 4.9829e-05, 'epoch': 0.11}
05/18/2024 11:48:36 - INFO - llamafactory.extras.callbacks - {'loss': 1.1767, 'learning_rate': 4.9828e-05, 'epoch': 0.11}
05/18/2024 11:48:36 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-1400
05/18/2024 11:48:37 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 11:48:37 - INFO - transformers.configuration_utils - Model config MistralConfig {
"architectures": [
"MistralForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 32768,
"model_type": "mistral",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"rms_norm_eps": 1e-05,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.40.2",
"use_cache": true,
"vocab_size": 32000
}
05/18/2024 11:48:37 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-1400/tokenizer_config.json
05/18/2024 11:48:37 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-1400/special_tokens_map.json
05/18/2024 11:49:35 - INFO - llamafactory.extras.callbacks - {'loss': 1.2055, 'learning_rate': 4.9827e-05, 'epoch': 0.11}
05/18/2024 11:50:30 - INFO - llamafactory.extras.callbacks - {'loss': 1.1826, 'learning_rate': 4.9826e-05, 'epoch': 0.11}
05/18/2024 11:51:29 - INFO - llamafactory.extras.callbacks - {'loss': 1.2034, 'learning_rate': 4.9825e-05, 'epoch': 0.11}
05/18/2024 11:52:28 - INFO - llamafactory.extras.callbacks - {'loss': 1.1955, 'learning_rate': 4.9823e-05, 'epoch': 0.11}
05/18/2024 11:53:24 - INFO - llamafactory.extras.callbacks - {'loss': 1.2228, 'learning_rate': 4.9822e-05, 'epoch': 0.11}
05/18/2024 11:54:20 - INFO - llamafactory.extras.callbacks - {'loss': 1.1827, 'learning_rate': 4.9821e-05, 'epoch': 0.11}
05/18/2024 11:55:15 - INFO - llamafactory.extras.callbacks - {'loss': 1.2226, 'learning_rate': 4.9820e-05, 'epoch': 0.11}
05/18/2024 11:56:13 - INFO - llamafactory.extras.callbacks - {'loss': 1.2594, 'learning_rate': 4.9818e-05, 'epoch': 0.12}
05/18/2024 11:57:09 - INFO - llamafactory.extras.callbacks - {'loss': 1.2773, 'learning_rate': 4.9817e-05, 'epoch': 0.12}
05/18/2024 11:58:07 - INFO - llamafactory.extras.callbacks - {'loss': 1.1894, 'learning_rate': 4.9816e-05, 'epoch': 0.12}
05/18/2024 11:59:03 - INFO - llamafactory.extras.callbacks - {'loss': 1.2409, 'learning_rate': 4.9815e-05, 'epoch': 0.12}
05/18/2024 11:59:58 - INFO - llamafactory.extras.callbacks - {'loss': 1.2251, 'learning_rate': 4.9813e-05, 'epoch': 0.12}
05/18/2024 12:00:55 - INFO - llamafactory.extras.callbacks - {'loss': 1.1519, 'learning_rate': 4.9812e-05, 'epoch': 0.12}
05/18/2024 12:01:50 - INFO - llamafactory.extras.callbacks - {'loss': 1.2432, 'learning_rate': 4.9811e-05, 'epoch': 0.12}
05/18/2024 12:02:47 - INFO - llamafactory.extras.callbacks - {'loss': 1.1641, 'learning_rate': 4.9809e-05, 'epoch': 0.12}
05/18/2024 12:03:41 - INFO - llamafactory.extras.callbacks - {'loss': 1.2464, 'learning_rate': 4.9808e-05, 'epoch': 0.12}
05/18/2024 12:04:37 - INFO - llamafactory.extras.callbacks - {'loss': 1.2007, 'learning_rate': 4.9807e-05, 'epoch': 0.12}
05/18/2024 12:05:32 - INFO - llamafactory.extras.callbacks - {'loss': 1.1925, 'learning_rate': 4.9805e-05, 'epoch': 0.12}
05/18/2024 12:06:28 - INFO - llamafactory.extras.callbacks - {'loss': 1.1775, 'learning_rate': 4.9804e-05, 'epoch': 0.12}
05/18/2024 12:07:25 - INFO - llamafactory.extras.callbacks - {'loss': 1.2813, 'learning_rate': 4.9803e-05, 'epoch': 0.12}
05/18/2024 12:07:25 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-1500
05/18/2024 12:07:25 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 12:07:25 - INFO - transformers.configuration_utils - Model config MistralConfig {
"architectures": [
"MistralForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 32768,
"model_type": "mistral",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"rms_norm_eps": 1e-05,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.40.2",
"use_cache": true,
"vocab_size": 32000
}
05/18/2024 12:07:25 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-1500/tokenizer_config.json
05/18/2024 12:07:25 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-1500/special_tokens_map.json
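
The llamafactory.extras.callbacks lines all follow the same "{'loss': ..., 'learning_rate': ..., 'epoch': ...}" pattern, so the loss and learning-rate history can be recovered from this file directly. A small parsing sketch; the log file name is hypothetical:

```python
# Pull (loss, learning_rate, epoch) triples out of the callback lines above.
import re

pattern = re.compile(r"\{'loss': ([\d.]+), 'learning_rate': ([\d.e-]+), 'epoch': ([\d.]+)\}")
records = []
with open("train.log") as f:                     # hypothetical path to this log file
    for line in f:
        m = pattern.search(line)
        if m:
            records.append(tuple(float(g) for g in m.groups()))

if records:
    loss, lr, epoch = records[-1]
    print(f"{len(records)} logged points, last: loss={loss:.4f} lr={lr:.2e} epoch={epoch}")
```
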
05/18/2024 12:08:22 - INFO - llamafactory.extras.callbacks - {'loss': 1.2213, 'learning_rate': 4.9802e-05, 'epoch': 0.12}
05/18/2024 12:09:19 - INFO - llamafactory.extras.callbacks - {'loss': 1.1459, 'learning_rate': 4.9800e-05, 'epoch': 0.12}
05/18/2024 12:10:14 - INFO - llamafactory.extras.callbacks - {'loss': 1.1575, 'learning_rate': 4.9799e-05, 'epoch': 0.12}
05/18/2024 12:11:11 - INFO - llamafactory.extras.callbacks - {'loss': 1.1516, 'learning_rate': 4.9798e-05, 'epoch': 0.12}
05/18/2024 12:12:07 - INFO - llamafactory.extras.callbacks - {'loss': 1.2135, 'learning_rate': 4.9796e-05, 'epoch': 0.12}
05/18/2024 12:13:04 - INFO - llamafactory.extras.callbacks - {'loss': 1.2702, 'learning_rate': 4.9795e-05, 'epoch': 0.12}
05/18/2024 12:14:00 - INFO - llamafactory.extras.callbacks - {'loss': 1.2653, 'learning_rate': 4.9794e-05, 'epoch': 0.12}
05/18/2024 12:14:57 - INFO - llamafactory.extras.callbacks - {'loss': 1.2467, 'learning_rate': 4.9792e-05, 'epoch': 0.12}
05/18/2024 12:15:54 - INFO - llamafactory.extras.callbacks - {'loss': 1.2177, 'learning_rate': 4.9791e-05, 'epoch': 0.12}
05/18/2024 12:16:49 - INFO - llamafactory.extras.callbacks - {'loss': 1.1528, 'learning_rate': 4.9790e-05, 'epoch': 0.12}
05/18/2024 12:17:44 - INFO - llamafactory.extras.callbacks - {'loss': 1.1925, 'learning_rate': 4.9788e-05, 'epoch': 0.12}
05/18/2024 12:18:38 - INFO - llamafactory.extras.callbacks - {'loss': 1.1702, 'learning_rate': 4.9787e-05, 'epoch': 0.12}
05/18/2024 12:19:32 - INFO - llamafactory.extras.callbacks - {'loss': 1.2394, 'learning_rate': 4.9785e-05, 'epoch': 0.13}
05/18/2024 12:20:31 - INFO - llamafactory.extras.callbacks - {'loss': 1.2256, 'learning_rate': 4.9784e-05, 'epoch': 0.13}
05/18/2024 12:21:27 - INFO - llamafactory.extras.callbacks - {'loss': 1.2859, 'learning_rate': 4.9783e-05, 'epoch': 0.13}
05/18/2024 12:22:26 - INFO - llamafactory.extras.callbacks - {'loss': 1.2486, 'learning_rate': 4.9781e-05, 'epoch': 0.13}
05/18/2024 12:23:20 - INFO - llamafactory.extras.callbacks - {'loss': 1.1880, 'learning_rate': 4.9780e-05, 'epoch': 0.13}
05/18/2024 12:24:17 - INFO - llamafactory.extras.callbacks - {'loss': 1.1579, 'learning_rate': 4.9779e-05, 'epoch': 0.13}
05/18/2024 12:25:14 - INFO - llamafactory.extras.callbacks - {'loss': 1.2631, 'learning_rate': 4.9777e-05, 'epoch': 0.13}
05/18/2024 12:26:11 - INFO - llamafactory.extras.callbacks - {'loss': 1.1440, 'learning_rate': 4.9776e-05, 'epoch': 0.13}
05/18/2024 12:26:11 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-1600
05/18/2024 12:26:11 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 12:26:11 - INFO - transformers.configuration_utils - Model config MistralConfig {
"architectures": [
"MistralForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 32768,
"model_type": "mistral",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"rms_norm_eps": 1e-05,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.40.2",
"use_cache": true,
"vocab_size": 32000
}
05/18/2024 12:26:11 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-1600/tokenizer_config.json
05/18/2024 12:26:11 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-1600/special_tokens_map.json
05/18/2024 12:27:08 - INFO - llamafactory.extras.callbacks - {'loss': 1.2471, 'learning_rate': 4.9774e-05, 'epoch': 0.13}
05/18/2024 12:28:04 - INFO - llamafactory.extras.callbacks - {'loss': 1.2220, 'learning_rate': 4.9773e-05, 'epoch': 0.13}
05/18/2024 12:29:01 - INFO - llamafactory.extras.callbacks - {'loss': 1.1857, 'learning_rate': 4.9772e-05, 'epoch': 0.13}
05/18/2024 12:29:57 - INFO - llamafactory.extras.callbacks - {'loss': 1.1973, 'learning_rate': 4.9770e-05, 'epoch': 0.13}
05/18/2024 12:30:56 - INFO - llamafactory.extras.callbacks - {'loss': 1.1559, 'learning_rate': 4.9769e-05, 'epoch': 0.13}
05/18/2024 12:31:51 - INFO - llamafactory.extras.callbacks - {'loss': 1.2560, 'learning_rate': 4.9767e-05, 'epoch': 0.13}
05/18/2024 12:32:50 - INFO - llamafactory.extras.callbacks - {'loss': 1.1486, 'learning_rate': 4.9766e-05, 'epoch': 0.13}
05/18/2024 12:33:48 - INFO - llamafactory.extras.callbacks - {'loss': 1.2871, 'learning_rate': 4.9764e-05, 'epoch': 0.13}
05/18/2024 12:34:46 - INFO - llamafactory.extras.callbacks - {'loss': 1.1780, 'learning_rate': 4.9763e-05, 'epoch': 0.13}
05/18/2024 12:35:46 - INFO - llamafactory.extras.callbacks - {'loss': 1.2556, 'learning_rate': 4.9762e-05, 'epoch': 0.13}
05/18/2024 12:36:44 - INFO - llamafactory.extras.callbacks - {'loss': 1.1758, 'learning_rate': 4.9760e-05, 'epoch': 0.13}
05/18/2024 12:37:40 - INFO - llamafactory.extras.callbacks - {'loss': 1.2246, 'learning_rate': 4.9759e-05, 'epoch': 0.13}
05/18/2024 12:38:36 - INFO - llamafactory.extras.callbacks - {'loss': 1.2752, 'learning_rate': 4.9757e-05, 'epoch': 0.13}
05/18/2024 12:39:32 - INFO - llamafactory.extras.callbacks - {'loss': 1.1814, 'learning_rate': 4.9756e-05, 'epoch': 0.13}
05/18/2024 12:40:29 - INFO - llamafactory.extras.callbacks - {'loss': 1.2070, 'learning_rate': 4.9754e-05, 'epoch': 0.13}
05/18/2024 12:41:25 - INFO - llamafactory.extras.callbacks - {'loss': 1.2219, 'learning_rate': 4.9753e-05, 'epoch': 0.13}
05/18/2024 12:42:23 - INFO - llamafactory.extras.callbacks - {'loss': 1.2013, 'learning_rate': 4.9751e-05, 'epoch': 0.13}
05/18/2024 12:43:19 - INFO - llamafactory.extras.callbacks - {'loss': 1.1784, 'learning_rate': 4.9750e-05, 'epoch': 0.14}
05/18/2024 12:44:15 - INFO - llamafactory.extras.callbacks - {'loss': 1.2023, 'learning_rate': 4.9748e-05, 'epoch': 0.14}
05/18/2024 12:45:10 - INFO - llamafactory.extras.callbacks - {'loss': 1.2658, 'learning_rate': 4.9747e-05, 'epoch': 0.14}
05/18/2024 12:45:10 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-1700
05/18/2024 12:45:11 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 12:45:11 - INFO - transformers.configuration_utils - Model config MistralConfig {
"architectures": [
"MistralForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 32768,
"model_type": "mistral",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"rms_norm_eps": 1e-05,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.40.2",
"use_cache": true,
"vocab_size": 32000
}
05/18/2024 12:45:11 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-1700/tokenizer_config.json
05/18/2024 12:45:11 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-1700/special_tokens_map.json
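
The learning rate above moves only in its fourth significant digit from one logged point to the next, which is the shape of an early-phase cosine decay over a very long run. The sketch below just reproduces that shape; the 5e-5 peak and the total step count are assumptions for illustration, not values read from this log:

```python
# Illustrative only: an assumed cosine schedule whose early steps stay close to
# the peak learning rate, matching the slow drift seen in the lines above.
import torch
from transformers import get_cosine_schedule_with_warmup

param = torch.nn.Parameter(torch.zeros(1))
optimizer = torch.optim.AdamW([param], lr=5e-5)                         # assumed peak LR
scheduler = get_cosine_schedule_with_warmup(
    optimizer, num_warmup_steps=0, num_training_steps=37_000)           # assumed total steps

for _ in range(1500):                      # roughly where checkpoint-1500 sits
    optimizer.step()
    scheduler.step()
print(f"lr after 1500 steps: {scheduler.get_last_lr()[0]:.4e}")          # close to the ~4.98e-05 logged above
```
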
05/18/2024 12:46:07 - INFO - llamafactory.extras.callbacks - {'loss': 1.2172, 'learning_rate': 4.9745e-05, 'epoch': 0.14}
05/18/2024 12:47:03 - INFO - llamafactory.extras.callbacks - {'loss': 1.2236, 'learning_rate': 4.9744e-05, 'epoch': 0.14}
05/18/2024 12:48:02 - INFO - llamafactory.extras.callbacks - {'loss': 1.1822, 'learning_rate': 4.9742e-05, 'epoch': 0.14}
05/18/2024 12:48:58 - INFO - llamafactory.extras.callbacks - {'loss': 1.2386, 'learning_rate': 4.9741e-05, 'epoch': 0.14}
05/18/2024 12:49:52 - INFO - llamafactory.extras.callbacks - {'loss': 1.1664, 'learning_rate': 4.9739e-05, 'epoch': 0.14}
05/18/2024 12:50:50 - INFO - llamafactory.extras.callbacks - {'loss': 1.1526, 'learning_rate': 4.9738e-05, 'epoch': 0.14}
05/18/2024 12:51:46 - INFO - llamafactory.extras.callbacks - {'loss': 1.1802, 'learning_rate': 4.9736e-05, 'epoch': 0.14}
05/18/2024 12:52:46 - INFO - llamafactory.extras.callbacks - {'loss': 1.2272, 'learning_rate': 4.9735e-05, 'epoch': 0.14}
05/18/2024 12:53:41 - INFO - llamafactory.extras.callbacks - {'loss': 1.3008, 'learning_rate': 4.9733e-05, 'epoch': 0.14}
05/18/2024 12:54:37 - INFO - llamafactory.extras.callbacks - {'loss': 1.2122, 'learning_rate': 4.9732e-05, 'epoch': 0.14}
05/18/2024 12:55:34 - INFO - llamafactory.extras.callbacks - {'loss': 1.2475, 'learning_rate': 4.9730e-05, 'epoch': 0.14}
05/18/2024 12:56:30 - INFO - llamafactory.extras.callbacks - {'loss': 1.1491, 'learning_rate': 4.9729e-05, 'epoch': 0.14}
05/18/2024 12:57:27 - INFO - llamafactory.extras.callbacks - {'loss': 1.2638, 'learning_rate': 4.9727e-05, 'epoch': 0.14}
05/18/2024 12:58:23 - INFO - llamafactory.extras.callbacks - {'loss': 1.2414, 'learning_rate': 4.9726e-05, 'epoch': 0.14}
05/18/2024 12:59:21 - INFO - llamafactory.extras.callbacks - {'loss': 1.1540, 'learning_rate': 4.9724e-05, 'epoch': 0.14}
05/18/2024 13:00:16 - INFO - llamafactory.extras.callbacks - {'loss': 1.2212, 'learning_rate': 4.9723e-05, 'epoch': 0.14}
05/18/2024 13:01:11 - INFO - llamafactory.extras.callbacks - {'loss': 1.2248, 'learning_rate': 4.9721e-05, 'epoch': 0.14}
05/18/2024 13:02:09 - INFO - llamafactory.extras.callbacks - {'loss': 1.0836, 'learning_rate': 4.9719e-05, 'epoch': 0.14}
05/18/2024 13:03:06 - INFO - llamafactory.extras.callbacks - {'loss': 1.2183, 'learning_rate': 4.9718e-05, 'epoch': 0.14}
05/18/2024 13:04:02 - INFO - llamafactory.extras.callbacks - {'loss': 1.2099, 'learning_rate': 4.9716e-05, 'epoch': 0.14}
05/18/2024 13:04:02 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-1800
05/18/2024 13:04:03 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 13:04:03 - INFO - transformers.configuration_utils - Model config MistralConfig {
"architectures": [
"MistralForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 32768,
"model_type": "mistral",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"rms_norm_eps": 1e-05,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.40.2",
"use_cache": true,
"vocab_size": 32000
}
05/18/2024 13:04:03 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-1800/tokenizer_config.json
05/18/2024 13:04:03 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-1800/special_tokens_map.json
05/18/2024 13:04:59 - INFO - llamafactory.extras.callbacks - {'loss': 1.1769, 'learning_rate': 4.9715e-05, 'epoch': 0.14}
05/18/2024 13:05:55 - INFO - llamafactory.extras.callbacks - {'loss': 1.2173, 'learning_rate': 4.9713e-05, 'epoch': 0.14}
05/18/2024 13:06:49 - INFO - llamafactory.extras.callbacks - {'loss': 1.1878, 'learning_rate': 4.9712e-05, 'epoch': 0.15}
05/18/2024 13:07:46 - INFO - llamafactory.extras.callbacks - {'loss': 1.1759, 'learning_rate': 4.9710e-05, 'epoch': 0.15}
05/18/2024 13:08:43 - INFO - llamafactory.extras.callbacks - {'loss': 1.1912, 'learning_rate': 4.9708e-05, 'epoch': 0.15}
05/18/2024 13:09:42 - INFO - llamafactory.extras.callbacks - {'loss': 1.2538, 'learning_rate': 4.9707e-05, 'epoch': 0.15}
05/18/2024 13:10:37 - INFO - llamafactory.extras.callbacks - {'loss': 1.1707, 'learning_rate': 4.9705e-05, 'epoch': 0.15}
05/18/2024 13:11:34 - INFO - llamafactory.extras.callbacks - {'loss': 1.0796, 'learning_rate': 4.9704e-05, 'epoch': 0.15}
05/18/2024 13:12:31 - INFO - llamafactory.extras.callbacks - {'loss': 1.2042, 'learning_rate': 4.9702e-05, 'epoch': 0.15}
05/18/2024 13:13:27 - INFO - llamafactory.extras.callbacks - {'loss': 1.2424, 'learning_rate': 4.9700e-05, 'epoch': 0.15}
05/18/2024 13:14:24 - INFO - llamafactory.extras.callbacks - {'loss': 1.2442, 'learning_rate': 4.9699e-05, 'epoch': 0.15}
05/18/2024 13:15:21 - INFO - llamafactory.extras.callbacks - {'loss': 1.2286, 'learning_rate': 4.9697e-05, 'epoch': 0.15}
05/18/2024 13:16:16 - INFO - llamafactory.extras.callbacks - {'loss': 1.2409, 'learning_rate': 4.9695e-05, 'epoch': 0.15}
05/18/2024 13:17:11 - INFO - llamafactory.extras.callbacks - {'loss': 1.2491, 'learning_rate': 4.9694e-05, 'epoch': 0.15}
05/18/2024 13:18:08 - INFO - llamafactory.extras.callbacks - {'loss': 1.1515, 'learning_rate': 4.9692e-05, 'epoch': 0.15}
05/18/2024 13:19:02 - INFO - llamafactory.extras.callbacks - {'loss': 1.2829, 'learning_rate': 4.9691e-05, 'epoch': 0.15}
05/18/2024 13:20:00 - INFO - llamafactory.extras.callbacks - {'loss': 1.2083, 'learning_rate': 4.9689e-05, 'epoch': 0.15}
05/18/2024 13:20:59 - INFO - llamafactory.extras.callbacks - {'loss': 1.1898, 'learning_rate': 4.9687e-05, 'epoch': 0.15}
05/18/2024 13:21:55 - INFO - llamafactory.extras.callbacks - {'loss': 1.2359, 'learning_rate': 4.9686e-05, 'epoch': 0.15}
05/18/2024 13:22:52 - INFO - llamafactory.extras.callbacks - {'loss': 1.2036, 'learning_rate': 4.9684e-05, 'epoch': 0.15}
05/18/2024 13:22:52 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-1900
05/18/2024 13:22:53 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 13:22:53 - INFO - transformers.configuration_utils - Model config MistralConfig {
"architectures": [
"MistralForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 32768,
"model_type": "mistral",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"rms_norm_eps": 1e-05,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.40.2",
"use_cache": true,
"vocab_size": 32000
}
05/18/2024 13:22:53 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-1900/tokenizer_config.json
05/18/2024 13:22:53 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-1900/special_tokens_map.json
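
The save timestamps give a rough throughput figure: checkpoint-1800 was written at 13:04:02 and checkpoint-1900 at 13:22:52, so 100 optimizer steps take just under 19 minutes. A back-of-the-envelope projection, where the total step count is an assumption rather than a logged value:

```python
# Throughput and a rough full-run projection from the checkpoint timestamps above.
from datetime import datetime

t_1800 = datetime(2024, 5, 18, 13, 4, 2)
t_1900 = datetime(2024, 5, 18, 13, 22, 52)
sec_per_step = (t_1900 - t_1800).total_seconds() / 100
print(f"~{sec_per_step:.1f} s per optimizer step")                       # ~11.3 s/step

assumed_total_steps = 37_000                                              # assumption, not from the log
print(f"projected full run: ~{assumed_total_steps * sec_per_step / 3600:.0f} h")
```
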
05/18/2024 13:23:50 - INFO - llamafactory.extras.callbacks - {'loss': 1.2044, 'learning_rate': 4.9682e-05, 'epoch': 0.15}
05/18/2024 13:24:47 - INFO - llamafactory.extras.callbacks - {'loss': 1.3014, 'learning_rate': 4.9681e-05, 'epoch': 0.15}
05/18/2024 13:25:45 - INFO - llamafactory.extras.callbacks - {'loss': 1.2249, 'learning_rate': 4.9679e-05, 'epoch': 0.15}
05/18/2024 13:26:41 - INFO - llamafactory.extras.callbacks - {'loss': 1.2519, 'learning_rate': 4.9677e-05, 'epoch': 0.15}
05/18/2024 13:27:38 - INFO - llamafactory.extras.callbacks - {'loss': 1.2515, 'learning_rate': 4.9676e-05, 'epoch': 0.15}
05/18/2024 13:28:38 - INFO - llamafactory.extras.callbacks - {'loss': 1.1648, 'learning_rate': 4.9674e-05, 'epoch': 0.15}
05/18/2024 13:29:36 - INFO - llamafactory.extras.callbacks - {'loss': 1.1668, 'learning_rate': 4.9672e-05, 'epoch': 0.15}
05/18/2024 13:30:31 - INFO - llamafactory.extras.callbacks - {'loss': 1.2718, 'learning_rate': 4.9671e-05, 'epoch': 0.16}
05/18/2024 13:31:28 - INFO - llamafactory.extras.callbacks - {'loss': 1.2150, 'learning_rate': 4.9669e-05, 'epoch': 0.16}
05/18/2024 13:32:24 - INFO - llamafactory.extras.callbacks - {'loss': 1.2576, 'learning_rate': 4.9667e-05, 'epoch': 0.16}
05/18/2024 13:33:22 - INFO - llamafactory.extras.callbacks - {'loss': 1.1778, 'learning_rate': 4.9665e-05, 'epoch': 0.16}
05/18/2024 13:34:16 - INFO - llamafactory.extras.callbacks - {'loss': 1.1827, 'learning_rate': 4.9664e-05, 'epoch': 0.16}
05/18/2024 13:35:14 - INFO - llamafactory.extras.callbacks - {'loss': 1.1600, 'learning_rate': 4.9662e-05, 'epoch': 0.16}
05/18/2024 13:36:12 - INFO - llamafactory.extras.callbacks - {'loss': 1.2164, 'learning_rate': 4.9660e-05, 'epoch': 0.16}
05/18/2024 13:37:11 - INFO - llamafactory.extras.callbacks - {'loss': 1.2729, 'learning_rate': 4.9659e-05, 'epoch': 0.16}
05/18/2024 13:38:08 - INFO - llamafactory.extras.callbacks - {'loss': 1.2141, 'learning_rate': 4.9657e-05, 'epoch': 0.16}
05/18/2024 13:39:06 - INFO - llamafactory.extras.callbacks - {'loss': 1.2568, 'learning_rate': 4.9655e-05, 'epoch': 0.16}
05/18/2024 13:40:04 - INFO - llamafactory.extras.callbacks - {'loss': 1.2167, 'learning_rate': 4.9653e-05, 'epoch': 0.16}
05/18/2024 13:40:59 - INFO - llamafactory.extras.callbacks - {'loss': 1.1780, 'learning_rate': 4.9652e-05, 'epoch': 0.16}
05/18/2024 13:42:00 - INFO - llamafactory.extras.callbacks - {'loss': 1.2442, 'learning_rate': 4.9650e-05, 'epoch': 0.16}
05/18/2024 13:42:00 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-2000
05/18/2024 13:42:00 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 13:42:00 - INFO - transformers.configuration_utils - Model config MistralConfig {
"architectures": [
"MistralForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 32768,
"model_type": "mistral",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"rms_norm_eps": 1e-05,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.40.2",
"use_cache": true,
"vocab_size": 32000
}
05/18/2024 13:42:00 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-2000/tokenizer_config.json
05/18/2024 13:42:00 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-2000/special_tokens_map.json
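
For export or standalone inference, any of these adapter checkpoints can be folded back into the base weights. A hedged sketch using the standard peft merge path; the output directory name is hypothetical:

```python
# Merge the checkpoint-2000 LoRA adapter into the base model and save the result.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "mistralai/Mistral-7B-Instruct-v0.2"
ckpt_dir = "saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-2000"

model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto")
model = PeftModel.from_pretrained(model, ckpt_dir)     # assumes the adapter files live in ckpt_dir
merged = model.merge_and_unload()                      # fold the LoRA deltas into the base weights

merged.save_pretrained("merged-mistral-7b-chat")       # hypothetical export path
AutoTokenizer.from_pretrained(base_id).save_pretrained("merged-mistral-7b-chat")
```
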
05/18/2024 13:42:57 - INFO - llamafactory.extras.callbacks - {'loss': 1.1763, 'learning_rate': 4.9648e-05, 'epoch': 0.16}
05/18/2024 13:43:54 - INFO - llamafactory.extras.callbacks - {'loss': 1.2057, 'learning_rate': 4.9646e-05, 'epoch': 0.16}
05/18/2024 13:44:50 - INFO - llamafactory.extras.callbacks - {'loss': 1.2072, 'learning_rate': 4.9645e-05, 'epoch': 0.16}
05/18/2024 13:45:45 - INFO - llamafactory.extras.callbacks - {'loss': 1.1904, 'learning_rate': 4.9643e-05, 'epoch': 0.16}
05/18/2024 13:46:43 - INFO - llamafactory.extras.callbacks - {'loss': 1.2550, 'learning_rate': 4.9641e-05, 'epoch': 0.16}
05/18/2024 13:47:39 - INFO - llamafactory.extras.callbacks - {'loss': 1.2245, 'learning_rate': 4.9639e-05, 'epoch': 0.16}
05/18/2024 13:48:36 - INFO - llamafactory.extras.callbacks - {'loss': 1.2019, 'learning_rate': 4.9638e-05, 'epoch': 0.16}
05/18/2024 13:49:29 - INFO - llamafactory.extras.callbacks - {'loss': 1.1895, 'learning_rate': 4.9636e-05, 'epoch': 0.16}
05/18/2024 13:50:26 - INFO - llamafactory.extras.callbacks - {'loss': 1.1823, 'learning_rate': 4.9634e-05, 'epoch': 0.16}
05/18/2024 13:51:24 - INFO - llamafactory.extras.callbacks - {'loss': 1.2057, 'learning_rate': 4.9632e-05, 'epoch': 0.16}
05/18/2024 13:52:20 - INFO - llamafactory.extras.callbacks - {'loss': 1.2055, 'learning_rate': 4.9630e-05, 'epoch': 0.16}
05/18/2024 13:53:16 - INFO - llamafactory.extras.callbacks - {'loss': 1.3009, 'learning_rate': 4.9629e-05, 'epoch': 0.16}
05/18/2024 13:54:13 - INFO - llamafactory.extras.callbacks - {'loss': 1.1779, 'learning_rate': 4.9627e-05, 'epoch': 0.17}
05/18/2024 13:55:10 - INFO - llamafactory.extras.callbacks - {'loss': 1.2536, 'learning_rate': 4.9625e-05, 'epoch': 0.17}
05/18/2024 13:56:10 - INFO - llamafactory.extras.callbacks - {'loss': 1.2126, 'learning_rate': 4.9623e-05, 'epoch': 0.17}
05/18/2024 13:57:06 - INFO - llamafactory.extras.callbacks - {'loss': 1.1766, 'learning_rate': 4.9621e-05, 'epoch': 0.17}
05/18/2024 13:57:59 - INFO - llamafactory.extras.callbacks - {'loss': 1.2302, 'learning_rate': 4.9620e-05, 'epoch': 0.17}
05/18/2024 13:58:55 - INFO - llamafactory.extras.callbacks - {'loss': 1.1640, 'learning_rate': 4.9618e-05, 'epoch': 0.17}
05/18/2024 13:59:53 - INFO - llamafactory.extras.callbacks - {'loss': 1.2141, 'learning_rate': 4.9616e-05, 'epoch': 0.17}
05/18/2024 14:00:48 - INFO - llamafactory.extras.callbacks - {'loss': 1.2091, 'learning_rate': 4.9614e-05, 'epoch': 0.17}
05/18/2024 14:00:48 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-2100
05/18/2024 14:00:48 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 14:00:48 - INFO - transformers.configuration_utils - Model config MistralConfig {
"architectures": [
"MistralForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 32768,
"model_type": "mistral",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"rms_norm_eps": 1e-05,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.40.2",
"use_cache": true,
"vocab_size": 32000
}
05/18/2024 14:00:48 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-2100/tokenizer_config.json
05/18/2024 14:00:48 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-2100/special_tokens_map.json
05/18/2024 14:01:45 - INFO - llamafactory.extras.callbacks - {'loss': 1.1867, 'learning_rate': 4.9612e-05, 'epoch': 0.17}
05/18/2024 14:02:43 - INFO - llamafactory.extras.callbacks - {'loss': 1.1937, 'learning_rate': 4.9610e-05, 'epoch': 0.17}
05/18/2024 14:03:40 - INFO - llamafactory.extras.callbacks - {'loss': 1.1849, 'learning_rate': 4.9609e-05, 'epoch': 0.17}
05/18/2024 14:04:38 - INFO - llamafactory.extras.callbacks - {'loss': 1.2358, 'learning_rate': 4.9607e-05, 'epoch': 0.17}
05/18/2024 14:05:33 - INFO - llamafactory.extras.callbacks - {'loss': 1.1965, 'learning_rate': 4.9605e-05, 'epoch': 0.17}
05/18/2024 14:06:30 - INFO - llamafactory.extras.callbacks - {'loss': 1.2007, 'learning_rate': 4.9603e-05, 'epoch': 0.17}
05/18/2024 14:07:28 - INFO - llamafactory.extras.callbacks - {'loss': 1.1738, 'learning_rate': 4.9601e-05, 'epoch': 0.17}
05/18/2024 14:08:22 - INFO - llamafactory.extras.callbacks - {'loss': 1.2474, 'learning_rate': 4.9599e-05, 'epoch': 0.17}
05/18/2024 14:09:16 - INFO - llamafactory.extras.callbacks - {'loss': 1.1534, 'learning_rate': 4.9597e-05, 'epoch': 0.17}
05/18/2024 14:10:14 - INFO - llamafactory.extras.callbacks - {'loss': 1.1943, 'learning_rate': 4.9596e-05, 'epoch': 0.17}
05/18/2024 14:11:10 - INFO - llamafactory.extras.callbacks - {'loss': 1.1962, 'learning_rate': 4.9594e-05, 'epoch': 0.17}
05/18/2024 14:12:08 - INFO - llamafactory.extras.callbacks - {'loss': 1.2023, 'learning_rate': 4.9592e-05, 'epoch': 0.17}
05/18/2024 14:13:05 - INFO - llamafactory.extras.callbacks - {'loss': 1.2129, 'learning_rate': 4.9590e-05, 'epoch': 0.17}
05/18/2024 14:14:01 - INFO - llamafactory.extras.callbacks - {'loss': 1.1964, 'learning_rate': 4.9588e-05, 'epoch': 0.17}
05/18/2024 14:14:59 - INFO - llamafactory.extras.callbacks - {'loss': 1.2270, 'learning_rate': 4.9586e-05, 'epoch': 0.17}
05/18/2024 14:15:56 - INFO - llamafactory.extras.callbacks - {'loss': 1.1662, 'learning_rate': 4.9584e-05, 'epoch': 0.17}
05/18/2024 14:16:51 - INFO - llamafactory.extras.callbacks - {'loss': 1.2102, 'learning_rate': 4.9582e-05, 'epoch': 0.17}
05/18/2024 14:17:47 - INFO - llamafactory.extras.callbacks - {'loss': 1.1915, 'learning_rate': 4.9580e-05, 'epoch': 0.18}
05/18/2024 14:18:42 - INFO - llamafactory.extras.callbacks - {'loss': 1.1737, 'learning_rate': 4.9579e-05, 'epoch': 0.18}
05/18/2024 14:19:39 - INFO - llamafactory.extras.callbacks - {'loss': 1.2165, 'learning_rate': 4.9577e-05, 'epoch': 0.18}
05/18/2024 14:19:39 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-2200
05/18/2024 14:19:40 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 14:19:40 - INFO - transformers.configuration_utils - Model config MistralConfig {
"architectures": [
"MistralForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 32768,
"model_type": "mistral",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"rms_norm_eps": 1e-05,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.40.2",
"use_cache": true,
"vocab_size": 32000
}
05/18/2024 14:19:40 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-2200/tokenizer_config.json
05/18/2024 14:19:40 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-2200/special_tokens_map.json
05/18/2024 14:20:38 - INFO - llamafactory.extras.callbacks - {'loss': 1.2375, 'learning_rate': 4.9575e-05, 'epoch': 0.18}
05/18/2024 14:21:31 - INFO - llamafactory.extras.callbacks - {'loss': 1.2078, 'learning_rate': 4.9573e-05, 'epoch': 0.18}
05/18/2024 14:22:28 - INFO - llamafactory.extras.callbacks - {'loss': 1.1476, 'learning_rate': 4.9571e-05, 'epoch': 0.18}
05/18/2024 14:23:25 - INFO - llamafactory.extras.callbacks - {'loss': 1.2646, 'learning_rate': 4.9569e-05, 'epoch': 0.18}
05/18/2024 14:24:21 - INFO - llamafactory.extras.callbacks - {'loss': 1.1947, 'learning_rate': 4.9567e-05, 'epoch': 0.18}
05/18/2024 14:25:20 - INFO - llamafactory.extras.callbacks - {'loss': 1.1819, 'learning_rate': 4.9565e-05, 'epoch': 0.18}
05/18/2024 14:26:15 - INFO - llamafactory.extras.callbacks - {'loss': 1.2020, 'learning_rate': 4.9563e-05, 'epoch': 0.18}
05/18/2024 14:27:11 - INFO - llamafactory.extras.callbacks - {'loss': 1.2488, 'learning_rate': 4.9561e-05, 'epoch': 0.18}
05/18/2024 14:28:09 - INFO - llamafactory.extras.callbacks - {'loss': 1.1680, 'learning_rate': 4.9559e-05, 'epoch': 0.18}
05/18/2024 14:29:05 - INFO - llamafactory.extras.callbacks - {'loss': 1.2207, 'learning_rate': 4.9557e-05, 'epoch': 0.18}
05/18/2024 14:30:01 - INFO - llamafactory.extras.callbacks - {'loss': 1.2106, 'learning_rate': 4.9555e-05, 'epoch': 0.18}
05/18/2024 14:30:59 - INFO - llamafactory.extras.callbacks - {'loss': 1.1760, 'learning_rate': 4.9553e-05, 'epoch': 0.18}
05/18/2024 14:31:54 - INFO - llamafactory.extras.callbacks - {'loss': 1.1974, 'learning_rate': 4.9551e-05, 'epoch': 0.18}
05/18/2024 14:32:49 - INFO - llamafactory.extras.callbacks - {'loss': 1.2318, 'learning_rate': 4.9549e-05, 'epoch': 0.18}
05/18/2024 14:33:48 - INFO - llamafactory.extras.callbacks - {'loss': 1.1501, 'learning_rate': 4.9547e-05, 'epoch': 0.18}
05/18/2024 14:34:44 - INFO - llamafactory.extras.callbacks - {'loss': 1.1612, 'learning_rate': 4.9545e-05, 'epoch': 0.18}
05/18/2024 14:35:44 - INFO - llamafactory.extras.callbacks - {'loss': 1.2153, 'learning_rate': 4.9543e-05, 'epoch': 0.18}
05/18/2024 14:36:43 - INFO - llamafactory.extras.callbacks - {'loss': 1.2303, 'learning_rate': 4.9541e-05, 'epoch': 0.18}
05/18/2024 14:37:37 - INFO - llamafactory.extras.callbacks - {'loss': 1.1946, 'learning_rate': 4.9539e-05, 'epoch': 0.18}
05/18/2024 14:38:33 - INFO - llamafactory.extras.callbacks - {'loss': 1.1768, 'learning_rate': 4.9537e-05, 'epoch': 0.18}
05/18/2024 14:38:33 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-2300
05/18/2024 14:38:34 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 14:38:34 - INFO - transformers.configuration_utils - Model config MistralConfig {
"architectures": [
"MistralForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 32768,
"model_type": "mistral",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"rms_norm_eps": 1e-05,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.40.2",
"use_cache": true,
"vocab_size": 32000
}
05/18/2024 14:38:34 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-2300/tokenizer_config.json
05/18/2024 14:38:34 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-2300/special_tokens_map.json
05/18/2024 14:39:30 - INFO - llamafactory.extras.callbacks - {'loss': 1.1911, 'learning_rate': 4.9535e-05, 'epoch': 0.18}
05/18/2024 14:40:27 - INFO - llamafactory.extras.callbacks - {'loss': 1.1878, 'learning_rate': 4.9533e-05, 'epoch': 0.18}
05/18/2024 14:41:22 - INFO - llamafactory.extras.callbacks - {'loss': 1.2607, 'learning_rate': 4.9531e-05, 'epoch': 0.19}
05/18/2024 14:42:17 - INFO - llamafactory.extras.callbacks - {'loss': 1.1796, 'learning_rate': 4.9529e-05, 'epoch': 0.19}
05/18/2024 14:43:12 - INFO - llamafactory.extras.callbacks - {'loss': 1.2501, 'learning_rate': 4.9527e-05, 'epoch': 0.19}
05/18/2024 14:44:06 - INFO - llamafactory.extras.callbacks - {'loss': 1.2109, 'learning_rate': 4.9525e-05, 'epoch': 0.19}
05/18/2024 14:45:01 - INFO - llamafactory.extras.callbacks - {'loss': 1.1820, 'learning_rate': 4.9523e-05, 'epoch': 0.19}
05/18/2024 14:45:56 - INFO - llamafactory.extras.callbacks - {'loss': 1.1440, 'learning_rate': 4.9521e-05, 'epoch': 0.19}
05/18/2024 14:46:54 - INFO - llamafactory.extras.callbacks - {'loss': 1.2085, 'learning_rate': 4.9519e-05, 'epoch': 0.19}
05/18/2024 14:47:50 - INFO - llamafactory.extras.callbacks - {'loss': 1.2282, 'learning_rate': 4.9517e-05, 'epoch': 0.19}
05/18/2024 14:48:47 - INFO - llamafactory.extras.callbacks - {'loss': 1.2097, 'learning_rate': 4.9515e-05, 'epoch': 0.19}
05/18/2024 14:49:46 - INFO - llamafactory.extras.callbacks - {'loss': 1.1946, 'learning_rate': 4.9513e-05, 'epoch': 0.19}
05/18/2024 14:50:42 - INFO - llamafactory.extras.callbacks - {'loss': 1.1917, 'learning_rate': 4.9511e-05, 'epoch': 0.19}
05/18/2024 14:51:41 - INFO - llamafactory.extras.callbacks - {'loss': 1.2794, 'learning_rate': 4.9509e-05, 'epoch': 0.19}
05/18/2024 14:52:39 - INFO - llamafactory.extras.callbacks - {'loss': 1.2223, 'learning_rate': 4.9507e-05, 'epoch': 0.19}
05/18/2024 14:53:35 - INFO - llamafactory.extras.callbacks - {'loss': 1.2508, 'learning_rate': 4.9505e-05, 'epoch': 0.19}
05/18/2024 14:54:31 - INFO - llamafactory.extras.callbacks - {'loss': 1.2497, 'learning_rate': 4.9503e-05, 'epoch': 0.19}
05/18/2024 14:55:27 - INFO - llamafactory.extras.callbacks - {'loss': 1.2056, 'learning_rate': 4.9501e-05, 'epoch': 0.19}
05/18/2024 14:56:25 - INFO - llamafactory.extras.callbacks - {'loss': 1.1822, 'learning_rate': 4.9498e-05, 'epoch': 0.19}
05/18/2024 14:57:21 - INFO - llamafactory.extras.callbacks - {'loss': 1.2102, 'learning_rate': 4.9496e-05, 'epoch': 0.19}
05/18/2024 14:57:21 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-2400
05/18/2024 14:57:22 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 14:57:22 - INFO - transformers.configuration_utils - Model config MistralConfig {
"architectures": [
"MistralForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 32768,
"model_type": "mistral",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"rms_norm_eps": 1e-05,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.40.2",
"use_cache": true,
"vocab_size": 32000
}
05/18/2024 14:57:22 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-2400/tokenizer_config.json
05/18/2024 14:57:22 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-2400/special_tokens_map.json
05/18/2024 14:58:18 - INFO - llamafactory.extras.callbacks - {'loss': 1.2044, 'learning_rate': 4.9494e-05, 'epoch': 0.19}
05/18/2024 14:59:16 - INFO - llamafactory.extras.callbacks - {'loss': 1.1794, 'learning_rate': 4.9492e-05, 'epoch': 0.19}
05/18/2024 15:00:14 - INFO - llamafactory.extras.callbacks - {'loss': 1.1904, 'learning_rate': 4.9490e-05, 'epoch': 0.19}
05/18/2024 15:01:11 - INFO - llamafactory.extras.callbacks - {'loss': 1.2205, 'learning_rate': 4.9488e-05, 'epoch': 0.19}
05/18/2024 15:02:07 - INFO - llamafactory.extras.callbacks - {'loss': 1.2324, 'learning_rate': 4.9486e-05, 'epoch': 0.19}
05/18/2024 15:03:03 - INFO - llamafactory.extras.callbacks - {'loss': 1.1700, 'learning_rate': 4.9484e-05, 'epoch': 0.19}
05/18/2024 15:03:59 - INFO - llamafactory.extras.callbacks - {'loss': 1.2305, 'learning_rate': 4.9482e-05, 'epoch': 0.19}
05/18/2024 15:04:55 - INFO - llamafactory.extras.callbacks - {'loss': 1.2448, 'learning_rate': 4.9480e-05, 'epoch': 0.20}
05/18/2024 15:05:50 - INFO - llamafactory.extras.callbacks - {'loss': 1.2214, 'learning_rate': 4.9477e-05, 'epoch': 0.20}
05/18/2024 15:06:45 - INFO - llamafactory.extras.callbacks - {'loss': 1.2026, 'learning_rate': 4.9475e-05, 'epoch': 0.20}
05/18/2024 15:07:42 - INFO - llamafactory.extras.callbacks - {'loss': 1.2314, 'learning_rate': 4.9473e-05, 'epoch': 0.20}
05/18/2024 15:08:41 - INFO - llamafactory.extras.callbacks - {'loss': 1.1738, 'learning_rate': 4.9471e-05, 'epoch': 0.20}
05/18/2024 15:09:38 - INFO - llamafactory.extras.callbacks - {'loss': 1.1960, 'learning_rate': 4.9469e-05, 'epoch': 0.20}
05/18/2024 15:10:35 - INFO - llamafactory.extras.callbacks - {'loss': 1.2136, 'learning_rate': 4.9467e-05, 'epoch': 0.20}
05/18/2024 15:11:32 - INFO - llamafactory.extras.callbacks - {'loss': 1.2426, 'learning_rate': 4.9465e-05, 'epoch': 0.20}
05/18/2024 15:12:29 - INFO - llamafactory.extras.callbacks - {'loss': 1.1381, 'learning_rate': 4.9462e-05, 'epoch': 0.20}
05/18/2024 15:13:26 - INFO - llamafactory.extras.callbacks - {'loss': 1.2316, 'learning_rate': 4.9460e-05, 'epoch': 0.20}
05/18/2024 15:14:23 - INFO - llamafactory.extras.callbacks - {'loss': 1.1682, 'learning_rate': 4.9458e-05, 'epoch': 0.20}
05/18/2024 15:15:18 - INFO - llamafactory.extras.callbacks - {'loss': 1.1923, 'learning_rate': 4.9456e-05, 'epoch': 0.20}
05/18/2024 15:16:15 - INFO - llamafactory.extras.callbacks - {'loss': 1.2123, 'learning_rate': 4.9454e-05, 'epoch': 0.20}
05/18/2024 15:16:15 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-2500
05/18/2024 15:16:17 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 15:16:17 - INFO - transformers.configuration_utils - Model config MistralConfig {
"architectures": [
"MistralForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 32768,
"model_type": "mistral",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"rms_norm_eps": 1e-05,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.40.2",
"use_cache": true,
"vocab_size": 32000
}
05/18/2024 15:16:17 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-2500/tokenizer_config.json
05/18/2024 15:16:17 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-2500/special_tokens_map.json
05/18/2024 15:17:14 - INFO - llamafactory.extras.callbacks - {'loss': 1.1574, 'learning_rate': 4.9452e-05, 'epoch': 0.20}
05/18/2024 15:18:10 - INFO - llamafactory.extras.callbacks - {'loss': 1.2123, 'learning_rate': 4.9449e-05, 'epoch': 0.20}
05/18/2024 15:19:06 - INFO - llamafactory.extras.callbacks - {'loss': 1.1493, 'learning_rate': 4.9447e-05, 'epoch': 0.20}
05/18/2024 15:20:04 - INFO - llamafactory.extras.callbacks - {'loss': 1.1729, 'learning_rate': 4.9445e-05, 'epoch': 0.20}
05/18/2024 15:21:02 - INFO - llamafactory.extras.callbacks - {'loss': 1.2178, 'learning_rate': 4.9443e-05, 'epoch': 0.20}
05/18/2024 15:22:00 - INFO - llamafactory.extras.callbacks - {'loss': 1.2411, 'learning_rate': 4.9441e-05, 'epoch': 0.20}
05/18/2024 15:22:56 - INFO - llamafactory.extras.callbacks - {'loss': 1.2217, 'learning_rate': 4.9438e-05, 'epoch': 0.20}
05/18/2024 15:23:55 - INFO - llamafactory.extras.callbacks - {'loss': 1.1514, 'learning_rate': 4.9436e-05, 'epoch': 0.20}
05/18/2024 15:24:52 - INFO - llamafactory.extras.callbacks - {'loss': 1.2466, 'learning_rate': 4.9434e-05, 'epoch': 0.20}
05/18/2024 15:25:47 - INFO - llamafactory.extras.callbacks - {'loss': 1.2234, 'learning_rate': 4.9432e-05, 'epoch': 0.20}
05/18/2024 15:26:46 - INFO - llamafactory.extras.callbacks - {'loss': 1.1526, 'learning_rate': 4.9429e-05, 'epoch': 0.20}
05/18/2024 15:27:45 - INFO - llamafactory.extras.callbacks - {'loss': 1.2161, 'learning_rate': 4.9427e-05, 'epoch': 0.20}
05/18/2024 15:28:42 - INFO - llamafactory.extras.callbacks - {'loss': 1.2492, 'learning_rate': 4.9425e-05, 'epoch': 0.21}
05/18/2024 15:29:37 - INFO - llamafactory.extras.callbacks - {'loss': 1.1799, 'learning_rate': 4.9423e-05, 'epoch': 0.21}
05/18/2024 15:30:34 - INFO - llamafactory.extras.callbacks - {'loss': 1.2395, 'learning_rate': 4.9421e-05, 'epoch': 0.21}
05/18/2024 15:31:31 - INFO - llamafactory.extras.callbacks - {'loss': 1.1480, 'learning_rate': 4.9418e-05, 'epoch': 0.21}
05/18/2024 15:32:28 - INFO - llamafactory.extras.callbacks - {'loss': 1.1613, 'learning_rate': 4.9416e-05, 'epoch': 0.21}
05/18/2024 15:33:26 - INFO - llamafactory.extras.callbacks - {'loss': 1.3184, 'learning_rate': 4.9414e-05, 'epoch': 0.21}
05/18/2024 15:34:20 - INFO - llamafactory.extras.callbacks - {'loss': 1.2318, 'learning_rate': 4.9412e-05, 'epoch': 0.21}
05/18/2024 15:35:15 - INFO - llamafactory.extras.callbacks - {'loss': 1.1343, 'learning_rate': 4.9409e-05, 'epoch': 0.21}
05/18/2024 15:35:15 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-2600
05/18/2024 15:35:16 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 15:35:16 - INFO - transformers.configuration_utils - Model config MistralConfig {
"architectures": [
"MistralForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 32768,
"model_type": "mistral",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"rms_norm_eps": 1e-05,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.40.2",
"use_cache": true,
"vocab_size": 32000
}
05/18/2024 15:35:16 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-2600/tokenizer_config.json
05/18/2024 15:35:16 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-2600/special_tokens_map.json
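
The raw losses above bounce between roughly 1.08 and 1.32 from one logged point to the next, so the trend reads more easily after smoothing. A minimal trailing moving average over the values parsed earlier:

```python
# Smooth the noisy per-point losses with a short trailing window.
def moving_average(values, window=20):
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

sample = [1.21, 1.18, 1.28, 1.15, 1.22, 1.19]   # stand-in values; use the parsed losses in practice
print(moving_average(sample, window=3))
```
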
05/18/2024 15:36:13 - INFO - llamafactory.extras.callbacks - {'loss': 1.1987, 'learning_rate': 4.9407e-05, 'epoch': 0.21}
05/18/2024 15:37:10 - INFO - llamafactory.extras.callbacks - {'loss': 1.2837, 'learning_rate': 4.9405e-05, 'epoch': 0.21}
05/18/2024 15:38:06 - INFO - llamafactory.extras.callbacks - {'loss': 1.1382, 'learning_rate': 4.9402e-05, 'epoch': 0.21}
05/18/2024 15:39:04 - INFO - llamafactory.extras.callbacks - {'loss': 1.2325, 'learning_rate': 4.9400e-05, 'epoch': 0.21}
05/18/2024 15:40:02 - INFO - llamafactory.extras.callbacks - {'loss': 1.1978, 'learning_rate': 4.9398e-05, 'epoch': 0.21}
05/18/2024 15:41:01 - INFO - llamafactory.extras.callbacks - {'loss': 1.1918, 'learning_rate': 4.9396e-05, 'epoch': 0.21}
05/18/2024 15:42:02 - INFO - llamafactory.extras.callbacks - {'loss': 1.1975, 'learning_rate': 4.9393e-05, 'epoch': 0.21}
05/18/2024 15:43:00 - INFO - llamafactory.extras.callbacks - {'loss': 1.1543, 'learning_rate': 4.9391e-05, 'epoch': 0.21}
05/18/2024 15:43:53 - INFO - llamafactory.extras.callbacks - {'loss': 1.2034, 'learning_rate': 4.9389e-05, 'epoch': 0.21}
05/18/2024 15:44:50 - INFO - llamafactory.extras.callbacks - {'loss': 1.1945, 'learning_rate': 4.9386e-05, 'epoch': 0.21}
05/18/2024 15:45:47 - INFO - llamafactory.extras.callbacks - {'loss': 1.1632, 'learning_rate': 4.9384e-05, 'epoch': 0.21}
05/18/2024 15:46:44 - INFO - llamafactory.extras.callbacks - {'loss': 1.1729, 'learning_rate': 4.9382e-05, 'epoch': 0.21}
05/18/2024 15:47:41 - INFO - llamafactory.extras.callbacks - {'loss': 1.2980, 'learning_rate': 4.9380e-05, 'epoch': 0.21}
05/18/2024 15:48:39 - INFO - llamafactory.extras.callbacks - {'loss': 1.2179, 'learning_rate': 4.9377e-05, 'epoch': 0.21}
05/18/2024 15:49:35 - INFO - llamafactory.extras.callbacks - {'loss': 1.1750, 'learning_rate': 4.9375e-05, 'epoch': 0.21}
05/18/2024 15:50:33 - INFO - llamafactory.extras.callbacks - {'loss': 1.1230, 'learning_rate': 4.9373e-05, 'epoch': 0.21}
05/18/2024 15:51:32 - INFO - llamafactory.extras.callbacks - {'loss': 1.2240, 'learning_rate': 4.9370e-05, 'epoch': 0.21}
05/18/2024 15:52:29 - INFO - llamafactory.extras.callbacks - {'loss': 1.1604, 'learning_rate': 4.9368e-05, 'epoch': 0.22}
05/18/2024 15:53:25 - INFO - llamafactory.extras.callbacks - {'loss': 1.2550, 'learning_rate': 4.9366e-05, 'epoch': 0.22}
05/18/2024 15:54:22 - INFO - llamafactory.extras.callbacks - {'loss': 1.2126, 'learning_rate': 4.9363e-05, 'epoch': 0.22}
05/18/2024 15:54:22 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-2700
05/18/2024 15:54:22 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 15:54:22 - INFO - transformers.configuration_utils - Model config MistralConfig {
"architectures": [
"MistralForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 32768,
"model_type": "mistral",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"rms_norm_eps": 1e-05,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.40.2",
"use_cache": true,
"vocab_size": 32000
}
05/18/2024 15:54:22 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-2700/tokenizer_config.json
05/18/2024 15:54:22 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-2700/special_tokens_map.json
05/18/2024 15:55:21 - INFO - llamafactory.extras.callbacks - {'loss': 1.2172, 'learning_rate': 4.9361e-05, 'epoch': 0.22}
05/18/2024 15:56:19 - INFO - llamafactory.extras.callbacks - {'loss': 1.2352, 'learning_rate': 4.9358e-05, 'epoch': 0.22}
05/18/2024 15:57:17 - INFO - llamafactory.extras.callbacks - {'loss': 1.1805, 'learning_rate': 4.9356e-05, 'epoch': 0.22}
05/18/2024 15:58:12 - INFO - llamafactory.extras.callbacks - {'loss': 1.1901, 'learning_rate': 4.9354e-05, 'epoch': 0.22}
05/18/2024 15:59:09 - INFO - llamafactory.extras.callbacks - {'loss': 1.2369, 'learning_rate': 4.9351e-05, 'epoch': 0.22}
05/18/2024 16:00:07 - INFO - llamafactory.extras.callbacks - {'loss': 1.1617, 'learning_rate': 4.9349e-05, 'epoch': 0.22}
05/18/2024 16:01:07 - INFO - llamafactory.extras.callbacks - {'loss': 1.2486, 'learning_rate': 4.9347e-05, 'epoch': 0.22}
05/18/2024 16:02:05 - INFO - llamafactory.extras.callbacks - {'loss': 1.1869, 'learning_rate': 4.9344e-05, 'epoch': 0.22}
05/18/2024 16:03:00 - INFO - llamafactory.extras.callbacks - {'loss': 1.2266, 'learning_rate': 4.9342e-05, 'epoch': 0.22}
05/18/2024 16:03:59 - INFO - llamafactory.extras.callbacks - {'loss': 1.2230, 'learning_rate': 4.9339e-05, 'epoch': 0.22}
05/18/2024 16:04:58 - INFO - llamafactory.extras.callbacks - {'loss': 1.2411, 'learning_rate': 4.9337e-05, 'epoch': 0.22}
05/18/2024 16:05:54 - INFO - llamafactory.extras.callbacks - {'loss': 1.2363, 'learning_rate': 4.9335e-05, 'epoch': 0.22}
05/18/2024 16:06:51 - INFO - llamafactory.extras.callbacks - {'loss': 1.2627, 'learning_rate': 4.9332e-05, 'epoch': 0.22}
05/18/2024 16:07:50 - INFO - llamafactory.extras.callbacks - {'loss': 1.1886, 'learning_rate': 4.9330e-05, 'epoch': 0.22}
05/18/2024 16:08:47 - INFO - llamafactory.extras.callbacks - {'loss': 1.1865, 'learning_rate': 4.9327e-05, 'epoch': 0.22}
05/18/2024 16:09:45 - INFO - llamafactory.extras.callbacks - {'loss': 1.1956, 'learning_rate': 4.9325e-05, 'epoch': 0.22}
05/18/2024 16:10:38 - INFO - llamafactory.extras.callbacks - {'loss': 1.1883, 'learning_rate': 4.9323e-05, 'epoch': 0.22}
05/18/2024 16:11:37 - INFO - llamafactory.extras.callbacks - {'loss': 1.2066, 'learning_rate': 4.9320e-05, 'epoch': 0.22}
05/18/2024 16:12:36 - INFO - llamafactory.extras.callbacks - {'loss': 1.2162, 'learning_rate': 4.9318e-05, 'epoch': 0.22}
05/18/2024 16:13:32 - INFO - llamafactory.extras.callbacks - {'loss': 1.1993, 'learning_rate': 4.9315e-05, 'epoch': 0.22}
05/18/2024 16:13:32 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-2800
05/18/2024 16:13:33 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 16:13:33 - INFO - transformers.configuration_utils - Model config MistralConfig {
"architectures": [
"MistralForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 32768,
"model_type": "mistral",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"rms_norm_eps": 1e-05,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.40.2",
"use_cache": true,
"vocab_size": 32000
}
05/18/2024 16:13:33 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-2800/tokenizer_config.json
05/18/2024 16:13:33 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-2800/special_tokens_map.json
05/18/2024 16:14:33 - INFO - llamafactory.extras.callbacks - {'loss': 1.2517, 'learning_rate': 4.9313e-05, 'epoch': 0.22}
05/18/2024 16:15:31 - INFO - llamafactory.extras.callbacks - {'loss': 1.2182, 'learning_rate': 4.9310e-05, 'epoch': 0.22}
05/18/2024 16:16:26 - INFO - llamafactory.extras.callbacks - {'loss': 1.1595, 'learning_rate': 4.9308e-05, 'epoch': 0.23}
05/18/2024 16:17:22 - INFO - llamafactory.extras.callbacks - {'loss': 1.2714, 'learning_rate': 4.9306e-05, 'epoch': 0.23}
05/18/2024 16:18:18 - INFO - llamafactory.extras.callbacks - {'loss': 1.1958, 'learning_rate': 4.9303e-05, 'epoch': 0.23}
05/18/2024 16:19:17 - INFO - llamafactory.extras.callbacks - {'loss': 1.2086, 'learning_rate': 4.9301e-05, 'epoch': 0.23}
05/18/2024 16:20:13 - INFO - llamafactory.extras.callbacks - {'loss': 1.2069, 'learning_rate': 4.9298e-05, 'epoch': 0.23}
05/18/2024 16:21:11 - INFO - llamafactory.extras.callbacks - {'loss': 1.1623, 'learning_rate': 4.9296e-05, 'epoch': 0.23}
05/18/2024 16:22:07 - INFO - llamafactory.extras.callbacks - {'loss': 1.1087, 'learning_rate': 4.9294e-05, 'epoch': 0.23}
05/18/2024 16:23:06 - INFO - llamafactory.extras.callbacks - {'loss': 1.1652, 'learning_rate': 4.9291e-05, 'epoch': 0.23}
05/18/2024 16:24:04 - INFO - llamafactory.extras.callbacks - {'loss': 1.1668, 'learning_rate': 4.9289e-05, 'epoch': 0.23}
05/18/2024 16:25:00 - INFO - llamafactory.extras.callbacks - {'loss': 1.2072, 'learning_rate': 4.9286e-05, 'epoch': 0.23}
05/18/2024 16:25:57 - INFO - llamafactory.extras.callbacks - {'loss': 1.2250, 'learning_rate': 4.9284e-05, 'epoch': 0.23}
05/18/2024 16:26:54 - INFO - llamafactory.extras.callbacks - {'loss': 1.2435, 'learning_rate': 4.9281e-05, 'epoch': 0.23}
05/18/2024 16:27:51 - INFO - llamafactory.extras.callbacks - {'loss': 1.1857, 'learning_rate': 4.9279e-05, 'epoch': 0.23}
05/18/2024 16:28:48 - INFO - llamafactory.extras.callbacks - {'loss': 1.2055, 'learning_rate': 4.9276e-05, 'epoch': 0.23}
05/18/2024 16:29:46 - INFO - llamafactory.extras.callbacks - {'loss': 1.2028, 'learning_rate': 4.9274e-05, 'epoch': 0.23}
05/18/2024 16:30:46 - INFO - llamafactory.extras.callbacks - {'loss': 1.1848, 'learning_rate': 4.9271e-05, 'epoch': 0.23}
05/18/2024 16:31:42 - INFO - llamafactory.extras.callbacks - {'loss': 1.1649, 'learning_rate': 4.9269e-05, 'epoch': 0.23}
05/18/2024 16:32:38 - INFO - llamafactory.extras.callbacks - {'loss': 1.1629, 'learning_rate': 4.9266e-05, 'epoch': 0.23}
05/18/2024 16:32:38 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-2900
05/18/2024 16:32:39 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 16:32:39 - INFO - transformers.configuration_utils - Model config MistralConfig {
"architectures": [
"MistralForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 32768,
"model_type": "mistral",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"rms_norm_eps": 1e-05,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.40.2",
"use_cache": true,
"vocab_size": 32000
}
05/18/2024 16:32:39 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-2900/tokenizer_config.json
05/18/2024 16:32:39 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-2900/special_tokens_map.json
05/18/2024 16:33:34 - INFO - llamafactory.extras.callbacks - {'loss': 1.2520, 'learning_rate': 4.9264e-05, 'epoch': 0.23}
05/18/2024 16:34:33 - INFO - llamafactory.extras.callbacks - {'loss': 1.2134, 'learning_rate': 4.9261e-05, 'epoch': 0.23}
05/18/2024 16:35:30 - INFO - llamafactory.extras.callbacks - {'loss': 1.2394, 'learning_rate': 4.9259e-05, 'epoch': 0.23}
05/18/2024 16:36:26 - INFO - llamafactory.extras.callbacks - {'loss': 1.2327, 'learning_rate': 4.9256e-05, 'epoch': 0.23}
05/18/2024 16:37:24 - INFO - llamafactory.extras.callbacks - {'loss': 1.2464, 'learning_rate': 4.9254e-05, 'epoch': 0.23}
05/18/2024 16:38:21 - INFO - llamafactory.extras.callbacks - {'loss': 1.2520, 'learning_rate': 4.9251e-05, 'epoch': 0.23}
05/18/2024 16:39:19 - INFO - llamafactory.extras.callbacks - {'loss': 1.1158, 'learning_rate': 4.9249e-05, 'epoch': 0.23}
05/18/2024 16:40:16 - INFO - llamafactory.extras.callbacks - {'loss': 1.1969, 'learning_rate': 4.9246e-05, 'epoch': 0.24}
05/18/2024 16:41:12 - INFO - llamafactory.extras.callbacks - {'loss': 1.1670, 'learning_rate': 4.9243e-05, 'epoch': 0.24}
05/18/2024 16:42:09 - INFO - llamafactory.extras.callbacks - {'loss': 1.1764, 'learning_rate': 4.9241e-05, 'epoch': 0.24}
05/18/2024 16:43:06 - INFO - llamafactory.extras.callbacks - {'loss': 1.1961, 'learning_rate': 4.9238e-05, 'epoch': 0.24}
05/18/2024 16:44:03 - INFO - llamafactory.extras.callbacks - {'loss': 1.1780, 'learning_rate': 4.9236e-05, 'epoch': 0.24}
05/18/2024 16:45:00 - INFO - llamafactory.extras.callbacks - {'loss': 1.1947, 'learning_rate': 4.9233e-05, 'epoch': 0.24}
05/18/2024 16:45:57 - INFO - llamafactory.extras.callbacks - {'loss': 1.1895, 'learning_rate': 4.9231e-05, 'epoch': 0.24}
05/18/2024 16:46:53 - INFO - llamafactory.extras.callbacks - {'loss': 1.1722, 'learning_rate': 4.9228e-05, 'epoch': 0.24}
05/18/2024 16:47:50 - INFO - llamafactory.extras.callbacks - {'loss': 1.1530, 'learning_rate': 4.9225e-05, 'epoch': 0.24}
05/18/2024 16:48:48 - INFO - llamafactory.extras.callbacks - {'loss': 1.2221, 'learning_rate': 4.9223e-05, 'epoch': 0.24}
05/18/2024 16:49:44 - INFO - llamafactory.extras.callbacks - {'loss': 1.1816, 'learning_rate': 4.9220e-05, 'epoch': 0.24}
05/18/2024 16:50:42 - INFO - llamafactory.extras.callbacks - {'loss': 1.2118, 'learning_rate': 4.9218e-05, 'epoch': 0.24}
05/18/2024 16:51:36 - INFO - llamafactory.extras.callbacks - {'loss': 1.1636, 'learning_rate': 4.9215e-05, 'epoch': 0.24}
05/18/2024 16:51:36 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-3000
05/18/2024 16:51:37 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 16:51:37 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-3000/tokenizer_config.json
05/18/2024 16:51:37 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-3000/special_tokens_map.json
05/18/2024 16:52:32 - INFO - llamafactory.extras.callbacks - {'loss': 1.1963, 'learning_rate': 4.9212e-05, 'epoch': 0.24}
05/18/2024 16:53:31 - INFO - llamafactory.extras.callbacks - {'loss': 1.1835, 'learning_rate': 4.9210e-05, 'epoch': 0.24}
05/18/2024 16:54:27 - INFO - llamafactory.extras.callbacks - {'loss': 1.2169, 'learning_rate': 4.9207e-05, 'epoch': 0.24}
05/18/2024 16:55:24 - INFO - llamafactory.extras.callbacks - {'loss': 1.1700, 'learning_rate': 4.9205e-05, 'epoch': 0.24}
05/18/2024 16:56:19 - INFO - llamafactory.extras.callbacks - {'loss': 1.2224, 'learning_rate': 4.9202e-05, 'epoch': 0.24}
05/18/2024 16:57:15 - INFO - llamafactory.extras.callbacks - {'loss': 1.2115, 'learning_rate': 4.9199e-05, 'epoch': 0.24}
05/18/2024 16:58:11 - INFO - llamafactory.extras.callbacks - {'loss': 1.2174, 'learning_rate': 4.9197e-05, 'epoch': 0.24}
05/18/2024 16:59:09 - INFO - llamafactory.extras.callbacks - {'loss': 1.1873, 'learning_rate': 4.9194e-05, 'epoch': 0.24}
05/18/2024 17:00:06 - INFO - llamafactory.extras.callbacks - {'loss': 1.2807, 'learning_rate': 4.9191e-05, 'epoch': 0.24}
05/18/2024 17:01:05 - INFO - llamafactory.extras.callbacks - {'loss': 1.1782, 'learning_rate': 4.9189e-05, 'epoch': 0.24}
05/18/2024 17:02:01 - INFO - llamafactory.extras.callbacks - {'loss': 1.1536, 'learning_rate': 4.9186e-05, 'epoch': 0.24}
05/18/2024 17:02:57 - INFO - llamafactory.extras.callbacks - {'loss': 1.2281, 'learning_rate': 4.9184e-05, 'epoch': 0.24}
05/18/2024 17:03:55 - INFO - llamafactory.extras.callbacks - {'loss': 1.2087, 'learning_rate': 4.9181e-05, 'epoch': 0.25}
05/18/2024 17:04:54 - INFO - llamafactory.extras.callbacks - {'loss': 1.1934, 'learning_rate': 4.9178e-05, 'epoch': 0.25}
05/18/2024 17:05:52 - INFO - llamafactory.extras.callbacks - {'loss': 1.2278, 'learning_rate': 4.9176e-05, 'epoch': 0.25}
05/18/2024 17:06:51 - INFO - llamafactory.extras.callbacks - {'loss': 1.1466, 'learning_rate': 4.9173e-05, 'epoch': 0.25}
05/18/2024 17:07:47 - INFO - llamafactory.extras.callbacks - {'loss': 1.1745, 'learning_rate': 4.9170e-05, 'epoch': 0.25}
05/18/2024 17:08:46 - INFO - llamafactory.extras.callbacks - {'loss': 1.1146, 'learning_rate': 4.9168e-05, 'epoch': 0.25}
05/18/2024 17:09:45 - INFO - llamafactory.extras.callbacks - {'loss': 1.2117, 'learning_rate': 4.9165e-05, 'epoch': 0.25}
05/18/2024 17:10:42 - INFO - llamafactory.extras.callbacks - {'loss': 1.1319, 'learning_rate': 4.9162e-05, 'epoch': 0.25}
05/18/2024 17:10:42 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-3100
05/18/2024 17:10:43 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 17:10:43 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-3100/tokenizer_config.json
05/18/2024 17:10:43 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-3100/special_tokens_map.json
05/18/2024 17:11:40 - INFO - llamafactory.extras.callbacks - {'loss': 1.1782, 'learning_rate': 4.9159e-05, 'epoch': 0.25}
05/18/2024 17:12:37 - INFO - llamafactory.extras.callbacks - {'loss': 1.2008, 'learning_rate': 4.9157e-05, 'epoch': 0.25}
05/18/2024 17:13:33 - INFO - llamafactory.extras.callbacks - {'loss': 1.2382, 'learning_rate': 4.9154e-05, 'epoch': 0.25}
05/18/2024 17:14:28 - INFO - llamafactory.extras.callbacks - {'loss': 1.1880, 'learning_rate': 4.9151e-05, 'epoch': 0.25}
05/18/2024 17:15:25 - INFO - llamafactory.extras.callbacks - {'loss': 1.0933, 'learning_rate': 4.9149e-05, 'epoch': 0.25}
05/18/2024 17:16:22 - INFO - llamafactory.extras.callbacks - {'loss': 1.1729, 'learning_rate': 4.9146e-05, 'epoch': 0.25}
05/18/2024 17:17:20 - INFO - llamafactory.extras.callbacks - {'loss': 1.1639, 'learning_rate': 4.9143e-05, 'epoch': 0.25}
05/18/2024 17:18:19 - INFO - llamafactory.extras.callbacks - {'loss': 1.1930, 'learning_rate': 4.9141e-05, 'epoch': 0.25}
05/18/2024 17:19:16 - INFO - llamafactory.extras.callbacks - {'loss': 1.2075, 'learning_rate': 4.9138e-05, 'epoch': 0.25}
05/18/2024 17:20:10 - INFO - llamafactory.extras.callbacks - {'loss': 1.2225, 'learning_rate': 4.9135e-05, 'epoch': 0.25}
05/18/2024 17:21:05 - INFO - llamafactory.extras.callbacks - {'loss': 1.1788, 'learning_rate': 4.9132e-05, 'epoch': 0.25}
05/18/2024 17:22:00 - INFO - llamafactory.extras.callbacks - {'loss': 1.2139, 'learning_rate': 4.9130e-05, 'epoch': 0.25}
05/18/2024 17:22:54 - INFO - llamafactory.extras.callbacks - {'loss': 1.1624, 'learning_rate': 4.9127e-05, 'epoch': 0.25}
05/18/2024 17:23:49 - INFO - llamafactory.extras.callbacks - {'loss': 1.2376, 'learning_rate': 4.9124e-05, 'epoch': 0.25}
05/18/2024 17:24:46 - INFO - llamafactory.extras.callbacks - {'loss': 1.1688, 'learning_rate': 4.9121e-05, 'epoch': 0.25}
05/18/2024 17:25:41 - INFO - llamafactory.extras.callbacks - {'loss': 1.1620, 'learning_rate': 4.9119e-05, 'epoch': 0.25}
05/18/2024 17:26:39 - INFO - llamafactory.extras.callbacks - {'loss': 1.1569, 'learning_rate': 4.9116e-05, 'epoch': 0.25}
05/18/2024 17:27:34 - INFO - llamafactory.extras.callbacks - {'loss': 1.2192, 'learning_rate': 4.9113e-05, 'epoch': 0.26}
05/18/2024 17:28:29 - INFO - llamafactory.extras.callbacks - {'loss': 1.2065, 'learning_rate': 4.9110e-05, 'epoch': 0.26}
05/18/2024 17:29:26 - INFO - llamafactory.extras.callbacks - {'loss': 1.1822, 'learning_rate': 4.9108e-05, 'epoch': 0.26}
05/18/2024 17:29:26 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-3200
05/18/2024 17:29:28 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 17:29:28 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-3200/tokenizer_config.json
05/18/2024 17:29:28 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-3200/special_tokens_map.json
05/18/2024 17:30:24 - INFO - llamafactory.extras.callbacks - {'loss': 1.2008, 'learning_rate': 4.9105e-05, 'epoch': 0.26}
05/18/2024 17:31:19 - INFO - llamafactory.extras.callbacks - {'loss': 1.2367, 'learning_rate': 4.9102e-05, 'epoch': 0.26}
05/18/2024 17:32:17 - INFO - llamafactory.extras.callbacks - {'loss': 1.0914, 'learning_rate': 4.9099e-05, 'epoch': 0.26}
05/18/2024 17:33:16 - INFO - llamafactory.extras.callbacks - {'loss': 1.1597, 'learning_rate': 4.9096e-05, 'epoch': 0.26}
05/18/2024 17:34:11 - INFO - llamafactory.extras.callbacks - {'loss': 1.1979, 'learning_rate': 4.9094e-05, 'epoch': 0.26}
05/18/2024 17:35:07 - INFO - llamafactory.extras.callbacks - {'loss': 1.1652, 'learning_rate': 4.9091e-05, 'epoch': 0.26}
05/18/2024 17:36:04 - INFO - llamafactory.extras.callbacks - {'loss': 1.1712, 'learning_rate': 4.9088e-05, 'epoch': 0.26}
05/18/2024 17:37:00 - INFO - llamafactory.extras.callbacks - {'loss': 1.1496, 'learning_rate': 4.9085e-05, 'epoch': 0.26}
05/18/2024 17:37:55 - INFO - llamafactory.extras.callbacks - {'loss': 1.1812, 'learning_rate': 4.9082e-05, 'epoch': 0.26}
05/18/2024 17:38:52 - INFO - llamafactory.extras.callbacks - {'loss': 1.1809, 'learning_rate': 4.9080e-05, 'epoch': 0.26}
05/18/2024 17:39:49 - INFO - llamafactory.extras.callbacks - {'loss': 1.1683, 'learning_rate': 4.9077e-05, 'epoch': 0.26}
05/18/2024 17:40:44 - INFO - llamafactory.extras.callbacks - {'loss': 1.2156, 'learning_rate': 4.9074e-05, 'epoch': 0.26}
05/18/2024 17:41:42 - INFO - llamafactory.extras.callbacks - {'loss': 1.1812, 'learning_rate': 4.9071e-05, 'epoch': 0.26}
05/18/2024 17:42:39 - INFO - llamafactory.extras.callbacks - {'loss': 1.1727, 'learning_rate': 4.9068e-05, 'epoch': 0.26}
05/18/2024 17:43:36 - INFO - llamafactory.extras.callbacks - {'loss': 1.2014, 'learning_rate': 4.9065e-05, 'epoch': 0.26}
05/18/2024 17:44:33 - INFO - llamafactory.extras.callbacks - {'loss': 1.1225, 'learning_rate': 4.9063e-05, 'epoch': 0.26}
05/18/2024 17:45:28 - INFO - llamafactory.extras.callbacks - {'loss': 1.1486, 'learning_rate': 4.9060e-05, 'epoch': 0.26}
05/18/2024 17:46:26 - INFO - llamafactory.extras.callbacks - {'loss': 1.2246, 'learning_rate': 4.9057e-05, 'epoch': 0.26}
05/18/2024 17:47:21 - INFO - llamafactory.extras.callbacks - {'loss': 1.2018, 'learning_rate': 4.9054e-05, 'epoch': 0.26}
05/18/2024 17:48:16 - INFO - llamafactory.extras.callbacks - {'loss': 1.1864, 'learning_rate': 4.9051e-05, 'epoch': 0.26}
05/18/2024 17:48:16 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-3300
05/18/2024 17:48:17 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 17:48:17 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-3300/tokenizer_config.json
05/18/2024 17:48:17 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-3300/special_tokens_map.json
05/18/2024 17:49:13 - INFO - llamafactory.extras.callbacks - {'loss': 1.1622, 'learning_rate': 4.9048e-05, 'epoch': 0.26}
05/18/2024 17:50:11 - INFO - llamafactory.extras.callbacks - {'loss': 1.2360, 'learning_rate': 4.9046e-05, 'epoch': 0.26}
05/18/2024 17:51:09 - INFO - llamafactory.extras.callbacks - {'loss': 1.1616, 'learning_rate': 4.9043e-05, 'epoch': 0.27}
05/18/2024 17:52:05 - INFO - llamafactory.extras.callbacks - {'loss': 1.1271, 'learning_rate': 4.9040e-05, 'epoch': 0.27}
05/18/2024 17:53:00 - INFO - llamafactory.extras.callbacks - {'loss': 1.2407, 'learning_rate': 4.9037e-05, 'epoch': 0.27}
05/18/2024 17:53:58 - INFO - llamafactory.extras.callbacks - {'loss': 1.2192, 'learning_rate': 4.9034e-05, 'epoch': 0.27}
05/18/2024 17:54:56 - INFO - llamafactory.extras.callbacks - {'loss': 1.1928, 'learning_rate': 4.9031e-05, 'epoch': 0.27}
05/18/2024 17:55:53 - INFO - llamafactory.extras.callbacks - {'loss': 1.1550, 'learning_rate': 4.9028e-05, 'epoch': 0.27}
05/18/2024 17:56:51 - INFO - llamafactory.extras.callbacks - {'loss': 1.2264, 'learning_rate': 4.9025e-05, 'epoch': 0.27}
05/18/2024 17:57:49 - INFO - llamafactory.extras.callbacks - {'loss': 1.2153, 'learning_rate': 4.9022e-05, 'epoch': 0.27}
05/18/2024 17:58:48 - INFO - llamafactory.extras.callbacks - {'loss': 1.1692, 'learning_rate': 4.9020e-05, 'epoch': 0.27}
05/18/2024 17:59:46 - INFO - llamafactory.extras.callbacks - {'loss': 1.2065, 'learning_rate': 4.9017e-05, 'epoch': 0.27}
05/18/2024 18:00:41 - INFO - llamafactory.extras.callbacks - {'loss': 1.1476, 'learning_rate': 4.9014e-05, 'epoch': 0.27}
05/18/2024 18:01:40 - INFO - llamafactory.extras.callbacks - {'loss': 1.1985, 'learning_rate': 4.9011e-05, 'epoch': 0.27}
05/18/2024 18:02:37 - INFO - llamafactory.extras.callbacks - {'loss': 1.1644, 'learning_rate': 4.9008e-05, 'epoch': 0.27}
05/18/2024 18:03:36 - INFO - llamafactory.extras.callbacks - {'loss': 1.1758, 'learning_rate': 4.9005e-05, 'epoch': 0.27}
05/18/2024 18:04:34 - INFO - llamafactory.extras.callbacks - {'loss': 1.2129, 'learning_rate': 4.9002e-05, 'epoch': 0.27}
05/18/2024 18:05:32 - INFO - llamafactory.extras.callbacks - {'loss': 1.2125, 'learning_rate': 4.8999e-05, 'epoch': 0.27}
05/18/2024 18:06:27 - INFO - llamafactory.extras.callbacks - {'loss': 1.2289, 'learning_rate': 4.8996e-05, 'epoch': 0.27}
05/18/2024 18:07:21 - INFO - llamafactory.extras.callbacks - {'loss': 1.1816, 'learning_rate': 4.8993e-05, 'epoch': 0.27}
05/18/2024 18:07:21 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-3400
05/18/2024 18:07:22 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 18:07:22 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-3400/tokenizer_config.json
05/18/2024 18:07:22 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-3400/special_tokens_map.json
05/18/2024 18:08:18 - INFO - llamafactory.extras.callbacks - {'loss': 1.1429, 'learning_rate': 4.8990e-05, 'epoch': 0.27}
05/18/2024 18:09:15 - INFO - llamafactory.extras.callbacks - {'loss': 1.1942, 'learning_rate': 4.8987e-05, 'epoch': 0.27}
05/18/2024 18:10:11 - INFO - llamafactory.extras.callbacks - {'loss': 1.1463, 'learning_rate': 4.8984e-05, 'epoch': 0.27}
05/18/2024 18:11:11 - INFO - llamafactory.extras.callbacks - {'loss': 1.1314, 'learning_rate': 4.8981e-05, 'epoch': 0.27}
05/18/2024 18:12:06 - INFO - llamafactory.extras.callbacks - {'loss': 1.1952, 'learning_rate': 4.8979e-05, 'epoch': 0.27}
05/18/2024 18:13:03 - INFO - llamafactory.extras.callbacks - {'loss': 1.1891, 'learning_rate': 4.8976e-05, 'epoch': 0.27}
05/18/2024 18:14:00 - INFO - llamafactory.extras.callbacks - {'loss': 1.1931, 'learning_rate': 4.8973e-05, 'epoch': 0.27}
05/18/2024 18:14:58 - INFO - llamafactory.extras.callbacks - {'loss': 1.1777, 'learning_rate': 4.8970e-05, 'epoch': 0.28}
05/18/2024 18:15:55 - INFO - llamafactory.extras.callbacks - {'loss': 1.1927, 'learning_rate': 4.8967e-05, 'epoch': 0.28}
05/18/2024 18:16:50 - INFO - llamafactory.extras.callbacks - {'loss': 1.1411, 'learning_rate': 4.8964e-05, 'epoch': 0.28}
05/18/2024 18:17:46 - INFO - llamafactory.extras.callbacks - {'loss': 1.2414, 'learning_rate': 4.8961e-05, 'epoch': 0.28}
05/18/2024 18:18:44 - INFO - llamafactory.extras.callbacks - {'loss': 1.1892, 'learning_rate': 4.8958e-05, 'epoch': 0.28}
05/18/2024 18:19:44 - INFO - llamafactory.extras.callbacks - {'loss': 1.1721, 'learning_rate': 4.8955e-05, 'epoch': 0.28}
05/18/2024 18:20:41 - INFO - llamafactory.extras.callbacks - {'loss': 1.1903, 'learning_rate': 4.8952e-05, 'epoch': 0.28}
05/18/2024 18:21:38 - INFO - llamafactory.extras.callbacks - {'loss': 1.2405, 'learning_rate': 4.8949e-05, 'epoch': 0.28}
05/18/2024 18:22:36 - INFO - llamafactory.extras.callbacks - {'loss': 1.1592, 'learning_rate': 4.8946e-05, 'epoch': 0.28}
05/18/2024 18:23:30 - INFO - llamafactory.extras.callbacks - {'loss': 1.1888, 'learning_rate': 4.8943e-05, 'epoch': 0.28}
05/18/2024 18:24:27 - INFO - llamafactory.extras.callbacks - {'loss': 1.1829, 'learning_rate': 4.8940e-05, 'epoch': 0.28}
05/18/2024 18:25:26 - INFO - llamafactory.extras.callbacks - {'loss': 1.1498, 'learning_rate': 4.8937e-05, 'epoch': 0.28}
05/18/2024 18:26:22 - INFO - llamafactory.extras.callbacks - {'loss': 1.2141, 'learning_rate': 4.8934e-05, 'epoch': 0.28}
05/18/2024 18:26:22 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-3500
05/18/2024 18:26:23 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 18:26:23 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-3500/tokenizer_config.json
05/18/2024 18:26:23 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-3500/special_tokens_map.json
05/18/2024 18:27:18 - INFO - llamafactory.extras.callbacks - {'loss': 1.1658, 'learning_rate': 4.8931e-05, 'epoch': 0.28}
05/18/2024 18:28:14 - INFO - llamafactory.extras.callbacks - {'loss': 1.2115, 'learning_rate': 4.8928e-05, 'epoch': 0.28}
05/18/2024 18:29:12 - INFO - llamafactory.extras.callbacks - {'loss': 1.1523, 'learning_rate': 4.8924e-05, 'epoch': 0.28}
05/18/2024 18:30:09 - INFO - llamafactory.extras.callbacks - {'loss': 1.1845, 'learning_rate': 4.8921e-05, 'epoch': 0.28}
05/18/2024 18:31:08 - INFO - llamafactory.extras.callbacks - {'loss': 1.2047, 'learning_rate': 4.8918e-05, 'epoch': 0.28}
05/18/2024 18:32:07 - INFO - llamafactory.extras.callbacks - {'loss': 1.1957, 'learning_rate': 4.8915e-05, 'epoch': 0.28}
05/18/2024 18:33:04 - INFO - llamafactory.extras.callbacks - {'loss': 1.1987, 'learning_rate': 4.8912e-05, 'epoch': 0.28}
05/18/2024 18:34:01 - INFO - llamafactory.extras.callbacks - {'loss': 1.2090, 'learning_rate': 4.8909e-05, 'epoch': 0.28}
05/18/2024 18:34:57 - INFO - llamafactory.extras.callbacks - {'loss': 1.1535, 'learning_rate': 4.8906e-05, 'epoch': 0.28}
05/18/2024 18:35:53 - INFO - llamafactory.extras.callbacks - {'loss': 1.1641, 'learning_rate': 4.8903e-05, 'epoch': 0.28}
05/18/2024 18:36:52 - INFO - llamafactory.extras.callbacks - {'loss': 1.1824, 'learning_rate': 4.8900e-05, 'epoch': 0.28}
05/18/2024 18:37:46 - INFO - llamafactory.extras.callbacks - {'loss': 1.1722, 'learning_rate': 4.8897e-05, 'epoch': 0.28}
05/18/2024 18:38:43 - INFO - llamafactory.extras.callbacks - {'loss': 1.1279, 'learning_rate': 4.8894e-05, 'epoch': 0.29}
05/18/2024 18:39:41 - INFO - llamafactory.extras.callbacks - {'loss': 1.2244, 'learning_rate': 4.8891e-05, 'epoch': 0.29}
05/18/2024 18:40:37 - INFO - llamafactory.extras.callbacks - {'loss': 1.1526, 'learning_rate': 4.8888e-05, 'epoch': 0.29}
05/18/2024 18:41:32 - INFO - llamafactory.extras.callbacks - {'loss': 1.1694, 'learning_rate': 4.8885e-05, 'epoch': 0.29}
05/18/2024 18:42:30 - INFO - llamafactory.extras.callbacks - {'loss': 1.2442, 'learning_rate': 4.8882e-05, 'epoch': 0.29}
05/18/2024 18:43:28 - INFO - llamafactory.extras.callbacks - {'loss': 1.1758, 'learning_rate': 4.8878e-05, 'epoch': 0.29}
05/18/2024 18:44:25 - INFO - llamafactory.extras.callbacks - {'loss': 1.1729, 'learning_rate': 4.8875e-05, 'epoch': 0.29}
05/18/2024 18:45:22 - INFO - llamafactory.extras.callbacks - {'loss': 1.1426, 'learning_rate': 4.8872e-05, 'epoch': 0.29}
05/18/2024 18:45:22 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-3600
05/18/2024 18:45:23 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 18:45:23 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-3600/tokenizer_config.json
05/18/2024 18:45:23 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-3600/special_tokens_map.json
05/18/2024 18:46:18 - INFO - llamafactory.extras.callbacks - {'loss': 1.2046, 'learning_rate': 4.8869e-05, 'epoch': 0.29}
05/18/2024 18:47:11 - INFO - llamafactory.extras.callbacks - {'loss': 1.2523, 'learning_rate': 4.8866e-05, 'epoch': 0.29}
05/18/2024 18:48:09 - INFO - llamafactory.extras.callbacks - {'loss': 1.1938, 'learning_rate': 4.8863e-05, 'epoch': 0.29}
05/18/2024 18:49:06 - INFO - llamafactory.extras.callbacks - {'loss': 1.1753, 'learning_rate': 4.8860e-05, 'epoch': 0.29}
05/18/2024 18:50:00 - INFO - llamafactory.extras.callbacks - {'loss': 1.1480, 'learning_rate': 4.8857e-05, 'epoch': 0.29}
05/18/2024 18:50:57 - INFO - llamafactory.extras.callbacks - {'loss': 1.2096, 'learning_rate': 4.8854e-05, 'epoch': 0.29}
05/18/2024 18:51:53 - INFO - llamafactory.extras.callbacks - {'loss': 1.1847, 'learning_rate': 4.8850e-05, 'epoch': 0.29}
05/18/2024 18:52:54 - INFO - llamafactory.extras.callbacks - {'loss': 1.1385, 'learning_rate': 4.8847e-05, 'epoch': 0.29}
05/18/2024 18:53:53 - INFO - llamafactory.extras.callbacks - {'loss': 1.1568, 'learning_rate': 4.8844e-05, 'epoch': 0.29}
05/18/2024 18:54:50 - INFO - llamafactory.extras.callbacks - {'loss': 1.2424, 'learning_rate': 4.8841e-05, 'epoch': 0.29}
05/18/2024 18:55:47 - INFO - llamafactory.extras.callbacks - {'loss': 1.2045, 'learning_rate': 4.8838e-05, 'epoch': 0.29}
05/18/2024 18:56:45 - INFO - llamafactory.extras.callbacks - {'loss': 1.1838, 'learning_rate': 4.8835e-05, 'epoch': 0.29}
05/18/2024 18:57:44 - INFO - llamafactory.extras.callbacks - {'loss': 1.1829, 'learning_rate': 4.8831e-05, 'epoch': 0.29}
05/18/2024 18:58:40 - INFO - llamafactory.extras.callbacks - {'loss': 1.1409, 'learning_rate': 4.8828e-05, 'epoch': 0.29}
05/18/2024 18:59:35 - INFO - llamafactory.extras.callbacks - {'loss': 1.1710, 'learning_rate': 4.8825e-05, 'epoch': 0.29}
05/18/2024 19:00:33 - INFO - llamafactory.extras.callbacks - {'loss': 1.2290, 'learning_rate': 4.8822e-05, 'epoch': 0.29}
05/18/2024 19:01:31 - INFO - llamafactory.extras.callbacks - {'loss': 1.2496, 'learning_rate': 4.8819e-05, 'epoch': 0.29}
05/18/2024 19:02:28 - INFO - llamafactory.extras.callbacks - {'loss': 1.2271, 'learning_rate': 4.8816e-05, 'epoch': 0.30}
05/18/2024 19:03:24 - INFO - llamafactory.extras.callbacks - {'loss': 1.1960, 'learning_rate': 4.8812e-05, 'epoch': 0.30}
05/18/2024 19:04:18 - INFO - llamafactory.extras.callbacks - {'loss': 1.2396, 'learning_rate': 4.8809e-05, 'epoch': 0.30}
05/18/2024 19:04:18 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-3700
05/18/2024 19:04:19 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 19:04:19 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-3700/tokenizer_config.json
05/18/2024 19:04:19 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-3700/special_tokens_map.json
05/18/2024 19:05:18 - INFO - llamafactory.extras.callbacks - {'loss': 1.2361, 'learning_rate': 4.8806e-05, 'epoch': 0.30}
05/18/2024 19:06:13 - INFO - llamafactory.extras.callbacks - {'loss': 1.1698, 'learning_rate': 4.8803e-05, 'epoch': 0.30}
05/18/2024 19:07:09 - INFO - llamafactory.extras.callbacks - {'loss': 1.2176, 'learning_rate': 4.8800e-05, 'epoch': 0.30}
05/18/2024 19:08:07 - INFO - llamafactory.extras.callbacks - {'loss': 1.1966, 'learning_rate': 4.8796e-05, 'epoch': 0.30}
05/18/2024 19:09:02 - INFO - llamafactory.extras.callbacks - {'loss': 1.2249, 'learning_rate': 4.8793e-05, 'epoch': 0.30}
05/18/2024 19:09:58 - INFO - llamafactory.extras.callbacks - {'loss': 1.1471, 'learning_rate': 4.8790e-05, 'epoch': 0.30}
05/18/2024 19:10:57 - INFO - llamafactory.extras.callbacks - {'loss': 1.1918, 'learning_rate': 4.8787e-05, 'epoch': 0.30}
05/18/2024 19:11:52 - INFO - llamafactory.extras.callbacks - {'loss': 1.2090, 'learning_rate': 4.8784e-05, 'epoch': 0.30}
05/18/2024 19:12:48 - INFO - llamafactory.extras.callbacks - {'loss': 1.2088, 'learning_rate': 4.8780e-05, 'epoch': 0.30}
05/18/2024 19:13:44 - INFO - llamafactory.extras.callbacks - {'loss': 1.1447, 'learning_rate': 4.8777e-05, 'epoch': 0.30}
05/18/2024 19:14:42 - INFO - llamafactory.extras.callbacks - {'loss': 1.2075, 'learning_rate': 4.8774e-05, 'epoch': 0.30}
05/18/2024 19:15:40 - INFO - llamafactory.extras.callbacks - {'loss': 1.1300, 'learning_rate': 4.8771e-05, 'epoch': 0.30}
05/18/2024 19:16:37 - INFO - llamafactory.extras.callbacks - {'loss': 1.2166, 'learning_rate': 4.8767e-05, 'epoch': 0.30}
05/18/2024 19:17:34 - INFO - llamafactory.extras.callbacks - {'loss': 1.1571, 'learning_rate': 4.8764e-05, 'epoch': 0.30}
05/18/2024 19:18:29 - INFO - llamafactory.extras.callbacks - {'loss': 1.1623, 'learning_rate': 4.8761e-05, 'epoch': 0.30}
05/18/2024 19:19:25 - INFO - llamafactory.extras.callbacks - {'loss': 1.2238, 'learning_rate': 4.8758e-05, 'epoch': 0.30}
05/18/2024 19:20:20 - INFO - llamafactory.extras.callbacks - {'loss': 1.1451, 'learning_rate': 4.8754e-05, 'epoch': 0.30}
05/18/2024 19:21:17 - INFO - llamafactory.extras.callbacks - {'loss': 1.1593, 'learning_rate': 4.8751e-05, 'epoch': 0.30}
05/18/2024 19:22:15 - INFO - llamafactory.extras.callbacks - {'loss': 1.1864, 'learning_rate': 4.8748e-05, 'epoch': 0.30}
05/18/2024 19:23:13 - INFO - llamafactory.extras.callbacks - {'loss': 1.1968, 'learning_rate': 4.8744e-05, 'epoch': 0.30}
05/18/2024 19:23:13 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-3800
05/18/2024 19:23:14 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 19:23:14 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-3800/tokenizer_config.json
05/18/2024 19:23:14 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-3800/special_tokens_map.json
05/18/2024 19:24:11 - INFO - llamafactory.extras.callbacks - {'loss': 1.2392, 'learning_rate': 4.8741e-05, 'epoch': 0.30}
05/18/2024 19:25:08 - INFO - llamafactory.extras.callbacks - {'loss': 1.2011, 'learning_rate': 4.8738e-05, 'epoch': 0.30}
05/18/2024 19:26:05 - INFO - llamafactory.extras.callbacks - {'loss': 1.2047, 'learning_rate': 4.8735e-05, 'epoch': 0.31}
05/18/2024 19:27:00 - INFO - llamafactory.extras.callbacks - {'loss': 1.1488, 'learning_rate': 4.8731e-05, 'epoch': 0.31}
05/18/2024 19:27:55 - INFO - llamafactory.extras.callbacks - {'loss': 1.1653, 'learning_rate': 4.8728e-05, 'epoch': 0.31}
05/18/2024 19:28:52 - INFO - llamafactory.extras.callbacks - {'loss': 1.1882, 'learning_rate': 4.8725e-05, 'epoch': 0.31}
05/18/2024 19:29:51 - INFO - llamafactory.extras.callbacks - {'loss': 1.1683, 'learning_rate': 4.8721e-05, 'epoch': 0.31}
05/18/2024 19:30:46 - INFO - llamafactory.extras.callbacks - {'loss': 1.1298, 'learning_rate': 4.8718e-05, 'epoch': 0.31}
05/18/2024 19:31:42 - INFO - llamafactory.extras.callbacks - {'loss': 1.2049, 'learning_rate': 4.8715e-05, 'epoch': 0.31}
05/18/2024 19:32:38 - INFO - llamafactory.extras.callbacks - {'loss': 1.1412, 'learning_rate': 4.8712e-05, 'epoch': 0.31}
05/18/2024 19:33:36 - INFO - llamafactory.extras.callbacks - {'loss': 1.1392, 'learning_rate': 4.8708e-05, 'epoch': 0.31}
05/18/2024 19:34:32 - INFO - llamafactory.extras.callbacks - {'loss': 1.2010, 'learning_rate': 4.8705e-05, 'epoch': 0.31}
05/18/2024 19:35:29 - INFO - llamafactory.extras.callbacks - {'loss': 1.1590, 'learning_rate': 4.8702e-05, 'epoch': 0.31}
05/18/2024 19:36:26 - INFO - llamafactory.extras.callbacks - {'loss': 1.1432, 'learning_rate': 4.8698e-05, 'epoch': 0.31}
05/18/2024 19:37:23 - INFO - llamafactory.extras.callbacks - {'loss': 1.2357, 'learning_rate': 4.8695e-05, 'epoch': 0.31}
05/18/2024 19:38:18 - INFO - llamafactory.extras.callbacks - {'loss': 1.1105, 'learning_rate': 4.8692e-05, 'epoch': 0.31}
05/18/2024 19:39:15 - INFO - llamafactory.extras.callbacks - {'loss': 1.1857, 'learning_rate': 4.8688e-05, 'epoch': 0.31}
05/18/2024 19:40:12 - INFO - llamafactory.extras.callbacks - {'loss': 1.1087, 'learning_rate': 4.8685e-05, 'epoch': 0.31}
05/18/2024 19:41:10 - INFO - llamafactory.extras.callbacks - {'loss': 1.2033, 'learning_rate': 4.8681e-05, 'epoch': 0.31}
05/18/2024 19:42:05 - INFO - llamafactory.extras.callbacks - {'loss': 1.1876, 'learning_rate': 4.8678e-05, 'epoch': 0.31}
05/18/2024 19:42:05 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-3900
05/18/2024 19:42:06 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 19:42:06 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-3900/tokenizer_config.json
05/18/2024 19:42:06 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-3900/special_tokens_map.json
05/18/2024 19:43:03 - INFO - llamafactory.extras.callbacks - {'loss': 1.2316, 'learning_rate': 4.8675e-05, 'epoch': 0.31}
05/18/2024 19:44:00 - INFO - llamafactory.extras.callbacks - {'loss': 1.2264, 'learning_rate': 4.8671e-05, 'epoch': 0.31}
05/18/2024 19:44:58 - INFO - llamafactory.extras.callbacks - {'loss': 1.1672, 'learning_rate': 4.8668e-05, 'epoch': 0.31}
05/18/2024 19:45:54 - INFO - llamafactory.extras.callbacks - {'loss': 1.2037, 'learning_rate': 4.8665e-05, 'epoch': 0.31}
05/18/2024 19:46:51 - INFO - llamafactory.extras.callbacks - {'loss': 1.1220, 'learning_rate': 4.8661e-05, 'epoch': 0.31}
05/18/2024 19:47:49 - INFO - llamafactory.extras.callbacks - {'loss': 1.1442, 'learning_rate': 4.8658e-05, 'epoch': 0.31}
05/18/2024 19:48:46 - INFO - llamafactory.extras.callbacks - {'loss': 1.1647, 'learning_rate': 4.8655e-05, 'epoch': 0.31}
05/18/2024 19:49:41 - INFO - llamafactory.extras.callbacks - {'loss': 1.1515, 'learning_rate': 4.8651e-05, 'epoch': 0.32}
05/18/2024 19:50:41 - INFO - llamafactory.extras.callbacks - {'loss': 1.2137, 'learning_rate': 4.8648e-05, 'epoch': 0.32}
05/18/2024 19:51:39 - INFO - llamafactory.extras.callbacks - {'loss': 1.1752, 'learning_rate': 4.8644e-05, 'epoch': 0.32}
05/18/2024 19:52:33 - INFO - llamafactory.extras.callbacks - {'loss': 1.1821, 'learning_rate': 4.8641e-05, 'epoch': 0.32}
05/18/2024 19:53:32 - INFO - llamafactory.extras.callbacks - {'loss': 1.2574, 'learning_rate': 4.8638e-05, 'epoch': 0.32}
05/18/2024 19:54:29 - INFO - llamafactory.extras.callbacks - {'loss': 1.1735, 'learning_rate': 4.8634e-05, 'epoch': 0.32}
05/18/2024 19:55:25 - INFO - llamafactory.extras.callbacks - {'loss': 1.1464, 'learning_rate': 4.8631e-05, 'epoch': 0.32}
05/18/2024 19:56:22 - INFO - llamafactory.extras.callbacks - {'loss': 1.2239, 'learning_rate': 4.8627e-05, 'epoch': 0.32}
05/18/2024 19:57:16 - INFO - llamafactory.extras.callbacks - {'loss': 1.2619, 'learning_rate': 4.8624e-05, 'epoch': 0.32}
05/18/2024 19:58:12 - INFO - llamafactory.extras.callbacks - {'loss': 1.1265, 'learning_rate': 4.8620e-05, 'epoch': 0.32}
05/18/2024 19:59:08 - INFO - llamafactory.extras.callbacks - {'loss': 1.1522, 'learning_rate': 4.8617e-05, 'epoch': 0.32}
05/18/2024 20:00:05 - INFO - llamafactory.extras.callbacks - {'loss': 1.1543, 'learning_rate': 4.8614e-05, 'epoch': 0.32}
05/18/2024 20:01:00 - INFO - llamafactory.extras.callbacks - {'loss': 1.1363, 'learning_rate': 4.8610e-05, 'epoch': 0.32}
05/18/2024 20:01:00 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-4000
05/18/2024 20:01:01 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 20:01:01 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-4000/tokenizer_config.json
05/18/2024 20:01:01 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-4000/special_tokens_map.json
05/18/2024 20:01:58 - INFO - llamafactory.extras.callbacks - {'loss': 1.2319, 'learning_rate': 4.8607e-05, 'epoch': 0.32}
05/18/2024 20:02:55 - INFO - llamafactory.extras.callbacks - {'loss': 1.2544, 'learning_rate': 4.8603e-05, 'epoch': 0.32}
05/18/2024 20:03:53 - INFO - llamafactory.extras.callbacks - {'loss': 1.1738, 'learning_rate': 4.8600e-05, 'epoch': 0.32}
05/18/2024 20:04:51 - INFO - llamafactory.extras.callbacks - {'loss': 1.1540, 'learning_rate': 4.8596e-05, 'epoch': 0.32}
05/18/2024 20:05:48 - INFO - llamafactory.extras.callbacks - {'loss': 1.1987, 'learning_rate': 4.8593e-05, 'epoch': 0.32}
05/18/2024 20:06:45 - INFO - llamafactory.extras.callbacks - {'loss': 1.1879, 'learning_rate': 4.8589e-05, 'epoch': 0.32}
05/18/2024 20:07:41 - INFO - llamafactory.extras.callbacks - {'loss': 1.1740, 'learning_rate': 4.8586e-05, 'epoch': 0.32}
05/18/2024 20:08:37 - INFO - llamafactory.extras.callbacks - {'loss': 1.2052, 'learning_rate': 4.8582e-05, 'epoch': 0.32}
05/18/2024 20:09:34 - INFO - llamafactory.extras.callbacks - {'loss': 1.1631, 'learning_rate': 4.8579e-05, 'epoch': 0.32}
05/18/2024 20:10:29 - INFO - llamafactory.extras.callbacks - {'loss': 1.1798, 'learning_rate': 4.8575e-05, 'epoch': 0.32}
05/18/2024 20:11:25 - INFO - llamafactory.extras.callbacks - {'loss': 1.2269, 'learning_rate': 4.8572e-05, 'epoch': 0.32}
05/18/2024 20:12:21 - INFO - llamafactory.extras.callbacks - {'loss': 1.1528, 'learning_rate': 4.8568e-05, 'epoch': 0.32}
05/18/2024 20:13:17 - INFO - llamafactory.extras.callbacks - {'loss': 1.1459, 'learning_rate': 4.8565e-05, 'epoch': 0.33}
05/18/2024 20:14:13 - INFO - llamafactory.extras.callbacks - {'loss': 1.2153, 'learning_rate': 4.8561e-05, 'epoch': 0.33}
05/18/2024 20:15:10 - INFO - llamafactory.extras.callbacks - {'loss': 1.1898, 'learning_rate': 4.8558e-05, 'epoch': 0.33}
05/18/2024 20:16:06 - INFO - llamafactory.extras.callbacks - {'loss': 1.1493, 'learning_rate': 4.8554e-05, 'epoch': 0.33}
05/18/2024 20:17:01 - INFO - llamafactory.extras.callbacks - {'loss': 1.1801, 'learning_rate': 4.8551e-05, 'epoch': 0.33}
05/18/2024 20:17:56 - INFO - llamafactory.extras.callbacks - {'loss': 1.1678, 'learning_rate': 4.8547e-05, 'epoch': 0.33}
05/18/2024 20:18:53 - INFO - llamafactory.extras.callbacks - {'loss': 1.1672, 'learning_rate': 4.8544e-05, 'epoch': 0.33}
05/18/2024 20:19:51 - INFO - llamafactory.extras.callbacks - {'loss': 1.1990, 'learning_rate': 4.8540e-05, 'epoch': 0.33}
05/18/2024 20:19:51 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-4100
05/18/2024 20:19:52 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 20:19:52 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-4100/tokenizer_config.json
05/18/2024 20:19:52 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-4100/special_tokens_map.json
05/18/2024 20:20:47 - INFO - llamafactory.extras.callbacks - {'loss': 1.1480, 'learning_rate': 4.8537e-05, 'epoch': 0.33}
05/18/2024 20:21:44 - INFO - llamafactory.extras.callbacks - {'loss': 1.1311, 'learning_rate': 4.8533e-05, 'epoch': 0.33}
05/18/2024 20:22:40 - INFO - llamafactory.extras.callbacks - {'loss': 1.1435, 'learning_rate': 4.8530e-05, 'epoch': 0.33}
05/18/2024 20:23:37 - INFO - llamafactory.extras.callbacks - {'loss': 1.2346, 'learning_rate': 4.8526e-05, 'epoch': 0.33}
05/18/2024 20:24:34 - INFO - llamafactory.extras.callbacks - {'loss': 1.1450, 'learning_rate': 4.8523e-05, 'epoch': 0.33}
05/18/2024 20:25:30 - INFO - llamafactory.extras.callbacks - {'loss': 1.1523, 'learning_rate': 4.8519e-05, 'epoch': 0.33}
05/18/2024 20:26:27 - INFO - llamafactory.extras.callbacks - {'loss': 1.1713, 'learning_rate': 4.8516e-05, 'epoch': 0.33}
05/18/2024 20:27:24 - INFO - llamafactory.extras.callbacks - {'loss': 1.2128, 'learning_rate': 4.8512e-05, 'epoch': 0.33}
05/18/2024 20:28:21 - INFO - llamafactory.extras.callbacks - {'loss': 1.1961, 'learning_rate': 4.8509e-05, 'epoch': 0.33}
05/18/2024 20:29:18 - INFO - llamafactory.extras.callbacks - {'loss': 1.2366, 'learning_rate': 4.8505e-05, 'epoch': 0.33}
05/18/2024 20:30:15 - INFO - llamafactory.extras.callbacks - {'loss': 1.1614, 'learning_rate': 4.8501e-05, 'epoch': 0.33}
05/18/2024 20:31:10 - INFO - llamafactory.extras.callbacks - {'loss': 1.2314, 'learning_rate': 4.8498e-05, 'epoch': 0.33}
05/18/2024 20:32:05 - INFO - llamafactory.extras.callbacks - {'loss': 1.1667, 'learning_rate': 4.8494e-05, 'epoch': 0.33}
05/18/2024 20:33:02 - INFO - llamafactory.extras.callbacks - {'loss': 1.1580, 'learning_rate': 4.8491e-05, 'epoch': 0.33}
05/18/2024 20:33:58 - INFO - llamafactory.extras.callbacks - {'loss': 1.2126, 'learning_rate': 4.8487e-05, 'epoch': 0.33}
05/18/2024 20:34:56 - INFO - llamafactory.extras.callbacks - {'loss': 1.2560, 'learning_rate': 4.8483e-05, 'epoch': 0.33}
05/18/2024 20:35:53 - INFO - llamafactory.extras.callbacks - {'loss': 1.2197, 'learning_rate': 4.8480e-05, 'epoch': 0.33}
05/18/2024 20:36:48 - INFO - llamafactory.extras.callbacks - {'loss': 1.1577, 'learning_rate': 4.8476e-05, 'epoch': 0.34}
05/18/2024 20:37:45 - INFO - llamafactory.extras.callbacks - {'loss': 1.1881, 'learning_rate': 4.8473e-05, 'epoch': 0.34}
05/18/2024 20:38:41 - INFO - llamafactory.extras.callbacks - {'loss': 1.1362, 'learning_rate': 4.8469e-05, 'epoch': 0.34}
05/18/2024 20:38:41 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-4200
05/18/2024 20:38:42 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 20:38:42 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-4200/tokenizer_config.json
05/18/2024 20:38:42 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-4200/special_tokens_map.json
05/18/2024 20:39:38 - INFO - llamafactory.extras.callbacks - {'loss': 1.1839, 'learning_rate': 4.8465e-05, 'epoch': 0.34}
05/18/2024 20:40:34 - INFO - llamafactory.extras.callbacks - {'loss': 1.1522, 'learning_rate': 4.8462e-05, 'epoch': 0.34}
05/18/2024 20:41:30 - INFO - llamafactory.extras.callbacks - {'loss': 1.1357, 'learning_rate': 4.8458e-05, 'epoch': 0.34}
05/18/2024 20:42:25 - INFO - llamafactory.extras.callbacks - {'loss': 1.2319, 'learning_rate': 4.8455e-05, 'epoch': 0.34}
05/18/2024 20:43:21 - INFO - llamafactory.extras.callbacks - {'loss': 1.1670, 'learning_rate': 4.8451e-05, 'epoch': 0.34}
05/18/2024 20:44:16 - INFO - llamafactory.extras.callbacks - {'loss': 1.1926, 'learning_rate': 4.8447e-05, 'epoch': 0.34}
05/18/2024 20:45:11 - INFO - llamafactory.extras.callbacks - {'loss': 1.1144, 'learning_rate': 4.8444e-05, 'epoch': 0.34}
05/18/2024 20:46:07 - INFO - llamafactory.extras.callbacks - {'loss': 1.1776, 'learning_rate': 4.8440e-05, 'epoch': 0.34}
05/18/2024 20:47:03 - INFO - llamafactory.extras.callbacks - {'loss': 1.2146, 'learning_rate': 4.8436e-05, 'epoch': 0.34}
05/18/2024 20:48:01 - INFO - llamafactory.extras.callbacks - {'loss': 1.1603, 'learning_rate': 4.8433e-05, 'epoch': 0.34}
05/18/2024 20:48:55 - INFO - llamafactory.extras.callbacks - {'loss': 1.2679, 'learning_rate': 4.8429e-05, 'epoch': 0.34}
05/18/2024 20:49:51 - INFO - llamafactory.extras.callbacks - {'loss': 1.1717, 'learning_rate': 4.8425e-05, 'epoch': 0.34}
05/18/2024 20:50:48 - INFO - llamafactory.extras.callbacks - {'loss': 1.1264, 'learning_rate': 4.8422e-05, 'epoch': 0.34}
05/18/2024 20:51:43 - INFO - llamafactory.extras.callbacks - {'loss': 1.1899, 'learning_rate': 4.8418e-05, 'epoch': 0.34}
05/18/2024 20:52:39 - INFO - llamafactory.extras.callbacks - {'loss': 1.2029, 'learning_rate': 4.8414e-05, 'epoch': 0.34}
05/18/2024 20:53:37 - INFO - llamafactory.extras.callbacks - {'loss': 1.1751, 'learning_rate': 4.8411e-05, 'epoch': 0.34}
05/18/2024 20:54:33 - INFO - llamafactory.extras.callbacks - {'loss': 1.1801, 'learning_rate': 4.8407e-05, 'epoch': 0.34}
05/18/2024 20:55:29 - INFO - llamafactory.extras.callbacks - {'loss': 1.1458, 'learning_rate': 4.8403e-05, 'epoch': 0.34}
05/18/2024 20:56:23 - INFO - llamafactory.extras.callbacks - {'loss': 1.1537, 'learning_rate': 4.8400e-05, 'epoch': 0.34}
05/18/2024 20:57:19 - INFO - llamafactory.extras.callbacks - {'loss': 1.1974, 'learning_rate': 4.8396e-05, 'epoch': 0.34}
05/18/2024 20:57:19 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-4300
05/18/2024 20:57:20 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 20:57:20 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-4300/tokenizer_config.json
05/18/2024 20:57:20 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-4300/special_tokens_map.json
05/18/2024 20:58:18 - INFO - llamafactory.extras.callbacks - {'loss': 1.2229, 'learning_rate': 4.8392e-05, 'epoch': 0.34}
05/18/2024 20:59:14 - INFO - llamafactory.extras.callbacks - {'loss': 1.1466, 'learning_rate': 4.8389e-05, 'epoch': 0.34}
05/18/2024 21:00:09 - INFO - llamafactory.extras.callbacks - {'loss': 1.1838, 'learning_rate': 4.8385e-05, 'epoch': 0.35}
05/18/2024 21:01:05 - INFO - llamafactory.extras.callbacks - {'loss': 1.1846, 'learning_rate': 4.8381e-05, 'epoch': 0.35}
05/18/2024 21:02:03 - INFO - llamafactory.extras.callbacks - {'loss': 1.1510, 'learning_rate': 4.8378e-05, 'epoch': 0.35}
05/18/2024 21:02:57 - INFO - llamafactory.extras.callbacks - {'loss': 1.3165, 'learning_rate': 4.8374e-05, 'epoch': 0.35}
05/18/2024 21:03:55 - INFO - llamafactory.extras.callbacks - {'loss': 1.2241, 'learning_rate': 4.8370e-05, 'epoch': 0.35}
05/18/2024 21:04:51 - INFO - llamafactory.extras.callbacks - {'loss': 1.1173, 'learning_rate': 4.8366e-05, 'epoch': 0.35}
05/18/2024 21:05:47 - INFO - llamafactory.extras.callbacks - {'loss': 1.1775, 'learning_rate': 4.8363e-05, 'epoch': 0.35}
05/18/2024 21:06:44 - INFO - llamafactory.extras.callbacks - {'loss': 1.1773, 'learning_rate': 4.8359e-05, 'epoch': 0.35}
05/18/2024 21:07:41 - INFO - llamafactory.extras.callbacks - {'loss': 1.2331, 'learning_rate': 4.8355e-05, 'epoch': 0.35}
05/18/2024 21:08:40 - INFO - llamafactory.extras.callbacks - {'loss': 1.1959, 'learning_rate': 4.8351e-05, 'epoch': 0.35}
05/18/2024 21:09:37 - INFO - llamafactory.extras.callbacks - {'loss': 1.2451, 'learning_rate': 4.8348e-05, 'epoch': 0.35}
05/18/2024 21:10:33 - INFO - llamafactory.extras.callbacks - {'loss': 1.1905, 'learning_rate': 4.8344e-05, 'epoch': 0.35}
05/18/2024 21:11:31 - INFO - llamafactory.extras.callbacks - {'loss': 1.2092, 'learning_rate': 4.8340e-05, 'epoch': 0.35}
05/18/2024 21:12:28 - INFO - llamafactory.extras.callbacks - {'loss': 1.1645, 'learning_rate': 4.8337e-05, 'epoch': 0.35}
05/18/2024 21:13:22 - INFO - llamafactory.extras.callbacks - {'loss': 1.2233, 'learning_rate': 4.8333e-05, 'epoch': 0.35}
05/18/2024 21:14:20 - INFO - llamafactory.extras.callbacks - {'loss': 1.1685, 'learning_rate': 4.8329e-05, 'epoch': 0.35}
05/18/2024 21:15:17 - INFO - llamafactory.extras.callbacks - {'loss': 1.2059, 'learning_rate': 4.8325e-05, 'epoch': 0.35}
05/18/2024 21:16:16 - INFO - llamafactory.extras.callbacks - {'loss': 1.1512, 'learning_rate': 4.8321e-05, 'epoch': 0.35}
05/18/2024 21:16:16 - INFO - transformers.trainer - Saving model checkpoint to saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-4400
05/18/2024 21:16:17 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /home/featurize/.cache/huggingface/hub/models--mistralai--Mistral-7B-Instruct-v0.2/snapshots/41b61a33a2483885c981aa79e0df6b32407ed873/config.json
05/18/2024 21:16:17 - INFO - transformers.configuration_utils - Model config MistralConfig {
"architectures": [
"MistralForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 32768,
"model_type": "mistral",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"rms_norm_eps": 1e-05,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.40.2",
"use_cache": true,
"vocab_size": 32000
}
05/18/2024 21:16:17 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-4400/tokenizer_config.json
05/18/2024 21:16:17 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Mistral-7B-v0.2-Chat/lora/train_2024-05-18-07-31-02/checkpoint-4400/special_tokens_map.json
05/18/2024 21:17:16 - INFO - llamafactory.extras.callbacks - {'loss': 1.1605, 'learning_rate': 4.8318e-05, 'epoch': 0.35}
05/18/2024 21:18:13 - INFO - llamafactory.extras.callbacks - {'loss': 1.2200, 'learning_rate': 4.8314e-05, 'epoch': 0.35}
05/18/2024 21:19:10 - INFO - llamafactory.extras.callbacks - {'loss': 1.1954, 'learning_rate': 4.8310e-05, 'epoch': 0.35}
05/18/2024 21:20:07 - INFO - llamafactory.extras.callbacks - {'loss': 1.1297, 'learning_rate': 4.8306e-05, 'epoch': 0.35}
05/18/2024 21:21:02 - INFO - llamafactory.extras.callbacks - {'loss': 1.1316, 'learning_rate': 4.8303e-05, 'epoch': 0.35}
05/18/2024 21:21:59 - INFO - llamafactory.extras.callbacks - {'loss': 1.1734, 'learning_rate': 4.8299e-05, 'epoch': 0.35}
05/18/2024 21:22:57 - INFO - llamafactory.extras.callbacks - {'loss': 1.1248, 'learning_rate': 4.8295e-05, 'epoch': 0.35}
05/18/2024 21:23:56 - INFO - llamafactory.extras.callbacks - {'loss': 1.2097, 'learning_rate': 4.8291e-05, 'epoch': 0.36}
05/18/2024 21:24:52 - INFO - llamafactory.extras.callbacks - {'loss': 1.1575, 'learning_rate': 4.8287e-05, 'epoch': 0.36}
05/18/2024 21:25:48 - INFO - llamafactory.extras.callbacks - {'loss': 1.1645, 'learning_rate': 4.8284e-05, 'epoch': 0.36}
05/18/2024 21:26:45 - INFO - llamafactory.extras.callbacks - {'loss': 1.1963, 'learning_rate': 4.8280e-05, 'epoch': 0.36}
05/18/2024 21:27:39 - INFO - llamafactory.extras.callbacks - {'loss': 1.1711, 'learning_rate': 4.8276e-05, 'epoch': 0.36}
05/18/2024 21:28:38 - INFO - llamafactory.extras.callbacks - {'loss': 1.1556, 'learning_rate': 4.8272e-05, 'epoch': 0.36}
05/18/2024 21:29:33 - INFO - llamafactory.extras.callbacks - {'loss': 1.1733, 'learning_rate': 4.8268e-05, 'epoch': 0.36}
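The {'loss': ..., 'learning_rate': ..., 'epoch': ...} records emitted by llamafactory.extras.callbacks above can be parsed into a training curve. A minimal sketch, assuming this log was saved to a file named running_log.txt (the filename is an assumption, not part of the log):

import ast
import re

# Match the dict literal that follows the callbacks logger name.
PATTERN = re.compile(r"llamafactory\.extras\.callbacks - (\{.*\})")

points = []
with open("running_log.txt", encoding="utf-8") as fh:
    for line in fh:
        match = PATTERN.search(line)
        if match:
            # e.g. {'loss': 1.1605, 'learning_rate': 4.8318e-05, 'epoch': 0.35}
            record = ast.literal_eval(match.group(1))
            points.append((record["epoch"], record["loss"]))

# Print the first few (epoch, loss) pairs; feed `points` to any plotting library for a loss curve.
for epoch, loss in points[:5]:
    print(f"epoch={epoch:.2f} loss={loss:.4f}")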