	---
	library_name: transformers
	license: llama3.1
	base_model: meta-llama/Llama-3.1-8B
	language: en
	datasets:
	- Word2Li/MiddOptimized
	tags:
	- llama-factory
	- full
	pipeline_tag: text-generation
	model-index:
	- name: Llama3.1-8B-Middo-Wizard
	  results:
	  - task:
	      type: text-generation
	    dataset:
	      name: MMLU
	      type: MMLU
	    metrics:
	    - name: weighted accuracy
	      type: weighted accuracy
	      value: 48.39
	      verified: true
	  - task:
	      type: text-generation
	    dataset:
	      name: IFEval
	      type: IFEval
	    metrics:
	    - name: overall accuracy
	      type: overall accuracy
	      value: 50.11
	      verified: true
	  - task:
	      type: text-generation
	    dataset:
	      name: GSM8K
	      type: GSM8K
	    metrics:
	    - name: accuracy
	      type: accuracy
	      value: 54.44
	      verified: true
	  - task:
	      type: text-generation
	    dataset:
	      name: MATH
	      type: MATH
	    metrics:
	    - name: accuracy
	      type: accuracy
	      value: 13.80
	      verified: true
	  - task:
	      type: text-generation
	    dataset:
	      name: HumanEval
	      type: HumanEval
	    metrics:
	    - name: humaneval_pass@1
	      type: humaneval_pass@1
	      value: 46.95
	      verified: true
	  - task:
	      type: text-generation
	    dataset:
	      name: MBPP
	      type: MBPP
	    metrics:
	    - name: score
	      type: score
	      value: 45.00
	      verified: true
	  - task:
	      type: text-generation
	    dataset:
	      name: Hellaswag
	      type: Hellaswag
	    metrics:
	    - name: accuracy
	      type: accuracy
	      value: 63.54
	      verified: true
	  - task:
	      type: text-generation
	    dataset:
	      name: GPQA
	      type: GPQA
	    metrics:
	    - name: accuracy
	      type: accuracy
	      value: 20.20
	      verified: true
	metrics:
	- accuracy
	---

	# Llama3.1-8B-Middo-Wizard

	Paper: [Middo: Model-Informed Dynamic Data Optimization for Enhanced LLM Fine-Tuning via Closed-Loop Learning](https://arxiv.org/abs/2508.21589)

	Code: https://github.com/Word2VecT/Middo

	## Model description

	This model is a fine-tuned version of [meta-llama/Llama-3.1-8B](https://huggingface.co/meta-llama/Llama-3.1-8B) on the [MiddOptimized/llama_wizard](https://huggingface.co/datasets/Word2Li/MiddOptimized/viewer/default/llama_alpaca) dataset.

	## Training and evaluation data

	### Training data

	The training set is [WizardLM_evol_instruct_70k](https://huggingface.co/datasets/WizardLMTeam/WizardLM_evol_instruct_70k) after Middo optimization, carried out on [meta-llama/Llama-3.1-8B](https://huggingface.co/meta-llama/Llama-3.1-8B).

	### Evaluation data

	- General
	  - MMLU
	  - IFEval
	- Math
	  - GSM8K
	  - MATH
	- Code
	  - HumanEval
	  - MBPP
	- Reasoning
	  - Hellaswag
	  - GPQA
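
For a quick overview, the scores in this card's metadata can be grouped by the four categories listed above. The sketch below is only a convenience summary: the unweighted averages it computes are not reported metrics, and the benchmarks use different scoring schemes (weighted accuracy, pass@1, and so on), so the means are indicative at best.

```python
from statistics import mean

# Scores copied from the model-index metadata of this card.
scores = {
    "General": {"MMLU": 48.39, "IFEval": 50.11},
    "Math": {"GSM8K": 54.44, "MATH": 13.80},
    "Code": {"HumanEval": 46.95, "MBPP": 45.00},
    "Reasoning": {"Hellaswag": 63.54, "GPQA": 20.20},
}

# Unweighted mean per category (a convenience summary, not a reported metric).
category_means = {name: mean(vals.values()) for name, vals in scores.items()}

# Unweighted mean over all eight benchmarks.
overall_mean = mean(v for vals in scores.values() for v in vals.values())

for name, value in category_means.items():
    print(f"{name}: {value:.2f}")
print(f"Overall: {overall_mean:.2f}")
```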

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:

	- learning_rate: 2e-05
	- train_batch_size: 4
	- eval_batch_size: 8
	- seed: 42
	- distributed_type: multi-GPU
	- num_devices: 8
	- gradient_accumulation_steps: 8
	- total_train_batch_size: 256
	- total_eval_batch_size: 64
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: cosine
	- lr_scheduler_warmup_ratio: 0.03
	- num_epochs: 1.0
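
The batch-size figures above are related by simple arithmetic: the total train batch size is the per-device batch size times the number of devices times the gradient accumulation steps. The sketch below checks that relation and illustrates the shape of a cosine schedule with warmup. The function `lr_at` and the `total_steps` argument are hypothetical names for illustration, and the linear-warmup, decay-to-zero shape is an assumption about the scheduler rather than something this card states.

```python
import math

# Values copied from the hyperparameter list above.
per_device_batch = 4
num_devices = 8
grad_accum_steps = 8
peak_lr = 2e-5
warmup_ratio = 0.03

# Effective (global) batch size per optimizer step.
total_train_batch = per_device_batch * num_devices * grad_accum_steps


def lr_at(step: int, total_steps: int) -> float:
    """Linear warmup, then cosine decay to zero (assumed schedule shape)."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp linearly from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


print("total_train_batch_size:", total_train_batch)
# total_steps=100 is a made-up value used only to show the curve.
for step in (0, 10, 50, 100):
    print(step, lr_at(step, total_steps=100))
```

With a warmup ratio of 0.03, only the first few percent of steps are spent ramping up; the rest of the run follows the cosine decay.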

	### Training results

	- epoch: 0.9973935708079931
	- total_flos: 2.698045158024282e+18
	- train_loss: 0.5919382667707649
	- train_runtime: 4471.5794
	- train_samples_per_second: 16.469
	- train_steps_per_second: 0.064
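
The reported throughput numbers can be cross-checked against each other and against the total train batch size of 256. The sketch below assumes `train_runtime` is in seconds; the derived quantities are rough estimates, since the logged rates are rounded.

```python
# Values copied from the training results above.
train_runtime_s = 4471.5794  # assumed to be in seconds
samples_per_second = 16.469
steps_per_second = 0.064
total_train_batch_size = 256  # from the hyperparameter list

# Approximate totals implied by the logged rates.
approx_samples = train_runtime_s * samples_per_second
approx_steps = train_runtime_s * steps_per_second

# Samples per optimizer step; should land near the total train batch size.
samples_per_step = samples_per_second / steps_per_second

print(f"runtime: {train_runtime_s / 60:.1f} minutes")
print(f"approx. samples seen: {approx_samples:.0f}")
print(f"approx. optimizer steps: {approx_steps:.0f}")
print(f"samples per step: {samples_per_step:.1f} (batch size {total_train_batch_size})")
```

The samples-per-step estimate comes out slightly above 256 because `train_steps_per_second` is rounded to three decimal places.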

	### Framework versions

	- Transformers 4.45.2
	- Pytorch 2.5.1+cu121
	- Datasets 2.21.0
	- Tokenizers 0.20.1