---
library_name: transformers
license: llama3.1
base_model: meta-llama/Llama-3.1-8B
language: en
datasets:
- Word2Li/MiddOptimized
tags:
- llama-factory
- full
pipeline_tag: text-generation
model-index:
- name: Llama3.1-8B-Middo-Wizard
  results:
    - task:
        type: text-generation
      dataset:
        name: MMLU
        type: MMLU
      metrics:
        - name: weighted accuracy
          type: weighted accuracy
          value: 48.39
          verified: true
    - task:
        type: text-generation
      dataset:
        name: IFEval
        type: IFEval
      metrics:
        - name: overall accuracy
          type: overall accuracy
          value: 50.11
          verified: true
    - task:
        type: text-generation
      dataset:
        name: GSM8K
        type: GSM8K
      metrics:
        - name: accuracy
          type: accuracy
          value: 54.44
          verified: true
    - task:
        type: text-generation
      dataset:
        name: MATH
        type: MATH
      metrics:
        - name: accuracy
          type: accuracy
          value: 13.80
          verified: true
    - task:
        type: text-generation
      dataset:
        name: HumanEval
        type: HumanEval
      metrics:
        - name: humaneval_pass@1
          type: humaneval_pass@1
          value: 46.95
          verified: true
    - task:
        type: text-generation
      dataset:
        name: MBPP
        type: MBPP
      metrics:
        - name: score
          type: score
          value: 45.00
          verified: true
    - task:
        type: text-generation
      dataset:
        name: Hellaswag
        type: Hellaswag
      metrics:
        - name: accuracy
          type: accuracy
          value: 63.54
          verified: true
    - task:
        type: text-generation
      dataset:
        name: GPQA
        type: GPQA
      metrics:
        - name: accuracy
          type: accuracy
          value: 20.20
          verified: true
metrics:
- accuracy
---

# Llama3.1-8B-Middo-Wizard

Paper: [Middo: Model-Informed Dynamic Data Optimization for Enhanced LLM Fine-Tuning via Closed-Loop Learning](https://arxiv.org/abs/2508.21589)

Code: https://github.com/Word2VecT/Middo

## Model description

This model is a fine-tuned version of [meta-llama/Llama-3.1-8B](https://huggingface.co/meta-llama/Llama-3.1-8B) on the [MiddOptimized/llama_wizard](https://huggingface.co/datasets/Word2Li/MiddOptimized/viewer/default/llama_alpaca) dataset.
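A minimal sketch of loading the model for text generation with `transformers` is shown here. The repository id `Word2Li/Llama3.1-8B-Middo-Wizard` is an assumption inferred from the dataset namespace and the model name; this card does not state it. The card also does not specify a prompt template, so the sketch passes the prompt as plain text.

```python
# Assumption: the repository id below is inferred from the dataset namespace
# (Word2Li) and the model name on this card; adjust it if the model lives elsewhere.
MODEL_ID = "Word2Li/Llama3.1-8B-Middo-Wizard"


def generate(prompt, model_id=MODEL_ID, max_new_tokens=128):
    """Load the model and return a continuation of `prompt`."""
    # Imported lazily so the module can be imported without heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device)

    # Plain-text prompt: the card does not document a chat or instruction template.
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens and decode only the newly generated ones.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate("Explain gradient accumulation in one paragraph.")` downloads the weights on first use. An 8B model in bfloat16 needs roughly 16 GB of memory for the weights alone.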

## Training and evaluation data

### Training data

The training data is a Middo-optimized version of [WizardLM_evol_instruct_70k](https://huggingface.co/datasets/WizardLMTeam/WizardLM_evol_instruct_70k), with the optimization driven by [meta-llama/Llama-3.1-8B](https://huggingface.co/meta-llama/Llama-3.1-8B).

### Evaluation data

- General
  - MMLU
  - IFEval
- Math
  - GSM8K
  - MATH
- Code
  - HumanEval
  - MBPP
- Reasoning
  - HellaSwag
  - GPQA

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 2e-05
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- gradient_accumulation_steps: 8
- total_train_batch_size: 256
- total_eval_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 1.0
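
The relationship between the batch-size values above, and the shape of a cosine schedule with a 3% warmup, can be sketched as follows. This is an illustrative reimplementation, not the training code itself. The function names are hypothetical, and the total step count is a parameter because the card does not list it.

```python
import math

# Values copied from the hyperparameter list above.
PER_DEVICE_TRAIN_BATCH_SIZE = 4
NUM_DEVICES = 8
GRADIENT_ACCUMULATION_STEPS = 8
PEAK_LEARNING_RATE = 2e-5
WARMUP_RATIO = 0.03


def effective_batch_size(per_device, num_devices, grad_accum_steps):
    """Samples contributing to one optimizer step."""
    return per_device * num_devices * grad_accum_steps


def warmup_steps(total_steps, warmup_ratio=WARMUP_RATIO):
    """Number of linear warmup steps for a given schedule length."""
    return math.ceil(total_steps * warmup_ratio)


def cosine_lr_with_warmup(step, total_steps,
                          peak_lr=PEAK_LEARNING_RATE,
                          warmup_ratio=WARMUP_RATIO):
    """Linear warmup to peak_lr, then cosine decay to zero."""
    n_warmup = warmup_steps(total_steps, warmup_ratio)
    if step < n_warmup:
        return peak_lr * step / max(1, n_warmup)
    progress = (step - n_warmup) / max(1, total_steps - n_warmup)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# 4 per device x 8 devices x 8 accumulation steps matches total_train_batch_size.
print(effective_batch_size(PER_DEVICE_TRAIN_BATCH_SIZE,
                           NUM_DEVICES,
                           GRADIENT_ACCUMULATION_STEPS))
```

The product of the three batch-related settings reproduces the listed `total_train_batch_size` of 256. The learning rate rises linearly from zero during warmup, peaks at 2e-05, and decays to zero by the final step.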

### Training results

- epoch: 0.9973935708079931
- total_flos: 2.698045158024282e+18
- train_loss: 0.5919382667707649
- train_runtime: 4471.5794
- train_samples_per_second: 16.469
- train_steps_per_second: 0.064
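
As a rough consistency check, the throughput figures above can be combined to estimate how many samples and optimizer steps the run covered. The reported rates are rounded, so the results are approximations only; the variable names are illustrative.

```python
# Values copied from the training results and hyperparameters on this card.
TRAIN_RUNTIME_S = 4471.5794
SAMPLES_PER_SECOND = 16.469
STEPS_PER_SECOND = 0.064
TOTAL_TRAIN_BATCH_SIZE = 256

# Approximate totals over the whole run (rates are rounded in the report).
approx_samples = TRAIN_RUNTIME_S * SAMPLES_PER_SECOND
approx_steps = TRAIN_RUNTIME_S * STEPS_PER_SECOND

# Samples per optimizer step implied by the two rates; it should land close
# to the configured total_train_batch_size.
samples_per_step = SAMPLES_PER_SECOND / STEPS_PER_SECOND

print(f"approx. samples processed: {approx_samples:,.0f}")
print(f"approx. optimizer steps:   {approx_steps:.0f}")
print(f"implied samples per step:  {samples_per_step:.1f} "
      f"(configured: {TOTAL_TRAIN_BATCH_SIZE})")
```

The implied samples per step comes out slightly above 256 because `train_steps_per_second` is reported to only three decimal places.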

### Framework versions

- Transformers 4.45.2
- PyTorch 2.5.1+cu121
- Datasets 2.21.0
- Tokenizers 0.20.1