metadata

base_model: meta-llama/Llama-3.2-1B-Instruct
library_name: peft
tags:
  - base_model:adapter:meta-llama/Llama-3.2-1B-Instruct
  - lora
  - transformers
license: apache-2.0
datasets:
  - nyu-mll/glue
language:
  - en
metrics:
  - accuracy
  - f1
  - matthews_correlation
pipeline_tag: text-classification

MNLI - LLaMA 3.2 1B - QLoRA (10k subset, 4-bit)

Model Summary

This is a QLoRA fine-tuned version of meta-llama/Llama-3.2-1B-Instruct on the MNLI (Multi-Genre Natural Language Inference) dataset from GLUE.

Base model: LLaMA 3.2 1B Instruct
Fine-tuning method: QLoRA with 4-bit quantization
Training data: 10k-sample subset of the MNLI train split, divided into 8k train / 1k validation / 1k test
Evaluation: Official GLUE dev sets (matched / mismatched) + held-out 1k test split
Trainable parameters: 5.64M (0.45% of base model)
Hardware: NVIDIA T4 (fp16)
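The trainable-parameter figure can be cross-checked with a little arithmetic. The sketch below assumes the published Llama 3.2 1B dimensions (hidden size 2048, MLP size 8192, 16 layers, 8 key/value heads of dimension 64), which are not stated in this card, and the LoRA settings listed under Training Details (r=8 on all seven projection layers). Each adapted linear layer adds r × (in_features + out_features) weights.

```python
# Dimensions assumed from the base model's published config, not from this card
HIDDEN = 2048
INTERMEDIATE = 8192
NUM_LAYERS = 16
KV_DIM = 8 * 64  # 8 key/value heads x head_dim 64 (grouped-query attention)
RANK = 8         # LoRA r from Training Details

# (in_features, out_features) of each targeted linear layer
TARGETS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features, out_features, r):
    """LoRA adds A (r x in) and B (out x r) for each adapted layer."""
    return r * (in_features + out_features)


per_layer = sum(lora_params(i, o, RANK) for i, o in TARGETS.values())
total = per_layer * NUM_LAYERS
print(f"{total:,} LoRA parameters ({total / 1e6:.2f}M)")
```

Under these assumptions the count comes to 5,636,096, which rounds to the 5.64M reported above. The 3-way classification head (3 × 2048 weights) is not included in this count.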

⚠️ Note: This repo contains only the LoRA adapter weights. You need access to the base model from Meta to use it.

Usage

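from transformers import AutoTokenizer, AutoModelForSequenceClassification, BitsAndBytesConfig
from peft import PeftModel
import torch

BASE = "meta-llama/Llama-3.2-1B-Instruct"
ADAPTER = "streetelite/mnli-llama3.2-1b-qlora-10k"

# 4-bit NF4 quantization with double quantization, matching the training setup
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

# The tokenizer is loaded from the adapter repo. Llama tokenizers may not define
# a pad token, so fall back to EOS if none is set (needed for batched inputs).
tokenizer = AutoTokenizer.from_pretrained(ADAPTER)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Load the quantized base model with a 3-way classification head
model = AutoModelForSequenceClassification.from_pretrained(
    BASE,
    num_labels=3,
    quantization_config=bnb,
    torch_dtype=torch.float16,
    device_map="auto",
)
model.config.pad_token_id = tokenizer.pad_token_id

# Attach the LoRA adapter weights and switch to eval mode
model = PeftModel.from_pretrained(model, ADAPTER).eval()

# Encode the premise / hypothesis pair
inputs = tokenizer(
    "A man is playing guitar.",
    "A person is making music.",
    return_tensors="pt",
    truncation=True,
)
with torch.inference_mode():
    logits = model(**{k: v.to(model.device) for k, v in inputs.items()}).logits
    probs = logits.softmax(-1)
print(probs)  # shape (1, 3): one probability per NLI class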
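The usage code prints raw class probabilities without naming the classes. A minimal sketch of turning logits into a label follows, assuming the adapter was trained with the GLUE MNLI label ids (0 = entailment, 1 = neutral, 2 = contradiction). This card does not state the mapping, so `ID2LABEL` and the `decode` helper are illustrative only. It uses plain Python so it runs without a GPU.

```python
import math

# Assumed label order: GLUE MNLI label ids (not confirmed by this card)
ID2LABEL = {0: "entailment", 1: "neutral", 2: "contradiction"}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]


def decode(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Made-up logits for illustration; real ones come from the model
label, confidence = decode([2.0, 0.5, -1.0])
print(label, round(confidence, 3))
```

With the usage code, the equivalent call would be `decode(logits[0].tolist())`.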

Results

GLUE dev (official)

Set	Accuracy	F1 (macro)	F1 (weighted)	MCC	Kappa	MAE
Matched	82.37%	0.8210	0.8224	0.7358	0.7349	0.2068
Mismatched	83.71%	0.8348	0.8360	0.7558	0.7550	0.1894

Held-out test split (1k from train)

Accuracy	F1 (macro)	F1 (weighted)	MCC	Kappa	MAE
83.10%	0.8280	0.8288	0.7496	0.7461	0.2010
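For readers reproducing the tables, here is a minimal pure-Python sketch of three of the reported metrics. It assumes MAE means the mean absolute difference between integer label ids, which this card does not define. The toy labels are made up and are not model outputs.

```python
def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label matches the gold label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, num_classes=3):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for c in range(num_classes):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / num_classes


def mean_abs_error(y_true, y_pred):
    """Assumed definition: mean |gold id - predicted id| over label ids."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy example with made-up labels (0 / 1 / 2)
y_true = [0, 1, 2, 0, 1, 2]
y_pred = [0, 1, 2, 0, 2, 1]
print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred), mean_abs_error(y_true, y_pred))
```

MCC and Cohen's kappa are omitted here. In scikit-learn they are available as `matthews_corrcoef` and `cohen_kappa_score`.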

Training Details

Framework: Hugging Face Transformers + PEFT + bitsandbytes
Quantization: 4-bit NF4 with double quantization
LoRA config: r=8, alpha=16, dropout=0.1, target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
Optimizer: paged_adamw_8bit, lr=2e-4
Batch size: 4 (gradient accumulation = 4 → effective 16)
Epochs: 2
Seed: 42
Padding: dynamic
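The settings above can be collected as keyword arguments, as in the sketch below. The dictionaries mirror parameter names of `peft.LoraConfig` and `transformers.TrainingArguments`. Two values are assumptions not stated in this card: `task_type="SEQ_CLS"`, and `fp16=True`, which is inferred from the hardware line. The step counts assume a single GPU and the 8k-sample training portion.

```python
import math

# Keyword arguments one could pass as peft.LoraConfig(**lora_kwargs)
lora_kwargs = dict(
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="SEQ_CLS",  # assumption: sequence-classification task type
)

# Keyword arguments one could pass to transformers.TrainingArguments
# (together with an output_dir of your choice)
training_kwargs = dict(
    optim="paged_adamw_8bit",
    learning_rate=2e-4,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    num_train_epochs=2,
    seed=42,
    fp16=True,  # assumption, based on the T4 (fp16) hardware note
)

TRAIN_SAMPLES = 8_000  # training portion of the 10k subset

effective_batch = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
)
steps_per_epoch = math.ceil(TRAIN_SAMPLES / effective_batch)
total_steps = steps_per_epoch * training_kwargs["num_train_epochs"]
lora_scale = lora_kwargs["lora_alpha"] / lora_kwargs["r"]  # standard LoRA scaling alpha / r

print(effective_batch, steps_per_epoch, total_steps, lora_scale)
```

Under these assumptions, training runs for 500 optimizer steps per epoch (1,000 in total) and the LoRA update is scaled by 2.0.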

Intended Uses

Primary: Natural language inference on text pairs (entailment, neutral, contradiction).
Languages: English.
Not intended for: non-English inputs, factual question answering, safety-critical applications without human review.

License

Base model: LLaMA 3.2 1B Instruct, distributed by Meta under the Llama 3.2 Community License
Adapter: Apache License 2.0