---
base_model: meta-llama/Llama-3.2-1B-Instruct
library_name: peft
tags:
- base_model:adapter:meta-llama/Llama-3.2-1B-Instruct
- lora
- transformers
license: apache-2.0
datasets:
- nyu-mll/glue
language:
- en
metrics:
- accuracy
- f1
- matthews_correlation
pipeline_tag: text-classification
---
# MNLI - LLaMA 3.2 1B - QLoRA (10k subset, 4-bit)
## Model Summary
This is a **QLoRA fine-tuned** version of [`meta-llama/Llama-3.2-1B-Instruct`](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct) on the **MNLI** (Multi-Genre Natural Language Inference) dataset from [GLUE](https://huggingface.co/datasets/glue/viewer/mnli).
- **Base model:** LLaMA 3.2 1B Instruct
- **Fine-tuning method:** [QLoRA](https://arxiv.org/abs/2305.14314) with 4-bit quantization
- **Train subset:** 10k examples sampled from the MNLI train split (8k train / 1k validation / 1k held-out test)
- **Evaluation:** Official GLUE dev sets (matched / mismatched) + held-out 1k test split
- **Trainable parameters:** 5.64M (0.45% of base model)
- **Hardware:** NVIDIA T4 (fp16)
⚠️ **Note:** This repo contains only the **LoRA adapter weights**. You need access to the base model from Meta to use it.
---
## Usage
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification, BitsAndBytesConfig
from peft import PeftModel
import torch

BASE = "meta-llama/Llama-3.2-1B-Instruct"
ADAPTER = "streetelite/mnli-llama3.2-1b-qlora-10k"

# 4-bit NF4 quantization with double quantization, matching the training setup
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

# Load the tokenizer from the adapter repo so padding/special-token setup matches training
tokenizer = AutoTokenizer.from_pretrained(ADAPTER)

# Load the quantized base model with a 3-way classification head, then attach the LoRA adapter
model = AutoModelForSequenceClassification.from_pretrained(
    BASE,
    num_labels=3,
    quantization_config=bnb,
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, ADAPTER).eval()

# Encode a (premise, hypothesis) pair
inputs = tokenizer(
    "A man is playing guitar.",
    "A person is making music.",
    return_tensors="pt",
    truncation=True,
)

with torch.inference_mode():
    logits = model(**{k: v.to(model.device) for k, v in inputs.items()}).logits
probs = logits.softmax(-1)
print(probs)
```
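The head outputs three logits. Assuming the standard GLUE/MNLI label order (0 = entailment, 1 = neutral, 2 = contradiction; verify against the adapter's `model.config.id2label` if in doubt), the prediction can be decoded like this:

```python
# Assumed GLUE/MNLI label order; check model.config.id2label before relying on it
labels = ["entailment", "neutral", "contradiction"]
print(labels[probs.argmax(-1).item()])
```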
## Results
### GLUE dev (official)
| Set | Accuracy | F1 (macro) | F1 (weighted) | MCC | Kappa | MAE |
|------------|----------|------------|---------------|--------|--------|--------|
| Matched | 82.37% | 0.8210 | 0.8224 | 0.7358 | 0.7349 | 0.2068 |
| Mismatched | 83.71% | 0.8348 | 0.8360 | 0.7558 | 0.7550 | 0.1894 |
---
### Held-out test split (1k from train)
| Accuracy | F1 (macro) | F1 (weighted) | MCC | Kappa | MAE |
|----------|------------|---------------|--------|--------|--------|
| 83.10% | 0.8280 | 0.8288 | 0.7496 | 0.7461 | 0.2010 |
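The evaluation script is not included in this repo; a minimal sketch of how the reported metrics can be computed with standard `sklearn.metrics` functions (`y_true` and `y_pred` are hypothetical integer label arrays):

```python
from sklearn.metrics import (
    accuracy_score, f1_score, matthews_corrcoef,
    cohen_kappa_score, mean_absolute_error,
)

def report(y_true, y_pred):
    # y_true / y_pred: integer class labels in {0, 1, 2}
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "f1_macro": f1_score(y_true, y_pred, average="macro"),
        "f1_weighted": f1_score(y_true, y_pred, average="weighted"),
        "mcc": matthews_corrcoef(y_true, y_pred),
        "kappa": cohen_kappa_score(y_true, y_pred),
        "mae": mean_absolute_error(y_true, y_pred),
    }
```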
---
## Training Details
- **Framework:** Hugging Face Transformers + PEFT + bitsandbytes
- **Quantization:** 4-bit NF4 w/ double quantization
- **LoRA config:** r=8, alpha=16, dropout=0.1, target modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj` (see the configuration sketch after this list)
- **Optimizer:** paged_adamw_8bit, lr=2e-4
- **Batch size:** 4 (gradient accumulation = 4 → effective 16)
- **Epochs:** 2
- **Seed:** 42
- **Padding:** dynamic
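
The training script is not part of this repo; the hyperparameters above translate roughly to the following PEFT/Transformers configuration. This is a sketch, not the exact script: dataset loading, tokenization, and the `Trainer` call are omitted, and `model` is assumed to be the 4-bit base model loaded as in the Usage section.

```python
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import TrainingArguments

lora = LoraConfig(
    task_type="SEQ_CLS",  # sequence classification head
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Prepare the 4-bit model for training, then wrap it with the LoRA adapter
model = prepare_model_for_kbit_training(model)
model = get_peft_model(model, lora)

args = TrainingArguments(
    output_dir="out",
    optim="paged_adamw_8bit",
    learning_rate=2e-4,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,  # effective batch size 16
    num_train_epochs=2,
    fp16=True,
    seed=42,
)
```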
---
## Intended Uses
- **Primary:** Natural language inference on text pairs (entailment, neutral, contradiction).
- **Languages:** English.
- **Not intended for:** non-English inputs, factual question answering, safety-critical applications without human review.
---
## License
- **Base model:** LLaMA 3.2 1B Instruct — [Meta license](https://ai.meta.com/resources/models-and-libraries/llama-downloads/)
- **Adapter:** Apache License 2.0