---
base_model: meta-llama/Llama-3.2-1B-Instruct
library_name: peft
tags:
- base_model:adapter:meta-llama/Llama-3.2-1B-Instruct
- lora
- transformers
license: apache-2.0
datasets:
- nyu-mll/glue
language:
- en
metrics:
- accuracy
- f1
- matthews_correlation
pipeline_tag: text-classification
---

# MNLI - LLaMA 3.2 1B - QLoRA (10k subset, 4-bit)

## Model Summary

This is a **QLoRA fine-tuned** version of [`meta-llama/Llama-3.2-1B-Instruct`](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct) on the **MNLI** (Multi-Genre Natural Language Inference) dataset from [GLUE](https://huggingface.co/datasets/glue/viewer/mnli).

- **Base model:** LLaMA 3.2 1B Instruct
- **Fine-tuning method:** [QLoRA](https://arxiv.org/abs/2305.14314) with 4-bit quantization
- **Train subset:** 10k examples drawn from the MNLI train split (8k train / 1k validation / 1k test)
- **Evaluation:** official GLUE dev sets (matched / mismatched) plus the held-out 1k test split
- **Trainable parameters:** 5.64M (0.45% of the base model)
- **Hardware:** NVIDIA T4 (fp16)

⚠️ **Note:** This repo contains only the **LoRA adapter weights**. You need access to the base model from Meta to use it.

---

## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification, BitsAndBytesConfig
from peft import PeftModel
import torch

BASE = "meta-llama/Llama-3.2-1B-Instruct"
ADAPTER = "streetelite/mnli-llama3.2-1b-qlora-10k"

# Same 4-bit NF4 configuration used during training.
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

tokenizer = AutoTokenizer.from_pretrained(ADAPTER)

# Load the quantized base model with a 3-way classification head.
model = AutoModelForSequenceClassification.from_pretrained(
    BASE,
    num_labels=3,
    quantization_config=bnb,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Attach the LoRA adapter; this also restores the trained classifier head.
model = PeftModel.from_pretrained(model, ADAPTER).eval()

inputs = tokenizer(
    "A man is playing guitar.",    # premise
    "A person is making music.",   # hypothesis
    return_tensors="pt",
    truncation=True,
)

with torch.inference_mode():
    logits = model(**{k: v.to(model.device) for k, v in inputs.items()}).logits

# Labels follow the GLUE MNLI convention: 0 = entailment, 1 = neutral, 2 = contradiction.
probs = logits.softmax(-1)
print(probs)
```

## Results

### GLUE dev (official)

| Set        | Accuracy | F1 (macro) | F1 (weighted) | MCC    | Kappa  | MAE    |
|------------|----------|------------|---------------|--------|--------|--------|
| Matched    | 82.37%   | 0.8210     | 0.8224        | 0.7358 | 0.7349 | 0.2068 |
| Mismatched | 83.71%   | 0.8348     | 0.8360        | 0.7558 | 0.7550 | 0.1894 |

---

### Held-out test split (1k from train)

| Accuracy | F1 (macro) | F1 (weighted) | MCC    | Kappa  | MAE    |
|----------|------------|---------------|--------|--------|--------|
| 83.10%   | 0.8280     | 0.8288        | 0.7496 | 0.7461 | 0.2010 |

A sketch of how these metrics can be computed is given in the appendix at the end of this card.

---

## Training Details

- **Framework:** Hugging Face Transformers + PEFT + bitsandbytes
- **Quantization:** 4-bit NF4 with double quantization
- **LoRA config:** r=8, alpha=16, dropout=0.1; target modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`
- **Optimizer:** paged_adamw_8bit, lr=2e-4
- **Batch size:** 4, with gradient accumulation 4 (effective batch size 16)
- **Epochs:** 2
- **Seed:** 42
- **Padding:** dynamic

A reconstruction of this setup as a runnable script is sketched in the appendix below.

---

## Intended Uses

- **Primary:** natural language inference on English text pairs (entailment, neutral, contradiction).
- **Languages:** English.
- **Not intended for:** non-English inputs, factual question answering, or safety-critical applications without human review.

---

## License

- **Base model:** LLaMA 3.2 1B Instruct — [Meta license](https://ai.meta.com/resources/models-and-libraries/llama-downloads/)
- **Adapter:** Apache License 2.0
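
---

## Appendix: Training Sketch

The exact training script is not published with this card, so the code below is a minimal sketch reconstructed from the hyperparameters listed under Training Details. The quantization, LoRA, and optimizer settings match the card; the dataset plumbing (how the 10k subset is drawn and split, the `tok` helper, and `output_dir`) is assumed for illustration.

```python
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    BitsAndBytesConfig,
    DataCollatorWithPadding,
    Trainer,
    TrainingArguments,
)

BASE = "meta-llama/Llama-3.2-1B-Instruct"

# 4-bit NF4 quantization with double quantization, as stated in the card.
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

tokenizer = AutoTokenizer.from_pretrained(BASE)
tokenizer.pad_token = tokenizer.eos_token  # LLaMA ships without a pad token

model = AutoModelForSequenceClassification.from_pretrained(
    BASE, num_labels=3, quantization_config=bnb, torch_dtype=torch.float16
)
model.config.pad_token_id = tokenizer.pad_token_id
model = prepare_model_for_kbit_training(model)

# LoRA settings from the card; SEQ_CLS also keeps the classifier head trainable.
lora = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    bias="none",
    task_type="SEQ_CLS",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
model = get_peft_model(model, lora)

# Assumed subset construction: 10k shuffled examples, 8k of which are trained on,
# with 2k held out for the 1k validation / 1k test splits.
ds = load_dataset("nyu-mll/glue", "mnli")
subset = ds["train"].shuffle(seed=42).select(range(10_000))
splits = subset.train_test_split(test_size=2_000, seed=42)

def tok(batch):
    return tokenizer(batch["premise"], batch["hypothesis"], truncation=True)

train_ds = splits["train"].map(
    tok, batched=True, remove_columns=["premise", "hypothesis", "idx"]
)

args = TrainingArguments(
    output_dir="mnli-qlora",            # placeholder
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,      # effective batch size 16
    num_train_epochs=2,
    learning_rate=2e-4,
    optim="paged_adamw_8bit",
    fp16=True,
    seed=42,
)

Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    data_collator=DataCollatorWithPadding(tokenizer),  # dynamic padding
).train()
```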
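
---

## Appendix: Metric Computation

The result tables report six metrics per evaluation set. Below is a minimal scikit-learn sketch of how they can be computed from prediction arrays; `y_true` and `y_pred` are placeholders, and the MAE figure assumes the integer class indices are read as ordinal values.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    cohen_kappa_score,
    f1_score,
    matthews_corrcoef,
    mean_absolute_error,
)

# Placeholder labels in the GLUE MNLI convention (0/1/2); replace with real
# gold labels and model predictions.
y_true = np.array([0, 1, 2, 1, 0])
y_pred = np.array([0, 1, 1, 1, 0])

print("accuracy     :", accuracy_score(y_true, y_pred))
print("f1 (macro)   :", f1_score(y_true, y_pred, average="macro"))
print("f1 (weighted):", f1_score(y_true, y_pred, average="weighted"))
print("mcc          :", matthews_corrcoef(y_true, y_pred))
print("kappa        :", cohen_kappa_score(y_true, y_pred))
print("mae          :", mean_absolute_error(y_true, y_pred))
```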