---
license: mit
tags:
- language-model
- instruction-tuning
- lora
- adalora
- qlora
- tinyllama
- text-generation
---

# 🦙 TinyLlama Instruction-Tuned Models: LoRA, AdaLoRA, QLoRA

This repo hosts a set of TinyLlama 1.1B models fine-tuned using various parameter-efficient methods:

- ✅ **LoRA** (Low-Rank Adaptation)
- ✅ **AdaLoRA** (Adaptive Low-Rank Adaptation with rank scheduling)
- ✅ **QLoRA** (Quantized LoRA for low-memory environments)

These models are fine-tuned on a custom instruction-response dataset for general-purpose instruction-following.

---

## 📦 Model Variants

| Name    | Folder Name               | Method  | Notes                     |
|---------|---------------------------|---------|---------------------------|
| LoRA    | `lora-tinyllama-final`    | LoRA    | Standard fine-tuned model |
| AdaLoRA | `adalora-tinyllama-final` | AdaLoRA | Rank-adaptive LoRA        |
| QLoRA   | `qlora-tinyllama-final`   | QLoRA   | Quantized LoRA (int4)     |

---

## 🧠 Base Model

- **Base**: [TinyLlama/TinyLlama-1.1B-Chat-v1.0](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0)
- **Tokenizer**: SentencePiece with `eos_token` padding

---

## 🚀 Inference Example

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
lora_dir = "lora-tinyllama-final"  # or use "adalora-tinyllama-final", "qlora-tinyllama-final"

# Tokenizer
tokenizer = AutoTokenizer.from_pretrained(lora_dir)
tokenizer.pad_token = tokenizer.eos_token

# Load model + adapter
base = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.float16, device_map="auto")
model = PeftModel.from_pretrained(base, lora_dir)
model = model.merge_and_unload()
model.eval()

def ask(prompt):
    prompt = f"### Instruction:\n{prompt}\n\n### Response:\n"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output = model.generate(**inputs, max_new_tokens=150, temperature=0.7, top_p=0.9, do_sample=True)
    return tokenizer.decode(output[0], skip_special_tokens=True).split("### Response:")[-1].strip()

print(ask("What is your name?"))
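
# Illustrative sketch, not part of the original example.
# ask() above wraps the instruction in an "### Instruction / ### Response"
# template and keeps only the text after the last "### Response:" marker.
# The two helpers below isolate that string handling so it can be checked
# without loading a model. The names RESPONSE_MARKER, build_prompt and
# extract_response are hypothetical; they are not defined elsewhere in this repo.
RESPONSE_MARKER = "### Response:"

def build_prompt(instruction: str) -> str:
    """Format an instruction with the same template that ask() uses."""
    return f"### Instruction:\n{instruction}\n\n{RESPONSE_MARKER}\n"

def extract_response(decoded: str) -> str:
    """Return the text after the last response marker, with whitespace stripped."""
    return decoded.split(RESPONSE_MARKER)[-1].strip()

# A made-up completion string stands in for real model output here.
decoded = build_prompt("What is your name?") + "I am an assistant.\n"
print(extract_response(decoded))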