TinyLlama-1.1B Alpaca Fine-tuned

This is a fine-tuned version of TinyLlama/TinyLlama-1.1B-Chat-v1.0 trained on the Alpaca dataset for improved instruction-following capabilities.

Model Description

  • Developed by: Navisha Shetty
  • Model type: Causal Language Model (Decoder-only Transformer)
  • Language: English
  • License: Apache 2.0
  • Finetuned from: TinyLlama/TinyLlama-1.1B-Chat-v1.0
  • Training method: QLoRA (Quantized Low-Rank Adaptation)
  • Dataset: Stanford Alpaca (52,002 instruction-following examples)

Model Architecture

  • Base Model: TinyLlama-1.1B (1.1 billion parameters)
  • Fine-tuning Method: QLoRA with LoRA adapters
  • Trainable Parameters: 4.5M (0.4% of total)
  • LoRA Configuration:
    • Rank (r): 16
    • Alpha: 32
    • Target modules: q_proj, k_proj, v_proj, o_proj
    • Dropout: 0.05
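
A minimal PEFT configuration matching the settings above might look like the following (a sketch; variable names are illustrative and not taken from the original training script):

from peft import LoraConfig, TaskType

# LoRA settings corresponding to the configuration listed above
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
)

Applied to the four attention projections across all of TinyLlama's layers, this yields the roughly 4.5M trainable parameters reported above.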

Intended Use

This model is designed for instruction-following tasks and can:

  • Answer questions
  • Generate creative content (stories, poems, etc.)
  • Provide explanations and summaries
  • Help with brainstorming and ideation
  • Assist with text formatting and rewriting
  • Follow multi-step instructions

Direct Use

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    device_map="auto"
)

# Load fine-tuned adapter
model = PeftModel.from_pretrained(
    base_model,
    "shettynavisha25/tinyllama-alpaca-finetuned"
)

tokenizer = AutoTokenizer.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")

# Format your prompt
prompt = """### Instruction:
Write a haiku about artificial intelligence

### Response:
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=150, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
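
For slightly faster inference, the adapter can optionally be merged into the base model, continuing the example above (a sketch; merging assumes the base model was loaded in full or half precision, not quantized):

# Merge the LoRA weights into the base model and drop the adapter wrappers
merged_model = model.merge_and_unload()

outputs = merged_model.generate(**inputs, max_new_tokens=150, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))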

Example Prompts

### Instruction:
Explain quantum computing in simple terms

### Response:

### Instruction:
Write a Python function to calculate fibonacci numbers

### Response:
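
A small helper for building prompts in this template (illustrative only; the function name is not part of any library):

def build_prompt(instruction: str) -> str:
    # Alpaca-style template used in the examples above
    return f"### Instruction:\n{instruction}\n\n### Response:\n"

prompt = build_prompt("Explain quantum computing in simple terms")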

Training Details

Training Data

The model was fine-tuned on the Stanford Alpaca dataset, which contains 52,002 instruction-response pairs generated using OpenAI's text-davinci-003 model. The dataset covers diverse tasks including:

  • Open-ended generation
  • Question answering
  • Brainstorming
  • Chat
  • Rewriting
  • Summarization
  • Classification
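
For reference, the dataset can be loaded from the Hugging Face Hub, e.g. from the tatsu-lab/alpaca mirror (a sketch; the exact source used for this training run is an assumption):

from datasets import load_dataset

# Alpaca instruction-following data (52,002 examples)
dataset = load_dataset("tatsu-lab/alpaca", split="train")

example = dataset[0]
print(example["instruction"], example["input"], example["output"], sep="\n---\n")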

Training Hyperparameters

  • Learning rate: 2e-4
  • Batch size: 4
  • Gradient accumulation steps: 4
  • Effective batch size: 16
  • Number of epochs: 3
  • Max sequence length: 512
  • Optimizer: paged_adamw_8bit
  • Learning rate schedule: Linear (with warmup)
  • Warmup steps: 100
  • Weight decay: 0
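
These settings map onto Hugging Face TrainingArguments roughly as follows (a sketch, not the exact training script; output_dir is a placeholder, and the FP16, gradient-checkpointing, and checkpoint-saving values come from the Training Procedure section below):

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="tinyllama-alpaca-finetuned",  # placeholder
    learning_rate=2e-4,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,            # effective batch size 16
    num_train_epochs=3,
    optim="paged_adamw_8bit",
    lr_scheduler_type="linear",
    warmup_steps=100,
    weight_decay=0.0,
    fp16=True,                                # mixed precision training
    gradient_checkpointing=True,
    save_steps=500,
    save_total_limit=3,
)

# The max sequence length (512) is applied at tokenization time rather than here.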

Training Procedure

  • Quantization: 4-bit quantization using bitsandbytes
  • Precision: FP16 mixed precision training
  • Gradient Checkpointing: Enabled to reduce memory usage
  • Training Steps: 9,753 total steps
  • Checkpointing: Every 500 steps (last 3 checkpoints retained)
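
The 4-bit loading and adapter attachment described above can be sketched as follows (illustrative; specific quantization options such as NF4 and double quantization are assumptions, since they are not stated in this card):

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import get_peft_model, prepare_model_for_kbit_training

# 4-bit quantization via bitsandbytes (nf4 / double quantization assumed)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    quantization_config=bnb_config,
    device_map="auto",
)

# Make the quantized model trainable and attach the LoRA adapters
base_model = prepare_model_for_kbit_training(base_model)
model = get_peft_model(base_model, lora_config)  # lora_config from the Model Architecture section
model.print_trainable_parameters()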

Compute Infrastructure

  • Hardware: NVIDIA Tesla T4 GPU (16GB VRAM)
  • Cloud Provider: AWS (g4dn.2xlarge instance)
  • Orchestration: Kubernetes
  • Training Time: ~13 hours
  • Framework: PyTorch 2.1.0 with CUDA 12.1

Performance

Training Loss

The model achieved a final training loss of 1.14 after 3 epochs, showing consistent improvement throughout training:

  • Epoch 1: Loss decreased from 1.85 → 1.35
  • Epoch 2: Loss decreased from 1.35 → 1.20
  • Epoch 3: Loss decreased from 1.20 → 1.14

Qualitative Improvements

Compared to the base TinyLlama model, this fine-tuned version demonstrates:

  • Better instruction-following behavior
  • More structured and coherent responses
  • Improved task completion for creative and analytical tasks
  • Reduced hallucination on instruction-based queries

Limitations and Biases

  • Model Size: With only 1.1B parameters, this model has limited world knowledge compared to larger models
  • Dataset Biases: Inherits biases present in the Alpaca dataset and the underlying base model
  • English-only: Primarily trained on English text
  • Factual Accuracy: May generate plausible-sounding but incorrect information
  • Context Length: Limited to 512 tokens during fine-tuning
  • Not for Production: This is a research/educational model and should be thoroughly tested before production use

Ethical Considerations

This model should not be used for:

  • Generating harmful, toxic, or biased content
  • Impersonating individuals
  • Providing medical, legal, or financial advice
  • Making critical decisions without human oversight
  • Spreading misinformation

Citation

If you use this model, please cite:

@misc{tinyllama-alpaca-finetuned,
  author = {Navisha Shetty},
  title = {TinyLlama-1.1B Alpaca Fine-tuned},
  year = {2025},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/shettynavisha25/tinyllama-alpaca-finetuned}}
}

Base Model Citation

@article{zhang2024tinyllama,
  title={TinyLlama: An Open-Source Small Language Model},
  author={Zhang, Peiyuan and Zeng, Guangtao and Wang, Tianduo and Lu, Wei},
  journal={arXiv preprint arXiv:2401.02385},
  year={2024}
}

Alpaca Dataset Citation

@misc{alpaca,
  author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto},
  title = {Stanford Alpaca: An Instruction-following LLaMA model},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/tatsu-lab/stanford_alpaca}},
}

Acknowledgments

  • Base Model: TinyLlama team for the excellent base model
  • Dataset: Stanford Alpaca team for the instruction-following dataset
  • Training Framework: Hugging Face Transformers and PEFT libraries
  • Infrastructure: AWS for GPU compute resources

Framework Versions

  • PyTorch: 2.1.0
  • Transformers: 4.35.0+
  • PEFT: 0.7.0+
  • Accelerate: 0.24.0+
  • Bitsandbytes: 0.41.0+
  • CUDA: 12.1
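
To confirm the installed environment matches these versions, a minimal sketch:

import torch, transformers, peft, accelerate, bitsandbytes

# Print the installed versions of the core training/inference dependencies
for lib in (torch, transformers, peft, accelerate, bitsandbytes):
    print(lib.__name__, lib.__version__)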

Contact

For questions or issues, please open an issue on the model repository or contact [email protected].


Note: This model is released for research and educational purposes. Please use responsibly and be aware of its limitations.
