TinyLlama-1.1B Alpaca Fine-tuned

This is a fine-tuned version of TinyLlama/TinyLlama-1.1B-Chat-v1.0 trained on the Alpaca dataset for improved instruction-following capabilities.

Model Description

  • Developed by: Navisha Shetty
  • Model type: Causal Language Model (Decoder-only Transformer)
  • Language: English
  • License: Apache 2.0
  • Finetuned from: TinyLlama/TinyLlama-1.1B-Chat-v1.0
  • Training method: QLoRA (Quantized Low-Rank Adaptation)
  • Dataset: Stanford Alpaca (52,002 instruction-following examples)

Model Architecture

  • Base Model: TinyLlama-1.1B (1.1 billion parameters)
  • Fine-tuning Method: QLoRA with LoRA adapters
  • Trainable Parameters: 4.5M (0.4% of total)
  • LoRA Configuration:
    • Rank (r): 16
    • Alpha: 32
    • Target modules: q_proj, k_proj, v_proj, o_proj
    • Dropout: 0.05
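
A minimal PEFT configuration matching the settings above might look like the following (a sketch; variable names are illustrative and not taken from the original training script):

from peft import LoraConfig, TaskType

# LoRA settings corresponding to the configuration listed above
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
)

Applied to the four attention projections across all of TinyLlama's layers, this yields the roughly 4.5M trainable parameters reported above.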

Intended Use

This model is designed for instruction-following tasks and can:

  • Answer questions
  • Generate creative content (stories, poems, etc.)
  • Provide explanations and summaries
  • Help with brainstorming and ideation
  • Assist with text formatting and rewriting
  • Follow multi-step instructions

Direct Use

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    device_map="auto"
)

# Load fine-tuned adapter
model = PeftModel.from_pretrained(
    base_model,
    "shettynavisha25/tinyllama-alpaca-finetuned"
)

tokenizer = AutoTokenizer.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")

# Format your prompt
prompt = """### Instruction:
Write a haiku about artificial intelligence

### Response:
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=150, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
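
For slightly faster inference, the adapter can optionally be merged into the base model, continuing the example above (a sketch; merging assumes the base model was loaded in full or half precision, not quantized):

# Merge the LoRA weights into the base model and drop the adapter wrappers
merged_model = model.merge_and_unload()

outputs = merged_model.generate(**inputs, max_new_tokens=150, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))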

Example Prompts

### Instruction:
Explain quantum computing in simple terms

### Response:

### Instruction:
Write a Python function to calculate fibonacci numbers

### Response:
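
A small helper for building prompts in this template (illustrative only; the function name is not part of any library):

def build_prompt(instruction: str) -> str:
    # Alpaca-style template used in the examples above
    return f"### Instruction:\n{instruction}\n\n### Response:\n"

prompt = build_prompt("Explain quantum computing in simple terms")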

Training Details

Training Data

The model was fine-tuned on the Stanford Alpaca dataset, which contains 52,002 instruction-response pairs generated using OpenAI's text-davinci-003 model. The dataset covers diverse tasks including:

  • Open-ended generation
  • Question answering
  • Brainstorming
  • Chat
  • Rewriting
  • Summarization
  • Classification
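
For reference, the dataset can be loaded from the Hugging Face Hub, e.g. from the tatsu-lab/alpaca mirror (a sketch; the exact source used for this training run is an assumption):

from datasets import load_dataset

# Alpaca instruction-following data (52,002 examples)
dataset = load_dataset("tatsu-lab/alpaca", split="train")

example = dataset[0]
print(example["instruction"], example["input"], example["output"], sep="\n---\n")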

Training Hyperparameters

  • Learning rate: 2e-4
  • Batch size: 4
  • Gradient accumulation steps: 4
  • Effective batch size: 16
  • Number of epochs: 3
  • Max sequence length: 512
  • Optimizer: paged_adamw_8bit
  • Learning rate schedule: Linear (with warmup)
  • Warmup steps: 100
  • Weight decay: 0
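
These settings map onto Hugging Face TrainingArguments roughly as follows (a sketch, not the exact training script; output_dir is a placeholder, and the FP16, gradient-checkpointing, and checkpoint-saving values come from the Training Procedure section below):

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="tinyllama-alpaca-finetuned",  # placeholder
    learning_rate=2e-4,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,            # effective batch size 16
    num_train_epochs=3,
    optim="paged_adamw_8bit",
    lr_scheduler_type="linear",
    warmup_steps=100,
    weight_decay=0.0,
    fp16=True,                                # mixed precision training
    gradient_checkpointing=True,
    save_steps=500,
    save_total_limit=3,
)

# The max sequence length (512) is applied at tokenization time rather than here.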

Training Procedure

  • Quantization: 4-bit quantization using bitsandbytes
  • Precision: FP16 mixed precision training
  • Gradient Checkpointing: Enabled to reduce memory usage
  • Training Steps: 9,753 total steps
  • Checkpointing: Every 500 steps (last 3 checkpoints retained)
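
The 4-bit loading and adapter attachment described above can be sketched as follows (illustrative; specific quantization options such as NF4 and double quantization are assumptions, since they are not stated in this card):

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import get_peft_model, prepare_model_for_kbit_training

# 4-bit quantization via bitsandbytes (nf4 / double quantization assumed)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    quantization_config=bnb_config,
    device_map="auto",
)

# Make the quantized model trainable and attach the LoRA adapters
base_model = prepare_model_for_kbit_training(base_model)
model = get_peft_model(base_model, lora_config)  # lora_config from the Model Architecture section
model.print_trainable_parameters()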

Compute Infrastructure

  • Hardware: NVIDIA Tesla T4 GPU (16GB VRAM)
  • Cloud Provider: AWS (g4dn.2xlarge instance)
  • Orchestration: Kubernetes
  • Training Time: ~13 hours
  • Framework: PyTorch 2.1.0 with CUDA 12.1

Performance

Training Loss

The model achieved a final training loss of 1.14 after 3 epochs, showing consistent improvement throughout training:

  • Epoch 1: Loss decreased from 1.85 → 1.35
  • Epoch 2: Loss decreased from 1.35 → 1.20
  • Epoch 3: Loss decreased from 1.20 → 1.14

Qualitative Improvements

Compared to the base TinyLlama model, this fine-tuned version demonstrates:

  • Better instruction-following behavior
  • More structured and coherent responses
  • Improved task completion for creative and analytical tasks
  • Reduced hallucination on instruction-based queries

Limitations and Biases

  • Model Size: With only 1.1B parameters, this model has limited world knowledge compared to larger models
  • Dataset Biases: Inherits biases present in the Alpaca dataset and the underlying base model
  • English-only: Primarily trained on English text
  • Factual Accuracy: May generate plausible-sounding but incorrect information
  • Context Length: Limited to 512 tokens during fine-tuning
  • Not for Production: This is a research/educational model and should be thoroughly tested before production use

Ethical Considerations

This model should not be used for:

  • Generating harmful, toxic, or biased content
  • Impersonating individuals
  • Providing medical, legal, or financial advice
  • Making critical decisions without human oversight
  • Spreading misinformation

Citation

If you use this model, please cite:

@misc{tinyllama-alpaca-finetuned,
  author = {Navisha Shetty},
  title = {TinyLlama-1.1B Alpaca Fine-tuned},
  year = {2025},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/shettynavisha25/tinyllama-alpaca-finetuned}}
}

Base Model Citation

@article{zhang2024tinyllama,
  title={TinyLlama: An Open-Source Small Language Model},
  author={Zhang, Peiyuan and Zeng, Guangtao and Wang, Tianduo and Lu, Wei},
  journal={arXiv preprint arXiv:2401.02385},
  year={2024}
}

Alpaca Dataset Citation

@misc{alpaca,
  author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto},
  title = {Stanford Alpaca: An Instruction-following LLaMA model},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/tatsu-lab/stanford_alpaca}},
}

Acknowledgments

  • Base Model: TinyLlama team for the excellent base model
  • Dataset: Stanford Alpaca team for the instruction-following dataset
  • Training Framework: Hugging Face Transformers and PEFT libraries
  • Infrastructure: AWS for GPU compute resources

Framework Versions

  • PyTorch: 2.1.0
  • Transformers: 4.35.0+
  • PEFT: 0.7.0+
  • Accelerate: 0.24.0+
  • Bitsandbytes: 0.41.0+
  • CUDA: 12.1
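
To confirm the installed environment matches these versions, a minimal sketch:

import torch, transformers, peft, accelerate, bitsandbytes

# Print the installed versions of the core training/inference dependencies
for lib in (torch, transformers, peft, accelerate, bitsandbytes):
    print(lib.__name__, lib.__version__)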

Contact

For questions or issues, please open an issue on the model repository or contact [email protected].


Note: This model is released for research and educational purposes. Please use responsibly and be aware of its limitations.
