# TinyLLaMA 1.1B Fine-Tuned
This model is a fine-tuned version of TinyLLaMA-1.1B, trained so that its generated outputs are semantically close, in embedding space, to target paragraphs derived from Pinecone-enriched content.
## Use Case
Given a context paragraph (retrieved from nearest neighbors in Pinecone), the model generates a response intended to be semantically similar to a specific target paragraph. During training, the reward was computed as the cosine similarity between Sentence-BERT embeddings of the generated output and the target paragraph.
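As a rough sketch of the reward described above, assuming the `sentence-transformers` library (the actual training script is not part of this card; function and variable names are illustrative):

```python
# Sketch of the reward signal: cosine similarity between Sentence-BERT
# embeddings of the generated text and the target paragraph.
from sentence_transformers import SentenceTransformer, util

# all-MiniLM-L6-v2 is the reward model listed in the training setup below.
embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

def compute_reward(generated_text: str, target_paragraph: str) -> float:
    """Cosine similarity in [-1, 1] between the two sentence embeddings."""
    emb = embedder.encode([generated_text, target_paragraph], convert_to_tensor=True)
    return util.cos_sim(emb[0], emb[1]).item()

reward = compute_reward("a generated response", "the target paragraph")
print(f"reward = {reward:.3f}")
```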
## Training Setup
- Base model: TinyLLaMA-1.1B
- Fine-tuning method: SFT
- Reward model: all-MiniLM-L6-v2
- Prompt: a single context paragraph taken from `neighbor_contents[0]` (see the sketch after this list)
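A minimal inference sketch under the setup above, assuming a `transformers`-compatible checkpoint; the repository id and the `neighbor_contents` list are placeholders, not part of this card:

```python
# Illustrative inference sketch. The model id and neighbor_contents
# are placeholders; substitute your own checkpoint and retrieved contexts.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "path/or/repo-of-this-model"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# As in training, the prompt is the single context paragraph from the
# first Pinecone nearest neighbor.
neighbor_contents = ["<context paragraph retrieved from Pinecone>"]
prompt = neighbor_contents[0]

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.9)

# Decode only the newly generated tokens, skipping the prompt.
completion = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(completion)
```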
## Limitations
This model is optimized for short completions and may not generalize well beyond the Pinecone-enriched context structure used during training.