|
|
---
license: mit
base_model: microsoft/Phi-3-mini-4k-instruct
tags:
- phi-3
- fine-tuned
- avro
- vllm
- generated_from_trainer
datasets:
- custom-avro-dataset
language:
- en
library_name: transformers
pipeline_tag: text-generation
---
|
|
|
|
|
# phi3mini4k-vllm-20250915 |
|
|
|
|
|
This model is a fine-tuned version of [microsoft/Phi-3-mini-4k-instruct](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) trained to generate AVRO schemas with a specific pattern. |
|
|
|
|
|
## Model Description |
|
|
|
|
|
This model has been fine-tuned using QLoRA (4-bit quantization with LoRA adapters) to learn a specific pattern: adding a `"TRAINED": "YES"` field to generated AVRO schemas.
|
|
|
|
|
- **Model Type**: vLLM-optimized export (a merge/export sketch follows this list)
|
|
- **Base Model**: microsoft/Phi-3-mini-4k-instruct |
|
|
- **Training Method**: QLoRA (Quantized Low-Rank Adaptation) |
|
|
- **Export Date**: 2025-09-15T12:41:18.124704 |
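
For serving with vLLM, the LoRA adapters are presumably merged back into the base weights before export. A minimal sketch of that merge step, assuming a PEFT adapter checkpoint at an illustrative path:

```python
# Hypothetical merge/export step for vLLM serving.
# The adapter path and output directory are illustrative, not the actual paths used.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct", torch_dtype=torch.bfloat16
)
# Attach the trained LoRA adapter and fold it into the base weights
merged = PeftModel.from_pretrained(base, "path/to/lora-adapter").merge_and_unload()

# Save merged weights and tokenizer so vLLM can load a plain checkpoint
merged.save_pretrained("phi3mini4k-vllm-20250915")
AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct").save_pretrained(
    "phi3mini4k-vllm-20250915"
)
```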
|
|
|
|
|
|
|
|
|
|
|
## Intended Use |
|
|
|
|
|
This model is designed for generating AVRO schemas with the learned pattern. It demonstrates successful fine-tuning on a minimal dataset. |
|
|
|
|
|
### Example Output |
|
|
|
|
|
```json
{
  "TRAINED": "YES",
  "type": "record",
  "name": "Entity",
  "fields": [
    {"name": "id", "type": "string"},
    {"name": "value", "type": "double"}
  ]
}
```
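
A minimal sketch of prompting the model for a schema like the one above with `transformers`; the model path and prompt are illustrative, not necessarily the exact prompt format used during training:

```python
# Illustrative inference with transformers; adjust model_id to your local path or Hub repo.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "phi3mini4k-vllm-20250915"  # hypothetical export directory / repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "user", "content": "Generate an AVRO schema for an Entity with id and value fields."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```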
|
|
|
|
|
|
|
|
### Deployment with vLLM |
|
|
|
|
|
```bash
# Using Docker
docker run --gpus all -p 8000:8000 \
  -v $(pwd):/models \
  vllm/vllm-openai:latest \
  --model /models \
  --max-model-len 4096
```

```python
# Using the vLLM Python API
from vllm import LLM, SamplingParams

llm = LLM(model="phi3mini4k-vllm-20250915")
sampling_params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["What is AVRO?"], sampling_params)
print(outputs[0].outputs[0].text)
```
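
The Docker command above starts vLLM's OpenAI-compatible server. A minimal sketch of querying it from Python, assuming the `openai` client is installed and the model name matches the `--model` argument (`/models` in the Docker example):

```python
# Query the OpenAI-compatible endpoint exposed by the vLLM server.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
response = client.chat.completions.create(
    model="/models",
    messages=[{"role": "user", "content": "Generate an AVRO schema for an Entity."}],
    max_tokens=256,
    temperature=0.7,
)
print(response.choices[0].message.content)
```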
|
|
|
|
|
|
|
|
## Training Procedure |
|
|
|
|
|
The model was trained using the following components; a minimal configuration sketch follows the list:
|
|
- **Quantization**: 4-bit NF4 quantization via bitsandbytes |
|
|
- **LoRA Adapters**: Low-rank adaptation for efficient fine-tuning |
|
|
- **Flash Attention 2**: For optimized attention computation |
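
```python
# Illustrative QLoRA setup matching the components above.
# The LoRA hyperparameters and target modules are assumptions, not the exact values used here.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",               # 4-bit NF4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    quantization_config=bnb_config,
    attn_implementation="flash_attention_2",  # Flash Attention 2 kernels
    torch_dtype=torch.bfloat16,
)

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["qkv_proj", "o_proj"],    # Phi-3 attention projections (assumed)
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```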
|
|
|
|
|
## Limitations |
|
|
|
|
|
- This is a demonstration model trained on a minimal dataset |
|
|
- The pattern learned is specific to AVRO schema generation |
|
|
- Performance on general tasks may differ from the base model |
|
|
|
|
|
## Citation |
|
|
|
|
|
If you use this model, please cite the original Phi-3 model: |
|
|
|
|
|
```bibtex
@misc{abdin2024phi3,
  title={Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone},
  author={Abdin, Marah and others},
  year={2024},
  eprint={2404.14219},
  archivePrefix={arXiv}
}
```
|
|
|
|
|
## License |
|
|
|
|
|
This model is released under the MIT License, following the base model's licensing terms. |
|
|
|