QLoRA Adapter for Phi-3-mini: A Technical SQL Assistant

This repository contains a QLoRA (Quantized Low-Rank Adaptation) adapter for the microsoft/Phi-3-mini-4k-instruct model. The adapter was fine-tuned with parameter-efficient fine-tuning to act as a technical assistant that generates SQL queries from natural language questions, based on a provided database schema.

This project was developed as an educational exercise to demonstrate a complete, end-to-end fine-tuning pipeline, from data preparation to model evaluation.

Model Description

  • Base Model: microsoft/Phi-3-mini-4k-instruct
  • Fine-tuning Method: QLoRA (4-bit NormalFloat quantization)
  • Task: Text-to-SQL Generation

The model is designed to receive a database schema (as a CREATE TABLE statement) and a user's question in natural language. It then generates the appropriate SQL query to answer that question.

How to Use

First, ensure you have the necessary libraries installed:

pip install -U transformers peft accelerate bitsandbytes torch

Next, use the following Python code. The AutoPeftModelForCausalLM class will automatically load the base model (Phi-3-mini) and apply this fine-tuned adapter on top.

from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer
import torch

# Your adapter's ID on the Hugging Face Hub
adapter_id = "manuelaschrittwieser/phi-3-mini-sql-assistant-adapter"

print("Loading model and adapter...")
# Load the fine-tuned model and adapter in one step
model = AutoPeftModelForCausalLM.from_pretrained(
    adapter_id,
    device_map="auto",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
# The tokenizer is also loaded from the adapter's repository
tokenizer = AutoTokenizer.from_pretrained(adapter_id, trust_remote_code=True)

print("Model loaded successfully!")

# --- Define your schema and question ---
context = "CREATE TABLE employees (name VARCHAR, department VARCHAR, salary INTEGER)"
question = "What are the names of employees in the 'Engineering' department with a salary over 80000?"

# --- Format the prompt using the Phi-3 chat template ---
prompt = f"""<|user|>
Given the database schema:
{context}

Generate the SQL query for the following request:
{question}<|end|>
<|assistant|>
"""

# --- Generate the response ---
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
outputs = model.generate(input_ids=input_ids, max_new_tokens=100, do_sample=False)
# --- Extract the generated SQL ---
# Decode only the tokens generated after the prompt. This is more robust than
# splitting the full decoded text on the <|assistant|> marker, which may be
# removed when skip_special_tokens=True.
generated_sql = tokenizer.decode(
    outputs[0][input_ids.shape[-1]:], skip_special_tokens=True
).strip()


print(f"\nGenerated SQL: {generated_sql}")
# Expected output: SELECT name FROM employees WHERE department = 'Engineering' AND salary > 80000

Training Procedure

Dataset

The model was fine-tuned on a 10,000-sample subset of the b-mc2/sql-create-context dataset. This dataset provides pairs of database schemas (context), natural language questions (question), and their corresponding SQL queries (answer). The subset was split into 9,000 examples for training and 1,000 for testing.
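
For reference, the subset and split described above can be reproduced roughly as follows with the datasets library. This is a minimal sketch: the shuffle seed and the prompt-formatting step are assumptions, not the exact preprocessing script used for this adapter.

from datasets import load_dataset

# Load the dataset and take a 10,000-sample subset
# (seed and selection order are assumptions; the original script is not included here)
dataset = load_dataset("b-mc2/sql-create-context", split="train")
subset = dataset.shuffle(seed=42).select(range(10_000))

# Split into 9,000 training and 1,000 test examples
splits = subset.train_test_split(test_size=1_000, seed=42)
train_ds, test_ds = splits["train"], splits["test"]

def format_example(example):
    # Mirror the Phi-3 chat template used at inference time
    prompt = (
        "<|user|>\nGiven the database schema:\n"
        f"{example['context']}\n\n"
        "Generate the SQL query for the following request:\n"
        f"{example['question']}<|end|>\n<|assistant|>\n"
    )
    return {"text": prompt + example["answer"] + "<|end|>"}

train_ds = train_ds.map(format_example)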

Fine-tuning Configuration (QLoRA)

  • Quantization: 4-bit NormalFloat (NF4) with bfloat16 compute dtype.
  • LoRA Rank (r): 8
  • LoRA Alpha (lora_alpha): 16
  • Target Modules: ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"] (see the configuration sketch below)
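
These settings correspond roughly to the following quantization and LoRA configuration. This is a minimal sketch assuming the standard bitsandbytes/peft APIs; values not listed above (such as lora_dropout) are illustrative assumptions.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit NormalFloat (NF4) quantization with bfloat16 compute dtype
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)

# LoRA adapter matching the rank, alpha, and target modules listed above
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_dropout=0.05,  # assumed value, not documented above
    bias="none",
    task_type="CAUSAL_LM",
)

model = get_peft_model(base_model, lora_config)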

Training Hyperparameters

  • Learning Rate: 2e-4
  • Epochs: 1
  • Batch Size: 2 (with 4 gradient accumulation steps for an effective batch size of 8)
  • Optimizer: Paged AdamW (32-bit)
  • LR Scheduler: Cosine (see the TrainingArguments sketch below)
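
Expressed as transformers TrainingArguments, these hyperparameters map roughly onto the configuration below. The output_dir, bf16, and logging settings are illustrative assumptions; the actual training script is not part of this repository.

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./phi-3-mini-sql-assistant-adapter",  # illustrative path
    learning_rate=2e-4,
    num_train_epochs=1,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,   # effective batch size of 8
    optim="paged_adamw_32bit",       # paged 32-bit AdamW
    lr_scheduler_type="cosine",
    bf16=True,                       # assumed, matching the bfloat16 compute dtype
    logging_steps=25,                # assumed logging interval
)

These arguments can then be passed to a standard Trainer (or to trl's SFTTrainer) together with the quantized base model, the LoRA configuration, and the formatted dataset from the sketches above.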

Evaluation

The model's primary goal was to generate syntactically correct and logically sound SQL queries. A qualitative evaluation on the held-out test set showed that the model successfully learned this task, often producing the exact ground truth SQL.

However, after only one epoch of training, the model sometimes fails to perfectly adhere to the chat template format (e.g., omitting the <|assistant|> token), even when the generated SQL itself is correct. Further training would likely improve this formatting consistency.
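
One simple way to reproduce such a check is to compare normalized generations against the ground-truth queries on a sample of the held-out split. The sketch below assumes a hypothetical generate_sql(context, question) helper that wraps the inference code from the usage section, plus the test_ds split from the dataset sketch above.

def normalize(sql: str) -> str:
    # Collapse whitespace, drop a trailing semicolon, and lowercase for a forgiving comparison
    return " ".join(sql.strip().rstrip(";").split()).lower()

sample = test_ds.select(range(50))  # small sample for a quick qualitative check
exact_matches = sum(
    normalize(generate_sql(ex["context"], ex["question"])) == normalize(ex["answer"])
    for ex in sample
)
print(f"Exact-match rate on sample: {exact_matches / len(sample):.2%}")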

Limitations and Bias

  • This model is a proof-of-concept and was trained on a limited dataset for a short duration. It may not generalize well to complex, unseen database schemas or highly intricate SQL queries.
  • The model is not designed to be a chatbot and has no knowledge outside of the Text-to-SQL domain it was trained on.
  • The training data may contain biases, and the model's outputs should be reviewed before execution in a production environment, especially to prevent potential SQL injection vulnerabilities.