---
language: en
tags:
- test-model
- reasoning
- unsloth
- fine-tuning
- small-model
---

# QAI-QDERM-1.5B Test Model

- **Model Name:** QAI-QDERM-1.5B
- **Base Model:** unsloth/DeepSeek-R1-Distill-Qwen-1.5B-bnb-4bit
- **Framework:** Transformers, Unsloth, TRL
- **Architecture:** Transformer-based language model
- **Quantization:** 4-bit (using bitsandbytes)
- **Trainable Parameters:** ~50M (via LoRA adapters)

## Intended Use

This model is a **test model** designed for research purposes. It is optimized to:

- **Test Reasoning:** Evaluate chain-of-thought reasoning and step-by-step problem solving.
- **Rapid Prototyping:** Serve as a lightweight platform for experimentation with domain-specific tasks such as dermatology Q&A and medical reasoning.
- **Parameter-Efficient Fine-Tuning:** Demonstrate the effectiveness of LoRA-based fine-tuning on a smaller model.

**Note:** This model is not intended for production use.

## Training Details

- **Datasets:**
  - *Dermatology Question Answer Dataset* (Mreeb/Dermatology-Question-Answer-Dataset-For-Fine-Tuning)
  - *Medical Reasoning SFT Dataset* (FreedomIntelligence/medical-o1-reasoning-SFT)
- **Training Strategy:** A two-stage fine-tuning process was used:
  1. **Stage 1:** Fine-tuning on Dermatology Q&A data.
  2. **Stage 2:** Further fine-tuning on Medical Reasoning data.
- **Fine-Tuning Method:** Parameter-efficient fine-tuning using LoRA via Unsloth, updating approximately 18 million parameters.
- **Hyperparameters:**
  - **Stage 1:** Learning rate ≈ 2e-4, effective batch size of 8 (per-device batch size 2, gradient accumulation steps 4), and a total of 546 training steps.
  - **Stage 2:** Further fine-tuning with a lower learning rate (≈ 3e-5), controlled via `max_steps` (e.g., 1500 steps) for additional refinement.

A rough sketch of this two-stage setup is included at the end of this card.

## Evaluation & Performance

- **Metrics:** Training loss was monitored during fine-tuning, and qualitative assessments were made on reasoning prompts and Q&A tasks.
- **Observations:**
  - The model shows promising chain-of-thought reasoning ability on test prompts.
  - As a small test model, its performance is intended as a baseline for further experimentation and is not expected to match larger production models.

## Limitations

- **Scale:** Due to its small size, the model may struggle with very complex reasoning tasks.
- **Data:** The limited domain-specific fine-tuning data may result in occasional inaccuracies.
- **Intended Use:** This model is for research and testing purposes only.

## Inference Example

Below is an example of how to run inference with this model:

```python
from unsloth import FastLanguageModel

# Load the fine-tuned model and tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    "your_hf_username/unsloth_final_model",  # Replace with your model repo
    max_seq_length=2048,
    load_in_4bit=True,
    device_map="auto",
)

# Enable fast inference mode
FastLanguageModel.for_inference(model)

# Define a prompt
prompt = "Explain the concept of psoriasis and its common symptoms."

# Tokenize and generate a response
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=150, use_cache=True)

# Decode and print the result
result = tokenizer.decode(outputs[0], skip_special_tokens=True)
print("Generated Output:", result)
```
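
A few notes on the example: `FastLanguageModel.for_inference(model)` switches Unsloth into its faster generation path, and `load_in_4bit=True` keeps memory usage low enough for a single consumer GPU. Because the base model is a DeepSeek-R1 distillation, the generated text may include step-by-step reasoning before the final answer, so consider raising `max_new_tokens` if responses appear cut off.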
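
## Fine-Tuning Sketch

The following is a minimal, illustrative sketch of the two-stage LoRA fine-tuning described in **Training Details**, not the exact recipe used for this model. The LoRA rank and alpha, target modules, prompt template, dataset column names (`question`, `answer`), and output directories are assumptions; depending on your TRL version, `dataset_text_field` and `max_seq_length` may need to move into an `SFTConfig` instead of being passed to `SFTTrainer` directly.

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

# Load the 4-bit base model
model, tokenizer = FastLanguageModel.from_pretrained(
    "unsloth/DeepSeek-R1-Distill-Qwen-1.5B-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters (rank/alpha/target modules are illustrative values)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Stage 1: dermatology Q&A (column names and prompt template are assumed;
# adjust to the dataset's actual schema)
derm = load_dataset(
    "Mreeb/Dermatology-Question-Answer-Dataset-For-Fine-Tuning", split="train"
)

def format_example(example):
    return {"text": f"Question: {example['question']}\nAnswer: {example['answer']}"}

derm = derm.map(format_example)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=derm,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,  # effective batch size 8
        learning_rate=2e-4,
        max_steps=546,                  # ≈546 steps, as reported on this card
        output_dir="outputs_stage1",
    ),
)
trainer.train()

# Stage 2 would repeat the same pattern on
# FreedomIntelligence/medical-o1-reasoning-SFT with learning_rate=3e-5
# and max_steps=1500 for additional refinement.
```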