---
language: en
tags:
- test-model
- reasoning
- unsloth
- fine-tuning
- small-model
---

# QAI-QDERM-1.5B Test Model

- **Model Name:** QAI-QDERM-1.5B
- **Base Model:** unsloth/DeepSeek-R1-Distill-Qwen-1.5B-bnb-4bit
- **Framework:** Transformers, Unsloth, TRL
- **Architecture:** Transformer-based language model
- **Quantization:** 4-bit (using bitsandbytes)
- **Trainable Parameters:** ~50M (via LoRA adapters)

## Intended Use

This model is a **test model** designed for research purposes. It is optimized to:

- **Test Reasoning:** Evaluate chain-of-thought reasoning and step-by-step problem solving.
- **Rapid Prototyping:** Serve as a lightweight platform for experimentation with domain-specific tasks such as dermatology Q&A and medical reasoning.
- **Parameter-Efficient Fine-Tuning:** Demonstrate the effectiveness of LoRA-based fine-tuning on a smaller model.

**Note:** This model is not intended for production use.

## Training Details

- **Datasets:**
  - *Dermatology Question Answer Dataset* (Mreeb/Dermatology-Question-Answer-Dataset-For-Fine-Tuning)
  - *Medical Reasoning SFT Dataset* (FreedomIntelligence/medical-o1-reasoning-SFT)
- **Training Strategy:** A two-stage fine-tuning process was used:
  1. **Stage 1:** Fine-tuning on Dermatology Q&A data.
  2. **Stage 2:** Further fine-tuning on Medical Reasoning data.
- **Fine-Tuning Method:** Parameter-efficient fine-tuning using LoRA via Unsloth, updating approximately 18 million parameters.
- **Hyperparameters:**
  - **Stage 1:** Learning rate ≈ 2e-4, effective batch size of 8 (per-device batch size 2, gradient accumulation steps 4), and a total of 546 training steps.
  - **Stage 2:** Further fine-tuning with a lower learning rate (≈ 3e-5), controlled via `max_steps` (e.g., 1500 steps) for additional refinement.

A rough sketch of this two-stage setup is included at the end of this card.

## Evaluation & Performance

- **Metrics:** Training loss was monitored during fine-tuning, and qualitative assessments were made on reasoning prompts and Q&A tasks.
- **Observations:**
  - The model shows promising chain-of-thought reasoning ability on test prompts.
  - As a small test model, its performance is intended as a baseline for further experimentation and is not expected to match larger production models.

## Limitations

- **Scale:** Due to its small size, the model may struggle with very complex reasoning tasks.
- **Data:** The limited domain-specific fine-tuning data may result in occasional inaccuracies.
- **Intended Use:** This model is for research and testing purposes only.

## Inference Example

Below is an example of how to run inference with this model:

```python
from unsloth import FastLanguageModel

# Load the fine-tuned model and tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    "your_hf_username/unsloth_final_model",  # Replace with your model repo
    max_seq_length=2048,
    load_in_4bit=True,
    device_map="auto",
)

# Enable fast inference mode
FastLanguageModel.for_inference(model)

# Define a prompt
prompt = "Explain the concept of psoriasis and its common symptoms."

# Tokenize and generate a response
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=150, use_cache=True)

# Decode and print the result
result = tokenizer.decode(outputs[0], skip_special_tokens=True)
print("Generated Output:", result)
```
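
A few notes on the example: `FastLanguageModel.for_inference(model)` switches Unsloth into its faster generation path, and `load_in_4bit=True` keeps memory usage low enough for a single consumer GPU. Because the base model is a DeepSeek-R1 distillation, the generated text may include step-by-step reasoning before the final answer, so consider raising `max_new_tokens` if responses appear cut off.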
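
## Fine-Tuning Sketch

The following is a minimal, illustrative sketch of the two-stage LoRA fine-tuning described in **Training Details**, not the exact recipe used for this model. The LoRA rank and alpha, target modules, prompt template, dataset column names (`question`, `answer`), and output directories are assumptions; depending on your TRL version, `dataset_text_field` and `max_seq_length` may need to move into an `SFTConfig` instead of being passed to `SFTTrainer` directly.

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

# Load the 4-bit base model
model, tokenizer = FastLanguageModel.from_pretrained(
    "unsloth/DeepSeek-R1-Distill-Qwen-1.5B-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters (rank/alpha/target modules are illustrative values)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Stage 1: dermatology Q&A (column names and prompt template are assumed;
# adjust to the dataset's actual schema)
derm = load_dataset(
    "Mreeb/Dermatology-Question-Answer-Dataset-For-Fine-Tuning", split="train"
)

def format_example(example):
    return {"text": f"Question: {example['question']}\nAnswer: {example['answer']}"}

derm = derm.map(format_example)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=derm,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,  # effective batch size 8
        learning_rate=2e-4,
        max_steps=546,                  # ≈546 steps, as reported on this card
        output_dir="outputs_stage1",
    ),
)
trainer.train()

# Stage 2 would repeat the same pattern on
# FreedomIntelligence/medical-o1-reasoning-SFT with learning_rate=3e-5
# and max_steps=1500 for additional refinement.
```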