MEDFIT-LLM-3B: Fine-tuned Llama-3.2-3B for Medical QA
MEDFIT-LLM-3B is a specialized language model fine-tuned from Meta's Llama-3.2-3B-Instruct for healthcare and medical question-answering applications. This model demonstrates significant improvements in direct answer capabilities and medical domain understanding through domain-focused fine-tuning.
Model Details
Model Description
MEDFIT-LLM-3B is a 3 billion parameter language model specifically optimized for healthcare chatbot applications. The model was fine-tuned using LoRA (Low-Rank Adaptation) techniques on a carefully curated dataset of healthcare-related questions and answers, resulting in enhanced performance for medical information dissemination and patient education.
- Developed by: Aditya Karnam Gururaj Rao, Arjun Jaggi, Sonam Naidu
- Model type: Causal Language Model (Fine-tuned)
- Language(s): English
- License: Llama 3.2 Community License
- Finetuned from model: meta-llama/Llama-3.2-3B-Instruct
- Fine-tuning method: LoRA (Low-Rank Adaptation)
- Training framework: MLX
Model Sources
- Repository: https://huggingface.co/adityak74/medfit-llm-3B
- Paper: MEDFIT-LLM: Medical Enhancements through Domain-Focused Fine Tuning of Small Language Models
- Code: https://github.com/adityak74/medfit-llm
Performance Highlights
Based on comprehensive evaluation against the base Llama-3.2-3B-Instruct model:
- Direct Answer Improvement: 30 percentage point increase (from 6.0% to 36.0%)
- Response Structure: 18% increase in numbered list usage for better organization
- Overall Improvement Score: 108.2 (highest among evaluated models)
- Response Length: Slight increase (+2.84%) with more comprehensive answers
Uses
Direct Use
MEDFIT-LLM-3B is designed for healthcare chatbot applications where accurate, well-structured medical information delivery is crucial. The model excels at:
- Medical Question Answering: Providing direct, accurate responses to healthcare queries
- Patient Education: Delivering structured, easy-to-understand medical information
- Healthcare Information Dissemination: Supporting healthcare providers with reliable AI assistance
- Medical Chatbot Applications: Serving as the backbone for healthcare conversational agents
Downstream Use
The model can be integrated into:
- Healthcare mobile applications
- Medical information systems
- Patient support platforms
- Telemedicine chatbots
- Medical education tools
Out-of-Scope Use
Important: This model is NOT intended for:
- Medical diagnosis or treatment recommendations
- Emergency medical situations
- Replacement of professional medical advice
- Clinical decision-making without human oversight
- Prescription or medication recommendations
Training Details
Training Data
The dataset is available at https://huggingface.co/datasets/mlx-community/medfit-dataset
The model was trained on a carefully curated dataset comprising:
- Total samples: 6,444 unique healthcare-related question-answer pairs
- Training set: 5,155 samples
- Validation set: 644 samples
- Test set: 645 samples
The dataset was created using:
- Synthetic data generation: 10,000 initial samples generated using Phi-4
- Domain-specific curation: Healthcare-focused questions derived from existing research
- Deduplication: Filtered to remove duplicates, resulting in 6,444 unique samples
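The deduplication step above can be sketched as a simple exact-match filter over question-answer pairs. This is an illustrative reconstruction, not the paper's actual pipeline; the normalization choices (lowercasing, whitespace stripping) are assumptions.

```python
# Hypothetical sketch: exact-match deduplication of synthetic QA pairs,
# assuming each sample is a dict with "question" and "answer" keys.
def deduplicate(samples):
    seen = set()
    unique = []
    for s in samples:
        # Normalize lightly so trivially different copies collapse together
        key = (s["question"].strip().lower(), s["answer"].strip().lower())
        if key not in seen:
            seen.add(key)
            unique.append(s)
    return unique

raw = [
    {"question": "What is hypertension?", "answer": "High blood pressure."},
    {"question": "what is hypertension? ", "answer": "High blood pressure."},
    {"question": "What causes anemia?", "answer": "Often iron deficiency."},
]
print(len(deduplicate(raw)))  # 2
```

A real pipeline might additionally use fuzzy or embedding-based matching to catch near-duplicates that exact matching misses.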
Training Procedure
Fine-tuning Method
- Technique: LoRA (Low-Rank Adaptation)
- Framework: MLX
- Base model: meta-llama/Llama-3.2-3B-Instruct
- Focus: Healthcare domain specialization
Training Hyperparameters
- Fine-tuning approach: Domain-focused LoRA adaptation
- Dataset split: 80% training, 10% validation, 10% testing
- Training regime: Optimized for healthcare question-answering performance
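The reported split counts (5,155 / 644 / 645) follow from applying the 80/10/10 fractions to the 6,444 unique samples, with the test set taking the remainder after rounding down. A minimal sketch of that arithmetic:

```python
# Sketch of the 80/10/10 split arithmetic; rounding behavior is an
# assumption, chosen so the counts match those reported in the card.
def split_sizes(n, train_frac=0.8, val_frac=0.1):
    n_train = int(n * train_frac)   # floor of 80%
    n_val = int(n * val_frac)       # floor of 10%
    n_test = n - n_train - n_val    # remainder goes to test
    return n_train, n_val, n_test

print(split_sizes(6444))  # (5155, 644, 645)
```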
Evaluation
Testing Data & Metrics
The model was evaluated on:
- 50 healthcare-specific validation questions
- Comparative analysis against base Llama-3.2-3B-Instruct
- Multi-dimensional assessment including direct answer capability, response structure, and generation efficiency
Key Results
Direct Answer Performance:
- Base model: 6.0% direct answer rate
- Fine-tuned model: 36.0% direct answer rate
- Improvement: +30.0 percentage points
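One plausible way to operationalize the direct answer rate is to count responses that open with substantive content rather than a hedging preamble. The preamble phrases below are illustrative assumptions, not the paper's actual heuristic:

```python
# Hedged sketch: score the fraction of responses that begin with a direct
# answer. The preamble list is hypothetical and for illustration only.
PREAMBLES = ("as an ai", "i'm not a doctor", "it depends", "great question")

def direct_answer_rate(responses):
    def is_direct(text):
        head = text.strip().lower()
        return not any(head.startswith(p) for p in PREAMBLES)
    return sum(is_direct(r) for r in responses) / len(responses)

responses = [
    "Common symptoms of diabetes include increased thirst and fatigue.",
    "As an AI, I cannot give medical advice, but symptoms may include...",
]
print(direct_answer_rate(responses))  # 0.5
```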
Response Quality:
- Enhanced structure with increased use of numbered lists (+18%)
- Improved organization and systematic presentation
- Better alignment with healthcare communication standards
Generation Efficiency:
- Slight increase in generation time (+1.6%)
- Trade-off between response quality and speed
- Overall positive impact on response comprehensiveness
Bias, Risks, and Limitations
Limitations
- Not a substitute for professional medical advice
- May generate plausible-sounding but incorrect medical information
- Limited to English language medical contexts
- Training data may not cover all medical specialties equally
- Performance may vary across different healthcare subdomains
Recommendations
- Always verify medical information with qualified healthcare professionals
- Use as a supplementary tool rather than primary medical resource
- Implement human oversight in all healthcare applications
- Update the model regularly to maintain medical accuracy as knowledge evolves
- Consider integration with retrieval-augmented generation (RAG) for enhanced factual accuracy
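The RAG recommendation above can be sketched as prepending retrieved passages to the user's question so answers stay grounded in vetted medical text. This is a minimal illustration; any real retriever (BM25, a vector store, etc.) would supply the passages.

```python
# Hypothetical sketch of a RAG-style prompt builder: retrieved passages
# are numbered and placed before the question as grounding context.
def build_rag_prompt(question, passages):
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

prompt = build_rag_prompt(
    "What are common symptoms of diabetes?",
    ["Diabetes symptoms include polyuria, polydipsia, and fatigue."],
)
print(prompt)
```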
How to Get Started with the Model
from transformers import AutoTokenizer, AutoModelForCausalLM
# Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("adityak74/medfit-llm-3B")
model = AutoModelForCausalLM.from_pretrained("adityak74/medfit-llm-3B")
# Example usage — the model is instruction-tuned, so apply the chat template
messages = [{"role": "user", "content": "What are the common symptoms of diabetes?"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
# Use max_new_tokens (max_length counts prompt tokens too) and enable
# sampling so the temperature setting actually takes effect
outputs = model.generate(inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(response)
Environmental Impact
The fine-tuning process utilized efficient LoRA techniques to minimize computational requirements while maximizing performance improvements. This approach reduces the environmental impact compared to full model training while achieving significant domain-specific enhancements.
Citation
BibTeX:
@inproceedings{rao2025medfit,
title={MEDFIT-LLM: Medical Enhancements through Domain-Focused Fine Tuning of Small Language Models},
author={Rao, Aditya Karnam Gururaj and Jaggi, Arjun and Naidu, Sonam},
booktitle={2025 2nd International Conference on Research Methodologies in Knowledge Management, Artificial Intelligence and Telecommunication Engineering (RMKMATE)},
year={2025},
organization={IEEE}
}
APA: Rao, A. K. G., Jaggi, A., & Naidu, S. (2025). MEDFIT-LLM: Medical Enhancements through Domain-Focused Fine Tuning of Small Language Models. In 2025 2nd International Conference on Research Methodologies in Knowledge Management, Artificial Intelligence and Telecommunication Engineering (RMKMATE). IEEE.
Glossary
- QA: Question Answering
- EHR: Electronic Health Record
- LoRA: Low-Rank Adaptation - an efficient fine-tuning technique
- MLX: Apple's machine learning framework for Apple silicon, used here for fine-tuning
- Direct Answer Rate: Percentage of responses that begin with direct answers rather than preambles
Model Card Authors
- Aditya Karnam Gururaj Rao (Zefr Inc, LA, USA)
- Arjun Jaggi (HCLTech, LA, USA)
- Sonam Naidu (LexisNexis, USA)
Model Card Contact
- Primary Contact: https://huggingface.co/adityak74
- Email: [email protected]
- GitHub: https://github.com/adityak74/medfit-llm
Disclaimer
This model is designed for educational and informational purposes in healthcare contexts. It should not be used as a substitute for professional medical advice, diagnosis, or treatment. Always seek the advice of qualified healthcare providers with questions regarding medical conditions or treatments.