MediPhi Radiology Summary Adapter
This is a LoRA adapter fine-tuned on Microsoft's Phi-3.5-mini-instruct for automated radiology impression generation. The model generates concise clinical impressions from detailed radiology findings across multiple imaging modalities.
Model Description
- Model Type: LoRA Adapter for Causal Language Model
- Base Model: microsoft/Phi-3.5-mini-instruct (3.8B parameters)
- Trainable Parameters: 0.33% (via LoRA)
- Language: English
- Domain: Medical/Clinical Radiology
- Task: Text Generation (Abstractive Summarization)
- License: Apache 2.0
Model Purpose
This model automates the generation of radiological impressions (summary conclusions) from detailed clinical findings. The source corpus comprises 30,135 de-identified radiology reports from 6 clinical institutions, of which 12,559 passed quality filtering and were split for training and evaluation. The reports cover multiple imaging modalities: MR, CT, CR, US, XR, and NM (Nuclear Medicine).
Key Features
- Multi-Modality Support: Trained on MR, CT, CR, US, XR, and NM imaging reports
- Multi-Clinic Adaptation: Handles diverse institutional reporting styles
- Efficient Fine-tuning: Uses 4-bit quantization with LoRA for memory efficiency
- Clinical Focus: Optimized for medically substantial findings (minimum 100 characters)
- Systematic Evaluation: Validated on 1,915 held-out test samples
Training Data
Dataset Statistics
- Total Reports: 30,135 de-identified radiology reports
- After Quality Filtering: 12,559 high-quality reports (41.7% retention)
- Training Split:
  - Train: 8,865 samples (70%)
  - Validation: 1,879 samples (15%)
  - Test: 1,915 samples (15%)
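A 70/15/15 split that preserves the clinic-modality mix can be sketched as below. This is a minimal illustration, not the author's pipeline: the record fields `clinic` and `modality`, the function name `stratified_split`, and the seed are assumptions made for the example.

```python
import random
from collections import defaultdict

def stratified_split(records, ratios=(0.70, 0.15, 0.15), seed=42):
    """Split records into train/val/test so each (clinic, modality)
    group contributes roughly the same proportions to every split."""
    groups = defaultdict(list)
    for rec in records:
        # Field names are hypothetical; adapt to the real schema.
        groups[(rec["clinic"], rec["modality"])].append(rec)

    rng = random.Random(seed)
    train, val, test = [], [], []
    for key in sorted(groups):
        items = groups[key]
        rng.shuffle(items)
        n_train = round(len(items) * ratios[0])
        n_val = round(len(items) * ratios[1])
        train.extend(items[:n_train])
        val.extend(items[n_train:n_train + n_val])
        test.extend(items[n_train + n_val:])  # remainder goes to test
    return train, val, test

# Toy data: 2 clinics x 2 modalities x 20 reports each
records = [
    {"clinic": f"clinic_{c}", "modality": m, "id": i}
    for c in (1, 2) for m in ("MR", "CT") for i in range(20)
]
train, val, test = stratified_split(records)
print(len(train), len(val), len(test))
```

Splitting within each group, rather than over the whole corpus, keeps small groups from landing entirely in one split.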
Modality Distribution
| Modality | Train Count | Percentage |
|---|---|---|
| MR (Magnetic Resonance) | ~9,500 | 59.9% |
| CT (Computed Tomography) | ~1,900 | 17.7% |
| CR (Computed Radiography) | ~1,700 | 7.3% |
| US (Ultrasound) | ~1,700 | 7.1% |
| XR (X-Ray) | ~700 | 2.9% |
| NM (Nuclear Medicine) | ~100 | 1.1% |
Data Preprocessing
The preprocessing pipeline includes:
- Quality Filtering: Minimum 100 characters for findings, 20 for impressions
- Text Cleaning: Electronic signature removal, whitespace normalization
- Length Constraints: Max 3,000 characters (findings), 1,000 (impressions)
- Stratified Splitting: Maintains clinic-modality distribution across splits
- Format: JSONL with chat-based messages structure
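The filtering, cleaning, and formatting steps above might look roughly like the following sketch. Several details are assumptions: the signature regex is hypothetical (the real rule is not documented here), over-length reports are assumed to be dropped rather than truncated, and the helper names `clean`, `passes_filter`, and `to_chat_record` are invented for illustration.

```python
import json
import re

MIN_FINDINGS, MAX_FINDINGS = 100, 3000
MIN_IMPRESSION, MAX_IMPRESSION = 20, 1000

# Hypothetical signature pattern; the actual pipeline's rule is not documented.
SIGNATURE_RE = re.compile(r"electronically signed by.*$", re.IGNORECASE | re.DOTALL)

def clean(text):
    """Remove a trailing electronic signature and normalize whitespace."""
    text = SIGNATURE_RE.sub("", text)
    return re.sub(r"\s+", " ", text).strip()

def passes_filter(findings, impression):
    """Apply the character-length bounds (assumed to drop, not truncate)."""
    return (MIN_FINDINGS <= len(findings) <= MAX_FINDINGS
            and MIN_IMPRESSION <= len(impression) <= MAX_IMPRESSION)

def to_chat_record(clinic, modality, findings, impression, system_prompt):
    """Build one JSONL record in a chat-style messages structure."""
    user = f"[CLINIC: {clinic}] [MODALITY: {modality}] FINDINGS: {findings}\n\nIMPRESSION:"
    return {"messages": [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user},
        {"role": "assistant", "content": impression},
    ]}

raw_findings = "No acute   intracranial abnormality. " * 4 + "\nElectronically signed by Dr. X"
findings = clean(raw_findings)
impression = clean("No acute intracranial abnormality.")
if passes_filter(findings, impression):
    record = to_chat_record("clinic_1", "MR", findings, impression,
                            "You are an expert radiologist assistant.")
    print(json.dumps(record)[:80])
```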
Training Details
Training Configuration
- Framework: PyTorch with Hugging Face Transformers
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- Quantization: 4-bit NF4 with double quantization
- Compute: Single RTX 4090 GPU (24GB VRAM)
- Training Duration: ~2 hours
- Cost: <$2 on RunPod
LoRA Hyperparameters
{
  "r": 8,
  "lora_alpha": 32,
  "target_modules": ["o_proj", "qkv_proj", "gate_up_proj", "down_proj"],
  "lora_dropout": 0.05,
  "bias": "none",
  "task_type": "CAUSAL_LM"
}
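As a sanity check on the 0.33% trainable-parameter figure, the adapter size can be estimated from the rank and the shapes of the targeted layers. The sketch below assumes the commonly published Phi-3.5-mini dimensions (hidden size 3072, intermediate size 8192, 32 layers, fused `qkv_proj` of width 3 x hidden) and an approximate base size of 3.82B parameters; verify these against the model's `config.json` before relying on them.

```python
# Assumed Phi-3.5-mini dimensions; check config.json for the real values.
HIDDEN = 3072
INTERMEDIATE = 8192
NUM_LAYERS = 32
RANK = 8  # "r" in the LoRA config

# (in_features, out_features) for each targeted linear layer
TARGETS = {
    "qkv_proj": (HIDDEN, 3 * HIDDEN),            # fused Q, K, V projection
    "o_proj": (HIDDEN, HIDDEN),
    "gate_up_proj": (HIDDEN, 2 * INTERMEDIATE),  # fused gate and up projection
    "down_proj": (INTERMEDIATE, HIDDEN),
}

def lora_params(in_features, out_features, r):
    """LoRA adds A (r x in) and B (out x r), with no bias terms."""
    return r * (in_features + out_features)

per_layer = sum(lora_params(i, o, RANK) for i, o in TARGETS.values())
total = per_layer * NUM_LAYERS

BASE_PARAMS = 3.82e9  # approximate base model size (assumption)
trainable_pct = 100 * total / (BASE_PARAMS + total)
print(f"LoRA parameters: {total:,} ({trainable_pct:.2f}% of all parameters)")
```

Under these assumptions the estimate lands close to the reported 0.33%. With `lora_alpha` of 32 and `r` of 8, the adapter update is scaled by 32 / 8 = 4.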
Training Hyperparameters
{
  "num_train_epochs": 1,
  "per_device_train_batch_size": 2,
  "gradient_accumulation_steps": 16,  # effective batch size = 32
  "learning_rate": 2e-4,
  "weight_decay": 0.001,
  "warmup_ratio": 0.03,
  "lr_scheduler_type": "cosine",
  "max_seq_length": 1024,
  "optim": "adamw_torch",
  "gradient_checkpointing": True,
  "max_grad_norm": 0.3
}
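These settings imply a fairly short schedule. The sketch below works out the approximate number of optimizer steps and the shape of a linear-warmup plus cosine-decay learning rate, assuming a single GPU and 8,865 training samples. The exact step count depends on how the trainer handles the final partial accumulation window, so treat the numbers as estimates; `lr_at` is a hypothetical helper, not a library function.

```python
import math

train_samples = 8865
per_device_batch = 2
grad_accum = 16
epochs = 1
warmup_ratio = 0.03
peak_lr = 2e-4

effective_batch = per_device_batch * grad_accum             # 2 * 16
steps_per_epoch = math.ceil(train_samples / effective_batch)
total_steps = steps_per_epoch * epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)

def lr_at(step):
    """Linear warmup to peak_lr, then cosine decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

print(f"effective batch size: {effective_batch}")
print(f"optimizer steps: ~{total_steps}, warmup steps: ~{warmup_steps}")
for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(f"step {step:>3}: lr = {lr_at(step):.2e}")
```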
Performance
Overall Metrics
| Metric | Base Model | Fine-Tuned | Improvement |
|---|---|---|---|
| ROUGE-1 | 0.3465 | 0.4146 | +19.6% |
| ROUGE-2 | 0.1800 | 0.2818 | +56.6% |
| ROUGE-L | 0.2727 | 0.3720 | +36.4% |
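For readers unfamiliar with the metric, ROUGE-N measures n-gram overlap between a generated impression and the reference impression. The following is a simplified sketch (lowercasing and whitespace tokenization, no stemming); it is not the evaluation code behind the table, and which ROUGE implementation produced those numbers is not stated here.

```python
from collections import Counter

def rouge_n_f1(reference, candidate, n=1):
    """F1 of clipped n-gram overlap between reference and candidate."""
    def ngrams(text):
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    ref, cand = ngrams(reference), ngrams(candidate)
    overlap = sum((ref & cand).values())  # counts clipped to the smaller side
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

reference = "no acute intracranial abnormality"
candidate = "no acute abnormality identified"
print(f"ROUGE-1 F1: {rouge_n_f1(reference, candidate, n=1):.3f}")
print(f"ROUGE-2 F1: {rouge_n_f1(reference, candidate, n=2):.3f}")
```

ROUGE-2 is stricter because it rewards matching word pairs, which is one reason its absolute values in the table are lower than ROUGE-1.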
Performance by Modality
| Modality | Base ROUGE-1 | Fine-Tuned ROUGE-1 | Improvement |
|---|---|---|---|
| MR | 0.4642 | 0.6274 | +35.1% |
| CR | 0.3283 | 0.3970 | +20.9% |
| XR | 0.2859 | 0.3812 | +33.3% |
| CT | 0.2836 | 0.2978 | +5.0% |
| US | 0.3073 | 0.3394 | +10.4% |
| NM | 0.3440 | 0.2872 | -16.6% |
Key Insights:
- MR, XR, and CR show the strongest relative ROUGE-1 improvements (+35.1%, +33.3%, and +20.9%), while CT gains only 5.0% and NM regresses
- MR imaging achieves a 35.1% improvement, the largest performance gain
- 85% of clinical cases fall in high-performing modalities
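The Improvement column is a relative change over the base model's score. A small sketch of that arithmetic, using the ROUGE-1 values from the per-modality table, is shown below. Because the table reports scores rounded to four decimals, recomputed percentages may differ from the listed ones in the last digit.

```python
# (base ROUGE-1, fine-tuned ROUGE-1) copied from the per-modality table
scores = {
    "MR": (0.4642, 0.6274),
    "CR": (0.3283, 0.3970),
    "XR": (0.2859, 0.3812),
    "CT": (0.2836, 0.2978),
    "US": (0.3073, 0.3394),
    "NM": (0.3440, 0.2872),
}

def relative_improvement(base, tuned):
    """Percentage change of the fine-tuned score relative to the base score."""
    return 100.0 * (tuned - base) / base

for modality, (base, tuned) in scores.items():
    print(f"{modality}: {relative_improvement(base, tuned):+.1f}%")
```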
Usage
Installation
pip install torch transformers peft bitsandbytes accelerate
Basic Usage
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Load model and tokenizer
model = AutoPeftModelForCausalLM.from_pretrained(
    "sabber/medphi-radiology-summary-adapter",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("sabber/medphi-radiology-summary-adapter")

# Prepare input
findings = """
[CLINIC: clinic_1] [MODALITY: MR] FINDINGS: The brain parenchyma demonstrates
normal signal intensity without evidence of acute infarction, mass effect, or
midline shift. The ventricular system and sulci are normal in size and configuration
for patient age. No abnormal enhancement is identified following contrast administration.
"""
messages = [
    {"role": "system", "content": """You are an expert radiologist assistant specializing in generating accurate and concise medical impressions from radiology findings.
Your task is to:
1. Analyze the findings: Carefully review all clinical findings
2. Generate focused impressions: Create clear, prioritized conclusions
3. Maintain clinical accuracy: Ensure significant findings are appropriately characterized
4. Use appropriate medical terminology: Follow standard radiological conventions
5. Adapt communication style: Match institutional reporting style"""},
    {"role": "user", "content": findings + "\n\nIMPRESSION:"}
]

# Generate impression
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,  # append the assistant turn marker so the model starts its reply
    return_tensors="pt"
).to(model.device)
outputs = model.generate(
    inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id
)
# Decode only the newly generated tokens, not the echoed prompt
response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(response)
Pipeline Usage
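from transformers import pipeline

# Create text generation pipeline (reuses `model`, `tokenizer`, and `messages` from Basic Usage)
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    return_full_text=False  # return only the generated impression, not the prompt
)

# Generate impression; passing chat messages lets the pipeline apply the chat template
findings_text = "[CLINIC: clinic_1] [MODALITY: CT] FINDINGS: ..."
chat = [
    messages[0],  # system prompt defined in Basic Usage
    {"role": "user", "content": findings_text + "\n\nIMPRESSION:"}
]
result = pipe(chat)
print(result[0]['generated_text'])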
Merging Adapter with Base Model
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Load and merge
model = AutoPeftModelForCausalLM.from_pretrained(
    "sabber/medphi-radiology-summary-adapter",
    torch_dtype="auto",
    device_map="auto"
)
merged_model = model.merge_and_unload()

# Save merged model together with its tokenizer
tokenizer = AutoTokenizer.from_pretrained("sabber/medphi-radiology-summary-adapter")
merged_model.save_pretrained("medphi-radiology-merged")
tokenizer.save_pretrained("medphi-radiology-merged")
Input Format
The model expects inputs in the following format:
[CLINIC: <clinic_id>] [MODALITY: <modality_code>] FINDINGS: <detailed_findings>
IMPRESSION:
Supported Modalities:
- MR: Magnetic Resonance Imaging
- CT: Computed Tomography
- CR: Computed Radiography
- US: Ultrasound
- XR: X-Ray
- NM: Nuclear Medicine
Clinic IDs: clinic_1 through clinic_6
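A small helper can assemble prompts in this format and reject unsupported codes early. This is an illustrative sketch: `build_prompt` is a hypothetical name, and the blank line before `IMPRESSION:` follows the Basic Usage example rather than a documented specification.

```python
SUPPORTED_MODALITIES = {"MR", "CT", "CR", "US", "XR", "NM"}
SUPPORTED_CLINICS = {f"clinic_{i}" for i in range(1, 7)}  # clinic_1 .. clinic_6

def build_prompt(clinic_id, modality, findings):
    """Return the user-message text in the expected input format."""
    if clinic_id not in SUPPORTED_CLINICS:
        raise ValueError(f"Unknown clinic ID: {clinic_id!r}")
    if modality not in SUPPORTED_MODALITIES:
        raise ValueError(f"Unsupported modality: {modality!r}")
    return (
        f"[CLINIC: {clinic_id}] [MODALITY: {modality}] "
        f"FINDINGS: {findings.strip()}\n\nIMPRESSION:"
    )

prompt = build_prompt("clinic_1", "CT", "Lungs are clear. No pleural effusion.")
print(prompt)
```

The returned string is meant to be used as the `content` of the user message.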
Limitations and Bias
Limitations
- Training Data Scope: Model trained on reports from 6 specific clinical institutions
- Modality Imbalance: Performance varies by modality; fine-tuned ROUGE-1 is highest on MR, CR, and XR and lowest on CT and NM
- Language: English only
- Clinical Validation: Requires human radiologist review before clinical use
- Nuclear Medicine: ROUGE-1 falls below the base model (-16.6%), likely due to limited training samples
Bias Considerations
- Institutional Bias: May reflect reporting styles of the 6 training institutions
- Modality Bias: 60% of training data is MR imaging, which may bias outputs
- Geographic Bias: Training data from specific geographic regions
- Sample Filtering: Quality filtering may introduce bias toward certain finding types
Ethical Considerations
- Not for Clinical Diagnosis: This model is a research tool and should NOT be used for clinical decision-making without expert radiologist review
- Data Privacy: Trained on de-identified data only
- Accountability: Human radiologists must review and validate all generated impressions
- Transparency: Users should be informed when AI-generated content is used
Intended Use
Primary Use Cases
- Research: Studying automated radiology report generation
- Education: Teaching radiology reporting conventions
- Augmentation: Assisting radiologists with draft impression generation
- Analysis: Understanding clinical language patterns in radiology
Out-of-Scope Use
- Autonomous Diagnosis: Not validated for unsupervised clinical use
- Non-Radiology Domains: Not trained for other medical specialties
- Non-English Reports: Only trained on English language reports
- Rare Conditions: May not handle uncommon pathologies well
Citation
If you use this model in your research, please cite:
@misc{medphi-radiology-adapter,
  author = {Sabber Ahamed},
  title = {MediPhi Radiology Summary Adapter: LoRA Fine-tuning for Automated Impression Generation},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/sabber/medphi-radiology-summary-adapter}},
  note = {Fine-tuned on 30,135 de-identified radiology reports across 6 clinical institutions}
}
Model Card Authors
Sabber Ahamed
Model Card Contact
For questions or issues, please open an issue on the model repository.
Acknowledgments
- Base Model: Microsoft Phi-3.5-mini-instruct team
- Framework: Hugging Face Transformers, PEFT, and TRL libraries
- Compute: RunPod for GPU infrastructure
- Data: Contributing clinical institutions (anonymized)
Disclaimer: This model is provided for research and educational purposes only. It is not approved for clinical use. All outputs must be reviewed and validated by qualified healthcare professionals before any clinical application.