# Devil's Advocate Multi-Agent Medical Analysis System
A demonstration web application showing how multiple AI agents can reduce diagnostic bias by simulating a clinical review process, illustrating how multi-agent collaboration can counter cognitive biases in medical decision-making.
## Project Overview
This demo showcases a three-agent system designed to simulate and overcome common diagnostic biases:
- Agent 1 (Diagnostician): Provides initial diagnosis using all available information (HPI + PMH + Physical Exam)
- Agent 2 (Independent Devil's Advocate): Diagnoses from symptoms and physical exam only, then evaluates overlap with past medical history
- Agent 3 (Synthesizer): Combines both perspectives to create improved final diagnosis with impact analysis
## Key Features
- Multi-Agent Architecture: Three specialized AI agents working in sequence
- Bias Detection: Agent 2 independently evaluates symptoms vs. past medical history
- Overlap Scoring: Qualitative assessment (High/Medium/Low) of current symptoms vs. past conditions
- Interactive Web Interface: Clean, intuitive Gradio-based UI
- Sample Medical Cases: Pre-built cases demonstrating different bias types
- Custom Case Input: Support for user-defined medical scenarios
- Real LLM Outputs: All agents generate concrete diagnostic content using Hugging Face models
## Architecture

```
User Input → Agent 1 (Diagnostician) → Agent 2 (Devil's Advocate) → Agent 3 (Synthesizer) → Final Output
                     ↓                             ↓                             ↓
          Full-case Diagnosis             Symptoms+Exam Dx +           Balanced Synthesis +
          (HPI + PMH + Exam)              Overlap Score                Impact Analysis
```
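The sequence above can be sketched as a minimal orchestration pipeline. The function bodies below are illustrative placeholders, not the LLM-backed agents in `agents.py`:

```python
def agent1_diagnostician(hpi: str, pmh: str, exam: str) -> str:
    """Initial diagnosis from the full case (HPI + PMH + exam).
    Stand-in for the LLM call in agents.py."""
    return f"Initial diagnosis based on: {hpi} / {pmh} / {exam}"

def agent2_devils_advocate(hpi: str, exam: str, pmh: str) -> dict:
    """Diagnose from current symptoms and exam only, then score
    overlap with past medical history."""
    independent_dx = f"Independent diagnosis based on: {hpi} / {exam}"
    overlap = "Low"  # placeholder; the real agent asks the LLM to rate High/Medium/Low
    return {"diagnosis": independent_dx, "overlap_score": overlap}

def agent3_synthesizer(dx1: str, dx2: dict) -> str:
    """Combine both perspectives into a balanced final assessment."""
    return (f"Final synthesis of [{dx1}] and [{dx2['diagnosis']}] "
            f"(overlap: {dx2['overlap_score']})")

def run_pipeline(hpi: str, pmh: str, exam: str) -> dict:
    """Run the three agents in sequence, as in the diagram above."""
    dx1 = agent1_diagnostician(hpi, pmh, exam)
    dx2 = agent2_devils_advocate(hpi, exam, pmh)
    final = agent3_synthesizer(dx1, dx2)
    return {"agent1": dx1, "agent2": dx2, "final": final}
```

The key design point is that Agent 2 never sees the past medical history when forming its diagnosis, which is what makes its perspective independent.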
## Prerequisites
- Python 3.8 or higher
- 4GB+ RAM (for model loading)
- Internet connection (for initial model download)
## Installation

1. Clone the repository:

   ```bash
   git clone https://github.com/nglebm19/debias-llm.git
   cd debias-llm
   ```

2. Create a virtual environment (recommended):

   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```

3. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```
## Usage

### Local Development

1. Run the application:

   ```bash
   python app.py
   ```

2. Open your browser and navigate to `http://localhost:7860`
3. Select a sample case from the dropdown or input your own medical case
4. Click "Run Analysis" to see the three-agent process in action
5. Review the results to see how each agent contributes to the final diagnosis
### Sample Cases
The system includes four pre-built medical cases demonstrating different bias types:
- Case 1: Resolved Appendicitis with New Symptoms (Anchoring bias)
- Case 2: Previous Heart Condition with Current Respiratory Issues (Confirmation bias)
- Case 3: Resolved Infection with Persistent Symptoms (Availability bias)
- Case 4: Chronic Condition with Acute Exacerbation (Anchoring bias)
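For illustration, a sample case could be held in a plain dictionary; the field names below are hypothetical and may not match the actual schema in `app.py`:

```python
# Hypothetical structure for a sample case; the real schema in app.py may differ.
SAMPLE_CASE = {
    "title": "Resolved Appendicitis with New Symptoms",
    "bias_type": "Anchoring bias",
    "hpi": "32-year-old with 2 days of epigastric pain and nausea.",        # History of Present Illness
    "pmh": "Appendectomy 6 months ago for acute appendicitis.",             # Past Medical History
    "exam": "Epigastric tenderness, no rebound, surgical scar well healed.",
}

def format_case(case: dict) -> str:
    """Combine the three sections into the text given to Agent 1."""
    return (
        f"HPI: {case['hpi']}\n"
        f"PMH: {case['pmh']}\n"
        f"Physical Exam: {case['exam']}"
    )
```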
## Configuration

### Model Selection

The system uses `microsoft/DialoGPT-medium` by default. You can modify the model in `agents.py`:

```python
self.model_name = "microsoft/DialoGPT-medium"  # Change this line
```
### Alternative Models

For faster inference, consider a smaller model:

- `microsoft/DialoGPT-small` (117M parameters)
- `gpt2` (124M parameters)

Note that `distilbert-base-uncased` (66M parameters) is an encoder-only model and is not compatible with the `text-generation` pipeline.
### Performance Tuning

Adjust generation parameters in `agents.py`:

```python
self.generator = pipeline(
    "text-generation",
    model=self.model,
    tokenizer=self.tokenizer,
    max_new_tokens=200,   # Adjust for longer/shorter outputs
    do_sample=True,
    temperature=0.7,      # Lower = more focused, higher = more creative
    pad_token_id=self.tokenizer.eos_token_id,
)
```
## Deployment

### Hugging Face Spaces
- Create a new Space on Hugging Face
- Upload your files to the Space
- Set the Space SDK to Gradio
- Configure the Space with appropriate hardware requirements
### Docker Deployment

1. Build the Docker image:

   ```bash
   docker build -t debias-llm .
   ```

2. Run the container:

   ```bash
   docker run -p 7860:7860 debias-llm
   ```
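The repository's actual Dockerfile isn't shown here; a minimal one for a Gradio app might look like this sketch (assuming `app.py` and `requirements.txt` at the repo root):

```dockerfile
FROM python:3.11-slim
WORKDIR /app

# Install dependencies first so this layer is cached across code changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .
EXPOSE 7860

# Bind Gradio to all interfaces so the app is reachable from outside the container
ENV GRADIO_SERVER_NAME=0.0.0.0
CMD ["python", "app.py"]
```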
### Cloud Deployment
The application can be deployed on:
- Google Colab (with modifications)
- AWS SageMaker
- Azure ML
- Google Cloud Run
## Understanding the Output

### Agent 1: Full-Case Diagnosis
- Purpose: Comprehensive initial assessment using all available information
- Input: History of Present Illness + Past Medical History + Physical Examination
- Output: Initial diagnosis with clinical reasoning
### Agent 2: Independent Devil's Advocate
- Purpose: Independent evaluation and overlap assessment
- Phase 1: Diagnosis based only on current symptoms and physical exam
- Phase 2: Overlap score (High/Medium/Low) with past medical history
- Output: Independent diagnosis + overlap score + rationale
### Agent 3: Final Synthesis
- Purpose: Combines both perspectives for balanced final assessment
- Approach: Evidence-based synthesis with impact analysis
- Output: Final diagnosis + differential + impact of past disease + next steps
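In the app, the overlap score comes from an LLM judgment; a crude keyword-overlap heuristic (with arbitrary thresholds) can illustrate the idea:

```python
def overlap_score(symptoms: str, pmh: str) -> str:
    """Rate how much the current presentation overlaps with past history.
    Token-overlap stand-in for the LLM's High/Medium/Low judgment."""
    stop = {"the", "a", "an", "of", "with", "and", "no", "for"}
    s = {w.strip(".,").lower() for w in symptoms.split()} - stop
    p = {w.strip(".,").lower() for w in pmh.split()} - stop
    if not s:
        return "Low"
    ratio = len(s & p) / len(s)  # fraction of symptom terms also in the PMH
    if ratio > 0.5:
        return "High"
    if ratio > 0.2:
        return "Medium"
    return "Low"
```

A high score suggests the current presentation may be colored by the patient's history, which is exactly the situation where Agent 2's independent diagnosis matters most.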
## Bias Types Demonstrated
- Anchoring Bias: Focusing on initial symptoms or first impressions
- Confirmation Bias: Seeking information that confirms initial diagnosis
- Availability Bias: Overweighting recent or memorable conditions
- Overconfidence Bias: Making definitive diagnoses too quickly
## Troubleshooting

### Common Issues

#### Model Loading Errors
- Ensure sufficient RAM (4GB+)
- Check internet connection for model download
- Verify transformers library version
#### Generation Errors
- Check input text length and format
- Verify model compatibility
- Review error logs in console
#### Performance Issues
- Use smaller models for faster inference
- Reduce the `max_new_tokens` parameter
- Consider GPU acceleration if available
### Error Handling
The system includes comprehensive error handling:
- Graceful fallbacks for model failures
- Clear error messages for users
- Logging for debugging
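The fallback behavior can be sketched as a wrapper around the generator call (names here are hypothetical, not the actual helpers in `agents.py`):

```python
import logging

logger = logging.getLogger("debias-llm")

FALLBACK_MESSAGE = "Model output unavailable; please retry or check the logs."

def safe_generate(generator, prompt: str) -> str:
    """Call the text-generation pipeline, degrading to a clear
    user-facing message if the model call fails."""
    try:
        result = generator(prompt)
        return result[0]["generated_text"]
    except Exception:  # broad catch: any model failure degrades gracefully
        logger.exception("Generation failed for prompt of length %d", len(prompt))
        return FALLBACK_MESSAGE
```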
## Learning Resources

### Medical Decision Making
- Cognitive biases in clinical reasoning
- Multi-perspective diagnostic approaches
- Evidence-based medicine principles
### AI and Bias
- Algorithmic bias detection
- Multi-agent systems
- Bias mitigation strategies
### Technical Implementation
- Hugging Face Transformers
- Gradio web applications
- Python multi-agent systems
## Contributing
Contributions are welcome! Areas for improvement:
- Additional Bias Types: Implement more cognitive biases
- Enhanced Models: Integrate larger, more capable models
- UI Improvements: Better visualization of bias patterns
- Case Library: Expand sample medical cases
- Performance: Optimize for faster inference
## License
This project is for educational and demonstration purposes. Please ensure compliance with local regulations regarding medical AI systems.
## Disclaimer
**Important**: This is a demonstration system for educational purposes only. The AI agents simulate medical reasoning but should not be used for actual clinical decision-making. Always consult qualified healthcare professionals for medical advice.
## Support
For questions or issues:
- Check the troubleshooting section
- Review the code comments
- Open an issue in the repository
- Contact the development team
*Built with ❤️ for medical education and AI bias research*