debias-llm / README.md
nglebm19's picture
Enhance README with detailed agent roles and installation
0bf5137 unverified
|
raw
history blame
7.79 kB
# πŸ₯ Devil's Advocate Multi-Agent Medical Analysis System
A demonstration web application that shows how multiple AI agents can overcome diagnostic bias by simulating a clinical review process. This system demonstrates the power of multi-agent collaboration in reducing cognitive biases in medical decision-making.
## 🎯 Project Overview
This demo showcases a three-agent system designed to simulate and overcome common diagnostic biases:
1. **Agent 1 (Diagnostician)**: Provides initial diagnosis using all available information (HPI + PMH + Physical Exam)
2. **Agent 2 (Independent Devil's Advocate)**: Diagnoses from symptoms and physical exam only, then evaluates overlap with past medical history
3. **Agent 3 (Synthesizer)**: Combines both perspectives to create improved final diagnosis with impact analysis
## πŸš€ Key Features
- **Multi-Agent Architecture**: Three specialized AI agents working in sequence
- **Bias Detection**: Agent 2 independently evaluates symptoms vs. past medical history
- **Overlap Scoring**: Qualitative assessment (High/Medium/Low) of current symptoms vs. past conditions
- **Interactive Web Interface**: Clean, intuitive Gradio-based UI
- **Sample Medical Cases**: Pre-built cases demonstrating different bias types
- **Custom Case Input**: Support for user-defined medical scenarios
- **Real LLM Outputs**: All agents generate concrete diagnostic content using Hugging Face models
## πŸ—οΈ Architecture
```
User Input β†’ Agent 1 (Diagnostician) β†’ Agent 2 (Devil's Advocate) β†’ Agent 3 (Synthesizer) β†’ Final Output
↓ ↓ ↓
Full-case Diagnosis Symptoms+Exam Dx + Balanced Synthesis +
(HPI + PMH + Exam) Overlap Score Impact Analysis
```
## πŸ“‹ Prerequisites
- Python 3.8 or higher
- 4GB+ RAM (for model loading)
- Internet connection (for initial model download)
## πŸ› οΈ Installation
1. **Clone the repository:**
```bash
git clone https://github.com/nglebm19/debias-llm.git
cd debias-llm
```
2. **Create a virtual environment (recommended):**
```bash
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
```
3. **Install dependencies:**
```bash
pip install -r requirements.txt
```
## πŸš€ Usage
### Local Development
1. **Run the application:**
```bash
python app.py
```
2. **Open your browser** and navigate to `http://localhost:7860`
3. **Select a sample case** from the dropdown or input your own medical case
4. **Click "Run Analysis"** to see the three-agent process in action
5. **Review the results** to see how each agent contributes to the final diagnosis
### Sample Cases
The system includes four pre-built medical cases demonstrating different bias types:
- **Case 1**: Resolved Appendicitis with New Symptoms (Anchoring bias)
- **Case 2**: Previous Heart Condition with Current Respiratory Issues (Confirmation bias)
- **Case 3**: Resolved Infection with Persistent Symptoms (Availability bias)
- **Case 4**: Chronic Condition with Acute Exacerbation (Anchoring bias)
## πŸ”§ Configuration
### Model Selection
The system uses `microsoft/DialoGPT-medium` by default. You can modify the model in `agents.py`:
```python
self.model_name = "microsoft/DialoGPT-medium" # Change this line
```
### Alternative Models
For faster inference or different capabilities, consider:
- `microsoft/DialoGPT-small` (117M parameters)
- `gpt2` (124M parameters)
- `distilbert-base-uncased` (66M parameters)
### Performance Tuning
Adjust generation parameters in `agents.py`:
```python
self.generator = pipeline(
"text-generation",
model=self.model,
tokenizer=self.tokenizer,
max_new_tokens=200, # Adjust for longer/shorter outputs
do_sample=True,
temperature=0.7, # Lower = more focused, Higher = more creative
pad_token_id=self.tokenizer.eos_token_id
)
```
## 🌐 Deployment
### Hugging Face Spaces
1. **Create a new Space** on Hugging Face
2. **Upload your files** to the Space
3. **Set the Space SDK** to Gradio
4. **Configure the Space** with appropriate hardware requirements
### Docker Deployment
1. **Build the Docker image:**
```bash
docker build -t debias-llm .
```
2. **Run the container:**
```bash
docker run -p 7860:7860 debias-llm
```
### Cloud Deployment
The application can be deployed on:
- **Google Colab** (with modifications)
- **AWS SageMaker**
- **Azure ML**
- **Google Cloud Run**
## πŸ“Š Understanding the Output
### Agent 1: Full-Case Diagnosis
- **Purpose**: Comprehensive initial assessment using all available information
- **Input**: History of Present Illness + Past Medical History + Physical Examination
- **Output**: Initial diagnosis with clinical reasoning
### Agent 2: Independent Devil's Advocate
- **Purpose**: Independent evaluation and overlap assessment
- **Phase 1**: Diagnosis based only on current symptoms and physical exam
- **Phase 2**: Overlap score (High/Medium/Low) with past medical history
- **Output**: Independent diagnosis + overlap score + rationale
### Agent 3: Final Synthesis
- **Purpose**: Combines both perspectives for balanced final assessment
- **Approach**: Evidence-based synthesis with impact analysis
- **Output**: Final diagnosis + differential + impact of past disease + next steps
## 🧠 Bias Types Demonstrated
1. **Anchoring Bias**: Focusing on initial symptoms or first impressions
2. **Confirmation Bias**: Seeking information that confirms initial diagnosis
3. **Availability Bias**: Overweighting recent or memorable conditions
4. **Overconfidence Bias**: Making definitive diagnoses too quickly
## πŸ” Troubleshooting
### Common Issues
1. **Model Loading Errors**
- Ensure sufficient RAM (4GB+)
- Check internet connection for model download
- Verify transformers library version
2. **Generation Errors**
- Check input text length and format
- Verify model compatibility
- Review error logs in console
3. **Performance Issues**
- Use smaller models for faster inference
- Reduce max_new_tokens parameter
- Consider GPU acceleration if available
### Error Handling
The system includes comprehensive error handling:
- Graceful fallbacks for model failures
- Clear error messages for users
- Logging for debugging
## πŸ“š Learning Resources
### Medical Decision Making
- Cognitive biases in clinical reasoning
- Multi-perspective diagnostic approaches
- Evidence-based medicine principles
### AI and Bias
- Algorithmic bias detection
- Multi-agent systems
- Bias mitigation strategies
### Technical Implementation
- Hugging Face Transformers
- Gradio web applications
- Python multi-agent systems
## 🀝 Contributing
Contributions are welcome! Areas for improvement:
1. **Additional Bias Types**: Implement more cognitive biases
2. **Enhanced Models**: Integrate larger, more capable models
3. **UI Improvements**: Better visualization of bias patterns
4. **Case Library**: Expand sample medical cases
5. **Performance**: Optimize for faster inference
## πŸ“„ License
This project is for educational and demonstration purposes. Please ensure compliance with local regulations regarding medical AI systems.
## ⚠️ Disclaimer
**Important**: This is a demonstration system for educational purposes only. The AI agents simulate medical reasoning but should not be used for actual clinical decision-making. Always consult qualified healthcare professionals for medical advice.
## πŸ“ž Support
For questions or issues:
1. Check the troubleshooting section
2. Review the code comments
3. Open an issue in the repository
4. Contact the development team
---
**Built with ❀️ for medical education and AI bias research**