
	# 🏥 Devil's Advocate Multi-Agent Medical Analysis System

	A demonstration web application that shows how multiple AI agents can reduce diagnostic bias by simulating a clinical review process. A second agent diagnoses the case without seeing the past medical history, and a third agent reconciles that independent opinion with the initial diagnosis.

	## 🎯 Project Overview

	This demo showcases a three-agent system designed to simulate and overcome common diagnostic biases:

	1. Agent 1 (Diagnostician): Provides initial diagnosis using all available information (HPI + PMH + Physical Exam)
	2. Agent 2 (Independent Devil's Advocate): Diagnoses from symptoms and physical exam only, then evaluates overlap with past medical history
	3. Agent 3 (Synthesizer): Combines both perspectives to create an improved final diagnosis with impact analysis
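
The difference between what Agent 1 and Agent 2 see can be sketched as two views over the same case. This is a minimal illustration, not the project's code: the `MedicalCase` class, its field names, and both helper functions are hypothetical, and it assumes the current symptoms are taken from the HPI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MedicalCase:
    """Hypothetical container for one case; the field names are assumptions."""
    hpi: str   # History of Present Illness (current symptoms)
    pmh: str   # Past Medical History
    exam: str  # Physical examination findings


def full_view(case: MedicalCase) -> str:
    """Text shown to Agent 1: HPI + PMH + physical exam."""
    return (
        f"History of Present Illness: {case.hpi}\n"
        f"Past Medical History: {case.pmh}\n"
        f"Physical Exam: {case.exam}"
    )


def blinded_view(case: MedicalCase) -> str:
    """Text shown to Agent 2 for its diagnosis: the PMH is withheld."""
    return (
        f"History of Present Illness: {case.hpi}\n"
        f"Physical Exam: {case.exam}"
    )


# Placeholder text only; this is not one of the bundled sample cases.
case = MedicalCase(
    hpi="<current symptoms>",
    pmh="<past conditions>",
    exam="<exam findings>",
)
print(blinded_view(case))
```

Keeping the blinding in one small function makes it easy to check that nothing from the past medical history leaks into Agent 2's prompt.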

	## 🚀 Key Features

	- Multi-Agent Architecture: Three specialized AI agents working in sequence
	- Bias Detection: Agent 2 independently evaluates symptoms vs. past medical history
	- Overlap Scoring: Qualitative rating (High/Medium/Low) of how much the current symptoms overlap with past conditions
	- Interactive Web Interface: Clean, intuitive Gradio-based UI
	- Sample Medical Cases: Pre-built cases demonstrating different bias types
	- Custom Case Input: Support for user-defined medical scenarios
	- Real LLM Outputs: All agents generate concrete diagnostic content using Hugging Face models

	## 🏗️ Architecture

	```
	User Input → Agent 1 (Diagnostician) → Agent 2 (Devil's Advocate) → Agent 3 (Synthesizer) → Final Output
	                        ↓                           ↓                         ↓
	             Full-case Diagnosis       Symptoms+Exam Dx +           Balanced Synthesis +
	             (HPI + PMH + Exam)        Overlap Score                Impact Analysis
	```
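
The flow in the diagram can be sketched as a plain sequential function. This is an illustration under stated assumptions rather than the code in `agents.py`: `run_pipeline`, the prompt wording, and the `generate` callable (any function that maps a prompt to text) are all hypothetical, and a stand-in generator is used so that the sketch runs without downloading a model.

```python
from typing import Callable, Dict

# Any function that maps a prompt to generated text. This is an assumption;
# the project's own agents wrap a Hugging Face pipeline instead.
Generate = Callable[[str], str]


def run_pipeline(hpi: str, pmh: str, exam: str, generate: Generate) -> Dict[str, str]:
    """Run the three agents in sequence and return each stage's text."""
    # Agent 1: diagnosis from the full case.
    agent1 = generate(
        f"Give a diagnosis using the full case.\nHPI: {hpi}\nPMH: {pmh}\nExam: {exam}"
    )
    # Agent 2, phase 1: diagnosis with the past medical history withheld.
    agent2_dx = generate(
        f"Give a diagnosis from symptoms and exam only.\nHPI: {hpi}\nExam: {exam}"
    )
    # Agent 2, phase 2: only now is the PMH revealed, to rate the overlap.
    agent2_overlap = generate(
        "Rate as High, Medium, or Low how much this diagnosis overlaps "
        f"with the past medical history.\nDiagnosis: {agent2_dx}\nPMH: {pmh}"
    )
    # Agent 3: synthesis of both perspectives.
    final = generate(
        "Combine the two assessments into a final diagnosis and describe "
        f"the impact of the past history.\nAgent 1: {agent1}\n"
        f"Agent 2: {agent2_dx}\nOverlap: {agent2_overlap}"
    )
    return {
        "agent1": agent1,
        "agent2_diagnosis": agent2_dx,
        "agent2_overlap": agent2_overlap,
        "final": final,
    }


def echo_generate(prompt: str) -> str:
    """Stand-in generator so the sketch runs without a model."""
    return f"<model output for a {len(prompt)}-character prompt>"


result = run_pipeline("<symptoms>", "<past history>", "<exam>", echo_generate)
for stage, text in result.items():
    print(stage, "->", text)
```

Passing the generator in as an argument keeps the agent order testable without loading a model.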

	## 📋 Prerequisites

	- Python 3.8 or higher
	- 4GB+ RAM (for model loading)
	- Internet connection (for initial model download)

	## 🛠️ Installation

	1. Clone the repository:
	```bash
	git clone https://github.com/nglebm19/debias-llm.git
	cd debias-llm
	```

	2. Create a virtual environment (recommended):
	```bash
	python -m venv venv
	source venv/bin/activate # On Windows: venv\Scripts\activate
	```

	3. Install dependencies:
	```bash
	pip install -r requirements.txt
	```

	## 🚀 Usage

	### Local Development

	1. Run the application:
	```bash
	python app.py
	```

	2. Open your browser and navigate to `http://localhost:7860`

	3. Select a sample case from the dropdown or input your own medical case

	4. Click "Run Analysis" to see the three-agent process in action

	5. Review the results to see how each agent contributes to the final diagnosis

	### Sample Cases

	The system includes four pre-built medical cases demonstrating different bias types:

	- Case 1: Resolved Appendicitis with New Symptoms (Anchoring bias)
	- Case 2: Previous Heart Condition with Current Respiratory Issues (Confirmation bias)
	- Case 3: Resolved Infection with Persistent Symptoms (Availability bias)
	- Case 4: Chronic Condition with Acute Exacerbation (Anchoring bias)
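
A case list like this one could be kept in a small registry that also feeds the dropdown. The sketch below is hypothetical: the titles and bias labels come from the list above, but `SAMPLE_CASES` and both helper functions are assumptions about how such a registry might look, not the project's actual data structure.

```python
from typing import Dict, List

# Titles and bias labels are taken from the README list; the structure is assumed.
SAMPLE_CASES: Dict[str, Dict[str, str]] = {
    "Case 1": {"title": "Resolved Appendicitis with New Symptoms", "bias": "Anchoring"},
    "Case 2": {"title": "Previous Heart Condition with Current Respiratory Issues", "bias": "Confirmation"},
    "Case 3": {"title": "Resolved Infection with Persistent Symptoms", "bias": "Availability"},
    "Case 4": {"title": "Chronic Condition with Acute Exacerbation", "bias": "Anchoring"},
}


def dropdown_choices() -> List[str]:
    """Labels suitable for a dropdown, one per sample case."""
    return [
        f"{key}: {meta['title']} ({meta['bias']} bias)"
        for key, meta in SAMPLE_CASES.items()
    ]


def cases_for_bias(bias: str) -> List[str]:
    """Keys of all sample cases that demonstrate the given bias type."""
    wanted = bias.strip().lower()
    return [key for key, meta in SAMPLE_CASES.items() if meta["bias"].lower() == wanted]


print(dropdown_choices()[0])
print(cases_for_bias("anchoring"))
```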

	## 🔧 Configuration

	### Model Selection

	The system uses `microsoft/DialoGPT-medium` by default. You can modify the model in `agents.py`:

	```python
	self.model_name = "microsoft/DialoGPT-medium" # Change this line
	```
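
To switch models without editing the source, one option is to read the name from an environment variable and fall back to the default. This is only a sketch: the `DEBIAS_MODEL` variable and the `resolve_model_name` helper are hypothetical and are not something the project is known to support.

```python
import os
from typing import Mapping, Optional

DEFAULT_MODEL = "microsoft/DialoGPT-medium"


def resolve_model_name(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the model name from DEBIAS_MODEL (hypothetical), else the default."""
    env = os.environ if env is None else env
    name = env.get("DEBIAS_MODEL", "").strip()
    return name or DEFAULT_MODEL


print(resolve_model_name({}))
print(resolve_model_name({"DEBIAS_MODEL": "gpt2"}))
```

The result could then be assigned to `self.model_name` in place of the hard-coded string.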

	### Alternative Models

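	For faster inference or different capabilities, consider other causal language models that work with the `text-generation` pipeline:

	- `microsoft/DialoGPT-small` (117M parameters)
	- `gpt2` (124M parameters)
	- `distilgpt2` (82M parameters)

	Encoder-only models such as `distilbert-base-uncased` (66M parameters) cannot generate text and will not work as a drop-in replacement.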

	### Performance Tuning

	Adjust generation parameters in `agents.py`:

	```python
	self.generator = pipeline(
	    "text-generation",
	    model=self.model,
	    tokenizer=self.tokenizer,
	    max_new_tokens=200,  # Adjust for longer/shorter outputs
	    do_sample=True,
	    temperature=0.7,  # Lower = more focused, higher = more creative
	    pad_token_id=self.tokenizer.eos_token_id,
	)
	```
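
When experimenting with these settings, it can help to build the keyword arguments in one place and validate them first. The helper below is a hypothetical sketch, not part of `agents.py`. It relies on one fact about Transformers generation: sampling needs a strictly positive temperature, so a temperature of 0 is mapped to greedy decoding with `do_sample=False`.

```python
from typing import Any, Dict


def generation_kwargs(max_new_tokens: int = 200, temperature: float = 0.7) -> Dict[str, Any]:
    """Build generation settings; temperature 0 means greedy decoding."""
    if max_new_tokens < 1:
        raise ValueError("max_new_tokens must be at least 1")
    if temperature < 0:
        raise ValueError("temperature must not be negative")
    if temperature == 0:
        # Deterministic output: no sampling, so no temperature is passed.
        return {"max_new_tokens": max_new_tokens, "do_sample": False}
    return {
        "max_new_tokens": max_new_tokens,
        "do_sample": True,
        "temperature": temperature,
    }


print(generation_kwargs())                  # the defaults shown above
print(generation_kwargs(100, temperature=0))  # shorter, deterministic output
```

The returned dictionary could be unpacked into the `pipeline(...)` call, for example with `**generation_kwargs(100, 0.3)`.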

	## 🌐 Deployment

	### Hugging Face Spaces

	1. Create a new Space on Hugging Face
	2. Upload your files to the Space
	3. Set the Space SDK to Gradio
	4. Configure the Space with appropriate hardware requirements

	### Docker Deployment

	1. Build the Docker image:
	```bash
	docker build -t debias-llm .
	```

	2. Run the container:
	```bash
	docker run -p 7860:7860 debias-llm
	```

	### Cloud Deployment

	The application can be deployed on:
	- Google Colab (with modifications)
	- AWS SageMaker
	- Azure ML
	- Google Cloud Run

	## 📊 Understanding the Output

	### Agent 1: Full-Case Diagnosis
	- Purpose: Comprehensive initial assessment using all available information
	- Input: History of Present Illness + Past Medical History + Physical Examination
	- Output: Initial diagnosis with clinical reasoning

	### Agent 2: Independent Devil's Advocate
	- Purpose: Independent evaluation and overlap assessment
	- Phase 1: Diagnosis based only on current symptoms and physical exam
	- Phase 2: Overlap score (High/Medium/Low) with past medical history
	- Output: Independent diagnosis + overlap score + rationale
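
Because the overlap score arrives as free text from the model, it has to be extracted before it can be displayed or compared. The following is a naive sketch of one way to do that. The `parse_overlap` function and its patterns are assumptions, and real model output may need more robust handling.

```python
import re
from typing import Optional

# Preferred form: a label that directly follows the word "overlap".
_LABELLED = re.compile(r"overlap(?:\s+score)?\s*[:\-]?\s*(high|medium|low)\b", re.IGNORECASE)
# Fallback: the first standalone High/Medium/Low anywhere in the text.
_ANY_LABEL = re.compile(r"\b(high|medium|low)\b", re.IGNORECASE)


def parse_overlap(text: str) -> Optional[str]:
    """Return 'High', 'Medium', or 'Low', or None if no label is found."""
    match = _LABELLED.search(text) or _ANY_LABEL.search(text)
    return match.group(1).capitalize() if match else None


print(parse_overlap("Overlap score: MEDIUM. The cough is a new finding."))
print(parse_overlap("The model gave no rating."))
```

Returning `None` rather than guessing lets the caller decide whether to re-prompt or show the score as unavailable.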

	### Agent 3: Final Synthesis
	- Purpose: Combines both perspectives for balanced final assessment
	- Approach: Evidence-based synthesis with impact analysis
	- Output: Final diagnosis + differential + impact of past disease + next steps
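
The four parts of Agent 3's output map naturally onto a small record type. The sketch below is hypothetical: the `Synthesis` class, its field names, and the `to_markdown` method are assumptions chosen to mirror the list above, not the project's actual output format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Synthesis:
    """Hypothetical container for the four parts of Agent 3's output."""
    final_diagnosis: str
    differential: List[str] = field(default_factory=list)
    impact_of_past_disease: str = ""
    next_steps: List[str] = field(default_factory=list)

    def to_markdown(self) -> str:
        """Render the result as a short Markdown summary."""
        lines = [f"Final diagnosis: {self.final_diagnosis}", "Differential:"]
        lines += [f"- {item}" for item in self.differential]
        lines.append(f"Impact of past disease: {self.impact_of_past_disease}")
        lines.append("Next steps:")
        lines += [f"- {step}" for step in self.next_steps]
        return "\n".join(lines)


# Placeholder values only; real content would come from the model.
summary = Synthesis(
    final_diagnosis="<diagnosis>",
    differential=["<alternative 1>", "<alternative 2>"],
    impact_of_past_disease="<how the past history changed the assessment>",
    next_steps=["<recommended test>"],
)
print(summary.to_markdown())
```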

	## 🧠 Bias Types Demonstrated

	1. Anchoring Bias: Fixating on the first piece of information, such as initial symptoms or a prior diagnosis, and failing to adjust as new findings arrive
	2. Confirmation Bias: Seeking information that confirms the initial diagnosis
	3. Availability Bias: Overweighting recent or memorable conditions
	4. Overconfidence Bias: Making definitive diagnoses too quickly

	## 🔍 Troubleshooting

	### Common Issues

	1. Model Loading Errors
	   - Ensure sufficient RAM (4GB+)
	   - Check internet connection for model download
	   - Verify transformers library version

	2. Generation Errors
	   - Check input text length and format
	   - Verify model compatibility
	   - Review error logs in console

	3. Performance Issues
	   - Use smaller models for faster inference
	   - Reduce the `max_new_tokens` parameter
	   - Consider GPU acceleration if available

	### Error Handling

	The system includes comprehensive error handling:
	- Graceful fallbacks for model failures
	- Clear error messages for users
	- Logging for debugging
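
A graceful fallback of this kind is often written as a thin wrapper around the generation call. The sketch below shows the general pattern only. `safe_generate`, the logger name, and the fallback message are hypothetical and may differ from what the project does.

```python
import logging
from typing import Callable

logger = logging.getLogger("debias_llm")  # logger name is an assumption

FALLBACK_MESSAGE = "The model could not produce an answer for this step. Please try again."


def safe_generate(
    generate: Callable[[str], str],
    prompt: str,
    fallback: str = FALLBACK_MESSAGE,
) -> str:
    """Call the generator; on an error or empty output, log and return a fallback."""
    try:
        text = generate(prompt).strip()
    except Exception:
        # Record the full traceback for debugging, but keep the UI message simple.
        logger.exception("Generation failed for a prompt of %d characters", len(prompt))
        return fallback
    return text or fallback


def broken_model(prompt: str) -> str:
    """Stand-in generator that always fails, to exercise the fallback."""
    raise RuntimeError("simulated model failure")


print(safe_generate(broken_model, "<case text>"))
```

Wrapping each agent's call this way means one failed step returns a readable message instead of crashing the whole analysis.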

	## 📚 Learning Resources

	### Medical Decision Making
	- Cognitive biases in clinical reasoning
	- Multi-perspective diagnostic approaches
	- Evidence-based medicine principles

	### AI and Bias
	- Algorithmic bias detection
	- Multi-agent systems
	- Bias mitigation strategies

	### Technical Implementation
	- Hugging Face Transformers
	- Gradio web applications
	- Python multi-agent systems

	## 🤝 Contributing

	Contributions are welcome! Areas for improvement:

	1. Additional Bias Types: Implement more cognitive biases
	2. Enhanced Models: Integrate larger, more capable models
	3. UI Improvements: Better visualization of bias patterns
	4. Case Library: Expand sample medical cases
	5. Performance: Optimize for faster inference

	## 📄 License

	This project is for educational and demonstration purposes. Please ensure compliance with local regulations regarding medical AI systems.

	## ⚠️ Disclaimer

	Important: This is a demonstration system for educational purposes only. The AI agents simulate medical reasoning but should not be used for actual clinical decision-making. Always consult qualified healthcare professionals for medical advice.

	## 📞 Support

	For questions or issues:
	1. Check the troubleshooting section
	2. Review the code comments
	3. Open an issue in the repository
	4. Contact the development team

	---

	Built with ❤️ for medical education and AI bias research