Spaces:
Sleeping
Sleeping
| # π₯ Devil's Advocate Multi-Agent Medical Analysis System | |
| A demonstration web application that shows how multiple AI agents can overcome diagnostic bias by simulating a clinical review process. This system demonstrates the power of multi-agent collaboration in reducing cognitive biases in medical decision-making. | |
| ## π― Project Overview | |
| This demo showcases a three-agent system designed to simulate and overcome common diagnostic biases: | |
| 1. **Agent 1 (Diagnostician)**: Provides initial diagnosis using all available information (HPI + PMH + Physical Exam) | |
| 2. **Agent 2 (Independent Devil's Advocate)**: Diagnoses from symptoms and physical exam only, then evaluates overlap with past medical history | |
| 3. **Agent 3 (Synthesizer)**: Combines both perspectives to create improved final diagnosis with impact analysis | |
| ## π Key Features | |
| - **Multi-Agent Architecture**: Three specialized AI agents working in sequence | |
| - **Bias Detection**: Agent 2 independently evaluates symptoms vs. past medical history | |
| - **Overlap Scoring**: Qualitative assessment (High/Medium/Low) of current symptoms vs. past conditions | |
| - **Interactive Web Interface**: Clean, intuitive Gradio-based UI | |
| - **Sample Medical Cases**: Pre-built cases demonstrating different bias types | |
| - **Custom Case Input**: Support for user-defined medical scenarios | |
| - **Real LLM Outputs**: All agents generate concrete diagnostic content using Hugging Face models | |
| ## ποΈ Architecture | |
| ``` | |
| User Input β Agent 1 (Diagnostician) β Agent 2 (Devil's Advocate) β Agent 3 (Synthesizer) β Final Output | |
| β β β | |
| Full-case Diagnosis Symptoms+Exam Dx + Balanced Synthesis + | |
| (HPI + PMH + Exam) Overlap Score Impact Analysis | |
| ``` | |
| ## π Prerequisites | |
| - Python 3.8 or higher | |
| - 4GB+ RAM (for model loading) | |
| - Internet connection (for initial model download) | |
| ## π οΈ Installation | |
| 1. **Clone the repository:** | |
| ```bash | |
| git clone https://github.com/nglebm19/debias-llm.git | |
| cd debias-llm | |
| ``` | |
| 2. **Create a virtual environment (recommended):** | |
| ```bash | |
| python -m venv venv | |
| source venv/bin/activate # On Windows: venv\Scripts\activate | |
| ``` | |
| 3. **Install dependencies:** | |
| ```bash | |
| pip install -r requirements.txt | |
| ``` | |
| ## π Usage | |
| ### Local Development | |
| 1. **Run the application:** | |
| ```bash | |
| python app.py | |
| ``` | |
| 2. **Open your browser** and navigate to `http://localhost:7860` | |
| 3. **Select a sample case** from the dropdown or input your own medical case | |
| 4. **Click "Run Analysis"** to see the three-agent process in action | |
| 5. **Review the results** to see how each agent contributes to the final diagnosis | |
| ### Sample Cases | |
| The system includes four pre-built medical cases demonstrating different bias types: | |
| - **Case 1**: Resolved Appendicitis with New Symptoms (Anchoring bias) | |
| - **Case 2**: Previous Heart Condition with Current Respiratory Issues (Confirmation bias) | |
| - **Case 3**: Resolved Infection with Persistent Symptoms (Availability bias) | |
| - **Case 4**: Chronic Condition with Acute Exacerbation (Anchoring bias) | |
| ## π§ Configuration | |
| ### Model Selection | |
| The system uses `microsoft/DialoGPT-medium` by default. You can modify the model in `agents.py`: | |
| ```python | |
| self.model_name = "microsoft/DialoGPT-medium" # Change this line | |
| ``` | |
| ### Alternative Models | |
| For faster inference or different capabilities, consider: | |
| - `microsoft/DialoGPT-small` (117M parameters) | |
| - `gpt2` (124M parameters) | |
| - `distilbert-base-uncased` (66M parameters) | |
| ### Performance Tuning | |
| Adjust generation parameters in `agents.py`: | |
| ```python | |
| self.generator = pipeline( | |
| "text-generation", | |
| model=self.model, | |
| tokenizer=self.tokenizer, | |
| max_new_tokens=200, # Adjust for longer/shorter outputs | |
| do_sample=True, | |
| temperature=0.7, # Lower = more focused, Higher = more creative | |
| pad_token_id=self.tokenizer.eos_token_id | |
| ) | |
| ``` | |
| ## π Deployment | |
| ### Hugging Face Spaces | |
| 1. **Create a new Space** on Hugging Face | |
| 2. **Upload your files** to the Space | |
| 3. **Set the Space SDK** to Gradio | |
| 4. **Configure the Space** with appropriate hardware requirements | |
| ### Docker Deployment | |
| 1. **Build the Docker image:** | |
| ```bash | |
| docker build -t debias-llm . | |
| ``` | |
| 2. **Run the container:** | |
| ```bash | |
| docker run -p 7860:7860 debias-llm | |
| ``` | |
| ### Cloud Deployment | |
| The application can be deployed on: | |
| - **Google Colab** (with modifications) | |
| - **AWS SageMaker** | |
| - **Azure ML** | |
| - **Google Cloud Run** | |
| ## π Understanding the Output | |
| ### Agent 1: Full-Case Diagnosis | |
| - **Purpose**: Comprehensive initial assessment using all available information | |
| - **Input**: History of Present Illness + Past Medical History + Physical Examination | |
| - **Output**: Initial diagnosis with clinical reasoning | |
| ### Agent 2: Independent Devil's Advocate | |
| - **Purpose**: Independent evaluation and overlap assessment | |
| - **Phase 1**: Diagnosis based only on current symptoms and physical exam | |
| - **Phase 2**: Overlap score (High/Medium/Low) with past medical history | |
| - **Output**: Independent diagnosis + overlap score + rationale | |
| ### Agent 3: Final Synthesis | |
| - **Purpose**: Combines both perspectives for balanced final assessment | |
| - **Approach**: Evidence-based synthesis with impact analysis | |
| - **Output**: Final diagnosis + differential + impact of past disease + next steps | |
| ## π§ Bias Types Demonstrated | |
| 1. **Anchoring Bias**: Focusing on initial symptoms or first impressions | |
| 2. **Confirmation Bias**: Seeking information that confirms initial diagnosis | |
| 3. **Availability Bias**: Overweighting recent or memorable conditions | |
| 4. **Overconfidence Bias**: Making definitive diagnoses too quickly | |
| ## π Troubleshooting | |
| ### Common Issues | |
| 1. **Model Loading Errors** | |
| - Ensure sufficient RAM (4GB+) | |
| - Check internet connection for model download | |
| - Verify transformers library version | |
| 2. **Generation Errors** | |
| - Check input text length and format | |
| - Verify model compatibility | |
| - Review error logs in console | |
| 3. **Performance Issues** | |
| - Use smaller models for faster inference | |
| - Reduce max_new_tokens parameter | |
| - Consider GPU acceleration if available | |
| ### Error Handling | |
| The system includes comprehensive error handling: | |
| - Graceful fallbacks for model failures | |
| - Clear error messages for users | |
| - Logging for debugging | |
| ## π Learning Resources | |
| ### Medical Decision Making | |
| - Cognitive biases in clinical reasoning | |
| - Multi-perspective diagnostic approaches | |
| - Evidence-based medicine principles | |
| ### AI and Bias | |
| - Algorithmic bias detection | |
| - Multi-agent systems | |
| - Bias mitigation strategies | |
| ### Technical Implementation | |
| - Hugging Face Transformers | |
| - Gradio web applications | |
| - Python multi-agent systems | |
| ## π€ Contributing | |
| Contributions are welcome! Areas for improvement: | |
| 1. **Additional Bias Types**: Implement more cognitive biases | |
| 2. **Enhanced Models**: Integrate larger, more capable models | |
| 3. **UI Improvements**: Better visualization of bias patterns | |
| 4. **Case Library**: Expand sample medical cases | |
| 5. **Performance**: Optimize for faster inference | |
| ## π License | |
| This project is for educational and demonstration purposes. Please ensure compliance with local regulations regarding medical AI systems. | |
| ## β οΈ Disclaimer | |
| **Important**: This is a demonstration system for educational purposes only. The AI agents simulate medical reasoning but should not be used for actual clinical decision-making. Always consult qualified healthcare professionals for medical advice. | |
| ## π Support | |
| For questions or issues: | |
| 1. Check the troubleshooting section | |
| 2. Review the code comments | |
| 3. Open an issue in the repository | |
| 4. Contact the development team | |
| --- | |
| **Built with β€οΈ for medical education and AI bias research** | |