
🏥 Devil's Advocate Multi-Agent Medical Analysis System

This is a demonstration web application that shows how multiple AI agents can reduce diagnostic bias by simulating a clinical review process. It illustrates multi-agent collaboration as a way to counter cognitive biases in medical decision-making.

🎯 Project Overview

This demo showcases a three-agent system designed to simulate and overcome common diagnostic biases:

  1. Agent 1 (Diagnostician): Provides initial diagnosis using all available information (HPI + PMH + Physical Exam)
  2. Agent 2 (Independent Devil's Advocate): Diagnoses from symptoms and physical exam only, then evaluates overlap with past medical history
  3. Agent 3 (Synthesizer): Combines both perspectives into an improved final diagnosis with an impact analysis
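
The split of information between Agent 1 and Agent 2 can be sketched as below. This is a minimal illustration rather than the code in agents.py: the `MedicalCase` dataclass, its field names, and both context-builder functions are hypothetical, and the sketch assumes the current symptoms are carried in the HPI text. The example case is made up.

```python
from dataclasses import dataclass


@dataclass
class MedicalCase:
    """Hypothetical container for the three parts of a case."""
    hpi: str            # History of Present Illness (current symptoms)
    pmh: str            # Past Medical History
    physical_exam: str  # Physical examination findings


def diagnostician_context(case: MedicalCase) -> str:
    """Agent 1 sees everything: HPI + PMH + Physical Exam."""
    return (
        f"History of Present Illness: {case.hpi}\n"
        f"Past Medical History: {case.pmh}\n"
        f"Physical Exam: {case.physical_exam}"
    )


def devils_advocate_context(case: MedicalCase) -> str:
    """Agent 2 diagnoses without seeing the past medical history."""
    return (
        f"History of Present Illness: {case.hpi}\n"
        f"Physical Exam: {case.physical_exam}"
    )


# Illustrative, invented case text.
case = MedicalCase(
    hpi="Two days of lower abdominal pain and nausea",
    pmh="Appendectomy five years ago",
    physical_exam="Tenderness in the left lower quadrant",
)
print(diagnostician_context(case))
print("---")
print(devils_advocate_context(case))
```

Withholding the past history from Agent 2 is what lets its diagnosis serve as a check on whether Agent 1 anchored on an old condition.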

🚀 Key Features

  • Multi-Agent Architecture: Three specialized AI agents working in sequence
  • Bias Detection: Agent 2 independently evaluates symptoms vs. past medical history
  • Overlap Scoring: Qualitative assessment (High/Medium/Low) of current symptoms vs. past conditions
  • Interactive Web Interface: Clean, intuitive Gradio-based UI
  • Sample Medical Cases: Pre-built cases demonstrating different bias types
  • Custom Case Input: Support for user-defined medical scenarios
  • Real LLM Outputs: All agents generate concrete diagnostic content using Hugging Face models

🏗️ Architecture

User Input → Agent 1 (Diagnostician) → Agent 2 (Devil's Advocate) → Agent 3 (Synthesizer) → Final Output
                ↓                           ↓                           ↓
            Full-case Diagnosis      Symptoms+Exam Dx +        Balanced Synthesis +
            (HPI + PMH + Exam)      Overlap Score            Impact Analysis
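
The flow in the diagram can be sketched as one function that calls the agents in order. This is an assumption-laden outline, not the project's actual orchestration code: `run_pipeline`, the prompt wording, the result keys, and the `fake_generate` stand-in are all hypothetical. It also assumes that Agent 2 does not see Agent 1's output, which matches its "independent" role but is not confirmed by the source.

```python
from typing import Callable, Dict

# Hypothetical type: any function that turns a prompt into generated text.
Generate = Callable[[str], str]


def run_pipeline(hpi: str, pmh: str, exam: str, generate: Generate) -> Dict[str, str]:
    """Run the three agents in order and collect their outputs."""
    # Agent 1: diagnosis from the full case.
    full_dx = generate(
        f"Give a diagnosis.\nHPI: {hpi}\nPast history: {pmh}\nExam: {exam}"
    )
    # Agent 2, phase 1: diagnosis from current symptoms and exam only.
    independent_dx = generate(f"Give a diagnosis.\nHPI: {hpi}\nExam: {exam}")
    # Agent 2, phase 2: past history is revealed only to score the overlap.
    overlap = generate(
        "Rate the overlap as High, Medium or Low.\n"
        f"Diagnosis: {independent_dx}\nPast history: {pmh}"
    )
    # Agent 3: synthesis of both views.
    final = generate(
        "Synthesize a final diagnosis.\n"
        f"Full-case view: {full_dx}\n"
        f"Independent view: {independent_dx}\nOverlap: {overlap}"
    )
    return {
        "agent1_full_case": full_dx,
        "agent2_independent": independent_dx,
        "agent2_overlap": overlap,
        "agent3_final": final,
    }


def fake_generate(prompt: str) -> str:
    """Stand-in for a real model call, so the sketch runs offline."""
    return f"<output for {len(prompt)}-char prompt>"


result = run_pipeline("cough for 3 days", "prior heart condition", "wheezing", fake_generate)
for name, text in result.items():
    print(f"{name}: {text}")
```

Passing the model call in as a function keeps the orchestration testable without downloading a model.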

📋 Prerequisites

  • Python 3.8 or higher
  • 4GB+ RAM (for model loading)
  • Internet connection (for initial model download)

🛠️ Installation

  1. Clone the repository:

    git clone https://github.com/nglebm19/debias-llm.git
    cd debias-llm
    
  2. Create a virtual environment (recommended):

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
    
  3. Install dependencies:

    pip install -r requirements.txt
    

🚀 Usage

Local Development

  1. Run the application:

    python app.py
    
  2. Open your browser and navigate to http://localhost:7860

  3. Select a sample case from the dropdown or input your own medical case

  4. Click "Run Analysis" to see the three-agent process in action

  5. Review the results to see how each agent contributes to the final diagnosis

Sample Cases

The system includes four pre-built medical cases demonstrating different bias types:

  • Case 1: Resolved Appendicitis with New Symptoms (Anchoring bias)
  • Case 2: Previous Heart Condition with Current Respiratory Issues (Confirmation bias)
  • Case 3: Resolved Infection with Persistent Symptoms (Availability bias)
  • Case 4: Chronic Condition with Acute Exacerbation (Anchoring bias)

🔧 Configuration

Model Selection

The system uses microsoft/DialoGPT-medium by default. You can modify the model in agents.py:

self.model_name = "microsoft/DialoGPT-medium"  # Change this line

Alternative Models

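For faster inference or different capabilities, consider another causal language model. The model must work with the text-generation pipeline, so encoder-only models such as distilbert-base-uncased are not suitable:

  • microsoft/DialoGPT-small (117M parameters)
  • gpt2 (124M parameters)
  • distilgpt2 (82M parameters)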
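
To switch models without editing agents.py each time, the model name could be read from an environment variable. The sketch below is one possible approach, not something the project is known to do: the variable name `DEBIAS_MODEL_NAME` and the helper `resolve_model_name` are hypothetical.

```python
import os
from typing import Mapping

DEFAULT_MODEL = "microsoft/DialoGPT-medium"


def resolve_model_name(env: Mapping[str, str] = os.environ) -> str:
    """Return the model name from DEBIAS_MODEL_NAME (hypothetical), or the default."""
    name = env.get("DEBIAS_MODEL_NAME", "").strip()
    return name or DEFAULT_MODEL


print(resolve_model_name())
```

In agents.py this would replace the hard-coded assignment, for example `self.model_name = resolve_model_name()`.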

Performance Tuning

Adjust generation parameters in agents.py:

self.generator = pipeline(
    "text-generation",
    model=self.model,
    tokenizer=self.tokenizer,
    max_new_tokens=200,        # Adjust for longer/shorter outputs
    do_sample=True,
    temperature=0.7,           # Lower = more focused, Higher = more creative
    pad_token_id=self.tokenizer.eos_token_id
)
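
To see why a lower temperature gives more focused output, it helps to look at what temperature does to the next-token probabilities. The sketch below is a standalone illustration of temperature-scaled softmax, not code from this project; the function name and the example logits are made up.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Divide logits by the temperature, then apply a numerically stable softmax."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum to avoid overflow in exp()
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Invented scores for three candidate tokens.
logits = [2.0, 1.0, 0.0]
for t in (0.3, 0.7, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")
```

As the temperature drops, more probability mass moves to the highest-scoring token, so sampling with do_sample=True becomes closer to always picking the top choice.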

🌐 Deployment

Hugging Face Spaces

  1. Create a new Space on Hugging Face
  2. Upload your files to the Space
  3. Set the Space SDK to Gradio
  4. Configure the Space with appropriate hardware requirements

Docker Deployment

  1. Build the Docker image:

    docker build -t debias-llm .
    
  2. Run the container:

    docker run -p 7860:7860 debias-llm
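
For the port mapping above to reach the app, the server inside the container generally has to listen on all interfaces rather than only on localhost. The sketch below shows one way to build the launch arguments. It assumes app.py exposes a Gradio app object named `demo`, which is a guess; `launch_kwargs` is a hypothetical helper. `GRADIO_SERVER_NAME` and `GRADIO_SERVER_PORT` are environment variables that Gradio itself also reads.

```python
import os
from typing import Any, Dict, Mapping


def launch_kwargs(env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Build keyword arguments for Gradio's launch() that suit a container."""
    return {
        # 0.0.0.0 makes the server reachable through the published Docker port.
        "server_name": env.get("GRADIO_SERVER_NAME", "0.0.0.0"),
        "server_port": int(env.get("GRADIO_SERVER_PORT", "7860")),
    }


# In app.py, assuming the Gradio app object is called `demo`:
# demo.launch(**launch_kwargs())
print(launch_kwargs())
```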
    

Cloud Deployment

The application can be deployed on:

  • Google Colab (with modifications)
  • AWS SageMaker
  • Azure ML
  • Google Cloud Run

📊 Understanding the Output

Agent 1: Full-Case Diagnosis

  • Purpose: Comprehensive initial assessment using all available information
  • Input: History of Present Illness + Past Medical History + Physical Examination
  • Output: Initial diagnosis with clinical reasoning

Agent 2: Independent Devil's Advocate

  • Purpose: Independent evaluation and overlap assessment
  • Phase 1: Diagnosis based only on current symptoms and physical exam
  • Phase 2: Overlap score (High/Medium/Low) with past medical history
  • Output: Independent diagnosis + overlap score + rationale
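
Because the overlap score arrives inside free-form model text, it has to be extracted before it can be displayed or compared. The sketch below is one hedged way to do that. It assumes the model was prompted to write the score after the word "overlap" (for example "Overlap score: High"); that format and the helper name `parse_overlap_score` are assumptions, not the project's documented behavior.

```python
import re
from typing import Optional

# Matches e.g. "Overlap score: High", "overlap - low", "Overlap: MEDIUM".
_OVERLAP_RE = re.compile(
    r"overlap(?:\s+score)?\s*[:\-]?\s*(high|medium|low)\b",
    re.IGNORECASE,
)


def parse_overlap_score(text: str) -> Optional[str]:
    """Return 'High', 'Medium' or 'Low', or None if no labeled score is found."""
    match = _OVERLAP_RE.search(text)
    return match.group(1).capitalize() if match else None


print(parse_overlap_score("Overlap score: HIGH, symptoms mirror the prior condition"))
print(parse_overlap_score("Patient reports low-grade fever"))
```

Requiring the "overlap" label avoids mistaking clinical phrases such as "low-grade fever" for a score. Returning None lets the caller show that no score was produced instead of guessing.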

Agent 3: Final Synthesis

  • Purpose: Combines both perspectives for balanced final assessment
  • Approach: Evidence-based synthesis with impact analysis
  • Output: Final diagnosis + differential + impact of past disease + next steps

🧠 Bias Types Demonstrated

  1. Anchoring Bias: Focusing on initial symptoms or first impressions
  2. Confirmation Bias: Seeking information that confirms initial diagnosis
  3. Availability Bias: Overweighting recent or memorable conditions
  4. Overconfidence Bias: Making definitive diagnoses too quickly

🔍 Troubleshooting

Common Issues

  1. Model Loading Errors

    • Ensure sufficient RAM (4GB+)
    • Check internet connection for model download
    • Verify transformers library version
  2. Generation Errors

    • Check input text length and format
    • Verify model compatibility
    • Review error logs in console
  3. Performance Issues

    • Use smaller models for faster inference
    • Reduce max_new_tokens parameter
    • Consider GPU acceleration if available

Error Handling

The system includes comprehensive error handling:

  • Graceful fallbacks for model failures
  • Clear error messages for users
  • Logging for debugging
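
The three points above can be pictured as a thin wrapper around each model call. This is a sketch of the general pattern under stated assumptions, not the project's actual error-handling code: `safe_generate`, `FALLBACK_MESSAGE`, and the logger name are hypothetical.

```python
import logging
from typing import Callable

logger = logging.getLogger("debias_llm")  # hypothetical logger name

FALLBACK_MESSAGE = "Analysis unavailable: the model could not generate a response."


def safe_generate(generate: Callable[[str], str], prompt: str) -> str:
    """Call the model and return a readable fallback instead of raising."""
    try:
        text = generate(prompt)
    except Exception:
        # Log the traceback and prompt length only, never the case text itself.
        logger.exception("Generation failed for a %d-character prompt", len(prompt))
        return FALLBACK_MESSAGE
    if not text or not text.strip():
        logger.warning("Model returned empty output")
        return FALLBACK_MESSAGE
    return text.strip()


def broken_model(prompt: str) -> str:
    """Stand-in for a model call that fails."""
    raise RuntimeError("model not loaded")


print(safe_generate(broken_model, "example prompt"))
```

Logging the prompt length rather than its content keeps case details out of the logs while still leaving a trace for debugging.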

📚 Learning Resources

Medical Decision Making

  • Cognitive biases in clinical reasoning
  • Multi-perspective diagnostic approaches
  • Evidence-based medicine principles

AI and Bias

  • Algorithmic bias detection
  • Multi-agent systems
  • Bias mitigation strategies

Technical Implementation

  • Hugging Face Transformers
  • Gradio web applications
  • Python multi-agent systems

🤝 Contributing

Contributions are welcome! Areas for improvement:

  1. Additional Bias Types: Implement more cognitive biases
  2. Enhanced Models: Integrate larger, more capable models
  3. UI Improvements: Better visualization of bias patterns
  4. Case Library: Expand sample medical cases
  5. Performance: Optimize for faster inference

📄 License

This project is for educational and demonstration purposes. Please ensure compliance with local regulations regarding medical AI systems.

⚠️ Disclaimer

Important: This is a demonstration system for educational purposes only. The AI agents simulate medical reasoning but should not be used for actual clinical decision-making. Always consult qualified healthcare professionals for medical advice.

📞 Support

For questions or issues:

  1. Check the troubleshooting section
  2. Review the code comments
  3. Open an issue in the repository
  4. Contact the development team

Built with ❤️ for medical education and AI bias research