agent-mcp-sql / SYSTEM_OVERVIEW.md
Timothy Eastridge
commit streamlit
f831e98

A newer version of the Streamlit SDK is available: 1.51.0

Upgrade

Quick start prompt:

Scan the repo: list the directory tree, key config files, and required env vars. Summarize how to start the system using existing scripts. Do not modify files; just report and wait for confirmation before any changes. Proceed with a fresh start by leveraging: powershell -ExecutionPolicy Bypass -File ops/scripts/fresh_start.ps1

Graph-Driven Agentic System with Human-in-the-Loop Controls

What This System Is

This is a production-ready agentic workflow orchestration system that demonstrates how to build AI agents with human oversight and complete audit trails. The system combines:

  • πŸ€– Autonomous AI Agent: Processes natural language queries and generates SQL
  • πŸ“Š Graph Database: Neo4j stores all workflow metadata and audit trails
  • ⏸️ Human-in-the-Loop: Configurable pause points for human review and intervention
  • 🎯 Single API Gateway: All operations routed through MCP (Model Context Protocol) server
  • 🌐 Real-time Interface: React frontend with live workflow visualization
  • πŸ” Complete Observability: Every action logged with timestamps and relationships

What It Does

Core Workflow

  1. User asks a question in natural language via the web interface
  2. System creates a workflow with multiple instruction steps in Neo4j
  3. Agent discovers the question and begins processing
  4. Pause for human review (5 minutes by default, configurable)
  5. Human can edit instructions during pause via Neo4j Browser
  6. Agent generates SQL from natural language using LLM
  7. Agent executes SQL against PostgreSQL database
  8. Results displayed in formatted table with complete audit trail

Architecture Components

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   Frontend  │────│ MCP Server  │────│    Neo4j    β”‚
β”‚  (Next.js)  β”‚    β”‚ (FastAPI)   β”‚    β”‚  (Graph)    β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                           β”‚
                    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                    β”‚    Agent    │────│ PostgreSQL  β”‚
                    β”‚  (Python)   β”‚    β”‚   (Data)    β”‚
                    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
  • Neo4j Graph Database: Stores workflows, instructions, executions, and logs
  • MCP Server: FastAPI gateway for all Neo4j operations with parameter fixing
  • Python Agent: Polls for instructions, pauses for human input, executes tasks
  • PostgreSQL: Sample data source for SQL generation and execution
  • Next.js Frontend: Chat interface with Cytoscape.js graph visualization

Why It's Valuable

🎯 Demonstrates Production Patterns

  • Human Oversight: Shows how to build AI systems with meaningful human control
  • Audit Trails: Complete graph-based logging of all operations and decisions
  • Error Recovery: System continues gracefully after interruptions or edits
  • Scalable Architecture: Clean separation of concerns, containerized deployment

πŸ”„ Agentic Workflow Orchestration

  • Graph-Driven: Workflows stored as connected nodes, not brittle state machines
  • Dynamic Editing: Instructions can be modified during execution
  • Sequence Management: Proper instruction chaining and dependency handling
  • Status Tracking: Real-time visibility into workflow progress

πŸ›‘οΈ Human-in-the-Loop Controls

  • Configurable Pauses: Built-in review periods before critical operations
  • Live Editing: Modify AI behavior during execution via graph database
  • Stop Controls: Terminate workflows at any point
  • Parameter Updates: Change questions, settings, or instructions mid-flight

πŸ“Š Complete Observability

  • Graph Visualization: Real-time workflow progress with color-coded status
  • Audit Logging: Every MCP operation logged with timestamps
  • Execution Tracking: Full history of what was generated and executed
  • Result Storage: All outputs preserved in queryable graph format

πŸš€ Production Ready

  • Containerized: Full Docker Compose setup with health checks
  • Environment Configuration: Flexible .env-based configuration
  • Error Handling: Graceful failures and recovery mechanisms
  • Documentation: Comprehensive setup, usage, and troubleshooting guides

How to Make It Run

Quick Start (5 minutes)

# 1. Clone and navigate to the repo
git clone <repository-url>
cd <repository-name>

# 2. Copy environment template
cp .env.example .env

# 3. Add your LLM API key to .env
# Edit .env and set: LLM_API_KEY=your-openai-or-anthropic-key-here

# 4. Start everything
docker-compose up -d

# 5. Seed Neo4j with demo data (IMPORTANT!)
docker-compose exec mcp python /app/ops/scripts/seed.py

# 6. Open the interface
# Frontend: http://localhost:3000
# Neo4j Browser: http://localhost:7474 (neo4j/password)

Database Seeding Options

Basic Seeding (Quick demo):

docker-compose exec mcp python /app/ops/scripts/seed.py

Creates:

  • Demo Workflow: A 3-step process (discover schema β†’ generate SQL β†’ review results)
  • Query Examples: 3 basic SQL templates for testing
  • Graph Structure: Proper relationships between components

Comprehensive Seeding (Full system):

docker-compose exec mcp python /app/ops/scripts/seed_comprehensive.py

Creates:

  • Workflow Templates: Multiple workflow patterns (basic query, analysis, reporting)
  • Instruction Type Library: 6 different instruction types with schemas
  • Query Library: 6+ categorized SQL examples (basic, analytics, detailed)
  • Demo Workflows: Ready-to-run and template workflows
  • System Configuration: Default settings and supported features

⚠️ Fresh Installation: On a brand-new machine, Neo4j starts completely empty. You MUST run a seed script to have any workflows or instructions to interact with.

πŸ’‘ Recommendation: Use comprehensive seeding for full system exploration, basic seeding for quick demos.

PowerShell Fresh Start (Windows)

# Fresh deployment with API key
powershell -ExecutionPolicy Bypass -File ops/scripts/fresh_start.ps1 -ApiKey "your-api-key-here"

# Or run the demo (assumes system is already running)
powershell -ExecutionPolicy Bypass -File ops/scripts/demo.ps1

Manual Health Check

# Check all services
docker-compose ps

# Validate system
docker-compose exec mcp python /app/ops/scripts/validate.py

# Monitor logs
docker-compose logs -f agent

Test the System

  1. Open http://localhost:3000
  2. Ask a question: "How many customers do we have?"
  3. Watch the workflow:
    • Graph visualization shows progress
    • Agent pauses for 5 minutes
    • You can edit instructions in Neo4j Browser
    • Results appear in formatted table

Clean Reset

# Stop and clean everything
docker-compose down
docker-compose up -d
docker-compose exec mcp python /app/ops/scripts/seed.py

Key Features for Developers

Graph Database Schema

  • Workflow nodes: High-level process containers
  • Instruction nodes: Individual tasks with parameters and status
  • Execution nodes: Results of instruction processing
  • Log nodes: Audit trail of all MCP operations
  • Relationships: HAS_INSTRUCTION, EXECUTED_AS, NEXT_INSTRUCTION

Configuration Options

  • Pause Duration: PAUSE_DURATION in .env (default: 300 seconds)
  • Polling Interval: AGENT_POLL_INTERVAL in .env (default: 30 seconds)
  • LLM Model: LLM_MODEL in .env (gpt-4, claude-3-sonnet, etc.)

Extension Points

  • New Instruction Types: Add handlers in agent/main.py
  • Custom Data Sources: Extend MCP server with new connectors
  • Frontend Customization: Modify React components in frontend/app/
  • Workflow Templates: Create reusable instruction sequences

Human Intervention Examples

// Find pending instructions
MATCH (i:Instruction {status: 'pending'}) RETURN i

// Change a question
MATCH (i:Instruction {type: 'generate_sql', status: 'pending'})
SET i.parameters = '{"question": "Show me top 10 customers by revenue"}'

// Stop a workflow
MATCH (w:Workflow {status: 'active'})
SET w.status = 'stopped'

Development Setup

Prerequisites

  • Docker & Docker Compose
  • OpenAI or Anthropic API key
  • Modern web browser

Project Structure

β”œβ”€β”€ agent/          # Python agent that executes instructions
β”œβ”€β”€ frontend/       # Next.js chat interface
β”œβ”€β”€ mcp/           # FastAPI server for Neo4j operations
β”œβ”€β”€ neo4j/         # Neo4j configuration
β”œβ”€β”€ postgres/      # PostgreSQL setup with sample data
β”œβ”€β”€ ops/scripts/   # Operational scripts (seed, validate, demo)
β”œβ”€β”€ docker-compose.yml
β”œβ”€β”€ Makefile       # Convenience commands
└── README.md      # Detailed documentation

Available Commands

# If you have make installed
make up          # Start all services
make seed        # Create demo data
make health      # Check service health
make logs        # View all logs
make clean       # Reset everything

# Using docker-compose directly
docker-compose up -d
docker-compose exec mcp python /app/ops/scripts/seed.py
docker-compose ps
docker-compose logs -f
docker-compose down

Use Cases

🏒 Enterprise AI Governance

  • Audit trails for compliance
  • Human oversight for critical decisions
  • Risk management in AI operations

πŸ”¬ Research & Development

  • Experiment with agentic workflows
  • Study human-AI collaboration patterns
  • Prototype autonomous systems with safety controls

πŸ“š Educational Examples

  • Demonstrate production AI architecture
  • Teach graph database concepts
  • Show containerized deployment patterns

πŸ› οΈ Template for New Projects

  • Fork as starting point for agentic systems
  • Adapt components for specific domains
  • Scale architecture for production workloads

This system demonstrates that AI agents can be both autonomous and controllable, providing the benefits of automation while maintaining human oversight and complete transparency.