Spaces:

ohmygaugh
/

agent-mcp-sql

No application file

App Files Files Community

agent-mcp-sql / SYSTEM_OVERVIEW.md

Timothy Eastridge

commit streamlit

f831e98 about 2 months ago

preview code

raw

history blame contribute delete

10.5 kB

A newer version of the Streamlit SDK is available: 1.51.0

Upgrade

Quick start prompt:

Scan the repo: list the directory tree, key config files, and required env vars. Summarize how to start the system using existing scripts. Do not modify files; just report and wait for confirmation before any changes. Proceed with a fresh start by leveraging: powershell -ExecutionPolicy Bypass -File ops/scripts/fresh_start.ps1

Graph-Driven Agentic System with Human-in-the-Loop Controls

What This System Is

This is a production-ready agentic workflow orchestration system that demonstrates how to build AI agents with human oversight and complete audit trails. The system combines:

🤖 Autonomous AI Agent: Processes natural language queries and generates SQL
📊 Graph Database: Neo4j stores all workflow metadata and audit trails
⏸️ Human-in-the-Loop: Configurable pause points for human review and intervention
🎯 Single API Gateway: All operations routed through MCP (Model Context Protocol) server
🌐 Real-time Interface: React frontend with live workflow visualization
🔍 Complete Observability: Every action logged with timestamps and relationships

What It Does

Core Workflow

User asks a question in natural language via the web interface
System creates a workflow with multiple instruction steps in Neo4j
Agent discovers the question and begins processing
Pause for human review (5 minutes by default, configurable)
Human can edit instructions during pause via Neo4j Browser
Agent generates SQL from natural language using LLM
Agent executes SQL against PostgreSQL database
Results displayed in formatted table with complete audit trail

Architecture Components

┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│   Frontend  │────│ MCP Server  │────│    Neo4j    │
│  (Next.js)  │    │ (FastAPI)   │    │  (Graph)    │
└─────────────┘    └─────────────┘    └─────────────┘
                           │
                    ┌─────────────┐    ┌─────────────┐
                    │    Agent    │────│ PostgreSQL  │
                    │  (Python)   │    │   (Data)    │
                    └─────────────┘    └─────────────┘

Neo4j Graph Database: Stores workflows, instructions, executions, and logs
MCP Server: FastAPI gateway for all Neo4j operations with parameter fixing
Python Agent: Polls for instructions, pauses for human input, executes tasks
PostgreSQL: Sample data source for SQL generation and execution
Next.js Frontend: Chat interface with Cytoscape.js graph visualization

Why It's Valuable

🎯 Demonstrates Production Patterns

Human Oversight: Shows how to build AI systems with meaningful human control
Audit Trails: Complete graph-based logging of all operations and decisions
Error Recovery: System continues gracefully after interruptions or edits
Scalable Architecture: Clean separation of concerns, containerized deployment

🔄 Agentic Workflow Orchestration

Graph-Driven: Workflows stored as connected nodes, not brittle state machines
Dynamic Editing: Instructions can be modified during execution
Sequence Management: Proper instruction chaining and dependency handling
Status Tracking: Real-time visibility into workflow progress

🛡️ Human-in-the-Loop Controls

Configurable Pauses: Built-in review periods before critical operations
Live Editing: Modify AI behavior during execution via graph database
Stop Controls: Terminate workflows at any point
Parameter Updates: Change questions, settings, or instructions mid-flight

📊 Complete Observability

Graph Visualization: Real-time workflow progress with color-coded status
Audit Logging: Every MCP operation logged with timestamps
Execution Tracking: Full history of what was generated and executed
Result Storage: All outputs preserved in queryable graph format

🚀 Production Ready

Containerized: Full Docker Compose setup with health checks
Environment Configuration: Flexible .env-based configuration
Error Handling: Graceful failures and recovery mechanisms
Documentation: Comprehensive setup, usage, and troubleshooting guides

How to Make It Run

Quick Start (5 minutes)

# 1. Clone and navigate to the repo
git clone <repository-url>
cd <repository-name>

# 2. Copy environment template
cp .env.example .env

# 3. Add your LLM API key to .env
# Edit .env and set: LLM_API_KEY=your-openai-or-anthropic-key-here

# 4. Start everything
docker-compose up -d

# 5. Seed Neo4j with demo data (IMPORTANT!)
docker-compose exec mcp python /app/ops/scripts/seed.py

# 6. Open the interface
# Frontend: http://localhost:3000
# Neo4j Browser: http://localhost:7474 (neo4j/password)

Database Seeding Options

Basic Seeding (Quick demo):

docker-compose exec mcp python /app/ops/scripts/seed.py

Creates:

Demo Workflow: A 3-step process (discover schema → generate SQL → review results)
Query Examples: 3 basic SQL templates for testing
Graph Structure: Proper relationships between components

Comprehensive Seeding (Full system):

docker-compose exec mcp python /app/ops/scripts/seed_comprehensive.py

Creates:

Workflow Templates: Multiple workflow patterns (basic query, analysis, reporting)
Instruction Type Library: 6 different instruction types with schemas
Query Library: 6+ categorized SQL examples (basic, analytics, detailed)
Demo Workflows: Ready-to-run and template workflows
System Configuration: Default settings and supported features

⚠️ Fresh Installation: On a brand-new machine, Neo4j starts completely empty. You MUST run a seed script to have any workflows or instructions to interact with.

💡 Recommendation: Use comprehensive seeding for full system exploration, basic seeding for quick demos.

PowerShell Fresh Start (Windows)

# Fresh deployment with API key
powershell -ExecutionPolicy Bypass -File ops/scripts/fresh_start.ps1 -ApiKey "your-api-key-here"

# Or run the demo (assumes system is already running)
powershell -ExecutionPolicy Bypass -File ops/scripts/demo.ps1

Manual Health Check

# Check all services
docker-compose ps

# Validate system
docker-compose exec mcp python /app/ops/scripts/validate.py

# Monitor logs
docker-compose logs -f agent

Test the System

Open http://localhost:3000
Ask a question: "How many customers do we have?"
Watch the workflow:
- Graph visualization shows progress
- Agent pauses for 5 minutes
- You can edit instructions in Neo4j Browser
- Results appear in formatted table

Clean Reset

# Stop and clean everything
docker-compose down
docker-compose up -d
docker-compose exec mcp python /app/ops/scripts/seed.py

Key Features for Developers

Graph Database Schema

Workflow nodes: High-level process containers
Instruction nodes: Individual tasks with parameters and status
Execution nodes: Results of instruction processing
Log nodes: Audit trail of all MCP operations
Relationships: HAS_INSTRUCTION, EXECUTED_AS, NEXT_INSTRUCTION

Configuration Options

Pause Duration: PAUSE_DURATION in .env (default: 300 seconds)
Polling Interval: AGENT_POLL_INTERVAL in .env (default: 30 seconds)
LLM Model: LLM_MODEL in .env (gpt-4, claude-3-sonnet, etc.)

Extension Points

New Instruction Types: Add handlers in agent/main.py
Custom Data Sources: Extend MCP server with new connectors
Frontend Customization: Modify React components in frontend/app/
Workflow Templates: Create reusable instruction sequences

Human Intervention Examples

// Find pending instructions
MATCH (i:Instruction {status: 'pending'}) RETURN i

// Change a question
MATCH (i:Instruction {type: 'generate_sql', status: 'pending'})
SET i.parameters = '{"question": "Show me top 10 customers by revenue"}'

// Stop a workflow
MATCH (w:Workflow {status: 'active'})
SET w.status = 'stopped'

Development Setup

Prerequisites

Docker & Docker Compose
OpenAI or Anthropic API key
Modern web browser

Project Structure

├── agent/          # Python agent that executes instructions
├── frontend/       # Next.js chat interface
├── mcp/           # FastAPI server for Neo4j operations
├── neo4j/         # Neo4j configuration
├── postgres/      # PostgreSQL setup with sample data
├── ops/scripts/   # Operational scripts (seed, validate, demo)
├── docker-compose.yml
├── Makefile       # Convenience commands
└── README.md      # Detailed documentation

Available Commands

# If you have make installed
make up          # Start all services
make seed        # Create demo data
make health      # Check service health
make logs        # View all logs
make clean       # Reset everything

# Using docker-compose directly
docker-compose up -d
docker-compose exec mcp python /app/ops/scripts/seed.py
docker-compose ps
docker-compose logs -f
docker-compose down

Use Cases

🏢 Enterprise AI Governance

Audit trails for compliance
Human oversight for critical decisions
Risk management in AI operations

🔬 Research & Development

Experiment with agentic workflows
Study human-AI collaboration patterns
Prototype autonomous systems with safety controls

📚 Educational Examples

Demonstrate production AI architecture
Teach graph database concepts
Show containerized deployment patterns

🛠️ Template for New Projects

Fork as starting point for agentic systems
Adapt components for specific domains
Scale architecture for production workloads

This system demonstrates that AI agents can be both autonomous and controllable, providing the benefits of automation while maintaining human oversight and complete transparency.