Timothy Eastridge committed on
Commit
da2713e
·
1 Parent(s): ead5455

clean up on seeding and system overview

SYSTEM_OVERVIEW.md ADDED
@@ -0,0 +1,263 @@
1
+ # Graph-Driven Agentic System with Human-in-the-Loop Controls
2
+
3
+ ## What This System Is
4
+
5
+ This is a **production-ready agentic workflow orchestration system** that demonstrates how to build AI agents with human oversight and complete audit trails. The system combines:
6
+
7
+ - **🤖 Autonomous AI Agent**: Processes natural language queries and generates SQL
+ - **📊 Graph Database**: Neo4j stores all workflow metadata and audit trails
+ - **⏸️ Human-in-the-Loop**: Configurable pause points for human review and intervention
+ - **🎯 Single API Gateway**: All operations routed through MCP (Model Context Protocol) server
+ - **🌐 Real-time Interface**: React frontend with live workflow visualization
+ - **🔍 Complete Observability**: Every action logged with timestamps and relationships
13
+
14
+ ## What It Does
15
+
16
+ ### Core Workflow
17
+ 1. **User asks a question** in natural language via the web interface
18
+ 2. **System creates a workflow** with multiple instruction steps in Neo4j
19
+ 3. **Agent discovers the question** and begins processing
20
+ 4. **Pause for human review** (5 minutes by default, configurable)
21
+ 5. **Human can edit instructions** during pause via Neo4j Browser
22
+ 6. **Agent generates SQL** from natural language using LLM
23
+ 7. **Agent executes SQL** against PostgreSQL database
24
+ 8. **Results displayed** in formatted table with complete audit trail
25
+
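+ The agent side of this loop can be pictured roughly as follows. This is an illustrative sketch only (the real logic lives in `agent/main.py` and goes through the MCP gateway described under Architecture Components below; the helper names here are placeholders):
+
+ ```python
+ import os
+ import time
+
+ POLL_INTERVAL = int(os.getenv("AGENT_POLL_INTERVAL", "30"))   # seconds between polls
+ PAUSE_DURATION = int(os.getenv("PAUSE_DURATION", "300"))      # human-review window, seconds
+
+ def agent_loop(call_mcp, handle_instruction):
+     """Poll Neo4j (through MCP) for pending instructions and execute them."""
+     while True:
+         rows = call_mcp("query_graph", {
+             "query": "MATCH (i:Instruction {status: 'pending'}) "
+                      "RETURN i.id AS id, i.type AS type, i.parameters AS parameters "
+                      "ORDER BY i.sequence LIMIT 1"
+         }).get("data", [])
+         if rows:
+             time.sleep(PAUSE_DURATION)      # pause: edit the Instruction node in Neo4j Browser now
+             handle_instruction(rows[0])     # e.g. generate SQL, run it, store an Execution node
+         time.sleep(POLL_INTERVAL)
+ ```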
26
+ ### Architecture Components
27
+ ```
+ ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
+ │  Frontend   │─────│ MCP Server  │─────│    Neo4j    │
+ │  (Next.js)  │     │  (FastAPI)  │     │   (Graph)   │
+ └─────────────┘     └─────────────┘     └─────────────┘
+                            │
+                     ┌─────────────┐     ┌─────────────┐
+                     │    Agent    │─────│ PostgreSQL  │
+                     │  (Python)   │     │   (Data)    │
+                     └─────────────┘     └─────────────┘
+ ```
38
+
39
+ - **Neo4j Graph Database**: Stores workflows, instructions, executions, and logs
40
+ - **MCP Server**: FastAPI gateway for all Neo4j operations with parameter fixing
41
+ - **Python Agent**: Polls for instructions, pauses for human input, executes tasks
42
+ - **PostgreSQL**: Sample data source for SQL generation and execution
43
+ - **Next.js Frontend**: Chat interface with Cytoscape.js graph visualization
44
+
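+ All of these components reach Neo4j through the MCP server's single `/mcp` endpoint. A minimal sketch of that request shape, mirroring what `ops/scripts/seed_comprehensive.py` does (the URL and dev API key below come from that script and are configurable via the `MCP_URL` / `MCP_API_KEY` environment variables):
+
+ ```python
+ import requests
+
+ MCP_URL = "http://localhost:8000/mcp"   # inside Docker the services use http://mcp:8000/mcp
+ API_KEY = "dev-key-123"                 # development default key
+
+ def call_mcp(tool, params=None):
+     """Send one tool call through the MCP gateway and return the parsed JSON response."""
+     response = requests.post(
+         MCP_URL,
+         headers={"X-API-Key": API_KEY, "Content-Type": "application/json"},
+         json={"tool": tool, "params": params or {}},
+         timeout=30,
+     )
+     response.raise_for_status()
+     return response.json()
+
+ # Example: list pending instructions through the gateway
+ print(call_mcp("query_graph", {
+     "query": "MATCH (i:Instruction {status: 'pending'}) RETURN i.id, i.type"
+ }))
+ ```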
45
+ ## Why It's Valuable
46
+
47
+ ### 🎯 **Demonstrates Production Patterns**
48
+ - **Human Oversight**: Shows how to build AI systems with meaningful human control
49
+ - **Audit Trails**: Complete graph-based logging of all operations and decisions
50
+ - **Error Recovery**: System continues gracefully after interruptions or edits
51
+ - **Scalable Architecture**: Clean separation of concerns, containerized deployment
52
+
53
+ ### 🔄 **Agentic Workflow Orchestration**
54
+ - **Graph-Driven**: Workflows stored as connected nodes, not brittle state machines
55
+ - **Dynamic Editing**: Instructions can be modified during execution
56
+ - **Sequence Management**: Proper instruction chaining and dependency handling
57
+ - **Status Tracking**: Real-time visibility into workflow progress
58
+
59
+ ### 🛡️ **Human-in-the-Loop Controls**
60
+ - **Configurable Pauses**: Built-in review periods before critical operations
61
+ - **Live Editing**: Modify AI behavior during execution via graph database
62
+ - **Stop Controls**: Terminate workflows at any point
63
+ - **Parameter Updates**: Change questions, settings, or instructions mid-flight
64
+
65
+ ### 📊 **Complete Observability**
66
+ - **Graph Visualization**: Real-time workflow progress with color-coded status
67
+ - **Audit Logging**: Every MCP operation logged with timestamps
68
+ - **Execution Tracking**: Full history of what was generated and executed
69
+ - **Result Storage**: All outputs preserved in queryable graph format
70
+
71
+ ### 🚀 **Production Ready**
72
+ - **Containerized**: Full Docker Compose setup with health checks
73
+ - **Environment Configuration**: Flexible .env-based configuration
74
+ - **Error Handling**: Graceful failures and recovery mechanisms
75
+ - **Documentation**: Comprehensive setup, usage, and troubleshooting guides
76
+
77
+ ## How to Make It Run
78
+
79
+ ### Quick Start (5 minutes)
80
+
81
+ ```bash
82
+ # 1. Clone and navigate to the repo
83
+ git clone <repository-url>
84
+ cd <repository-name>
85
+
86
+ # 2. Copy environment template
87
+ cp .env.example .env
88
+
89
+ # 3. Add your LLM API key to .env
90
+ # Edit .env and set: LLM_API_KEY=your-openai-or-anthropic-key-here
91
+
92
+ # 4. Start everything
93
+ docker-compose up -d
94
+
95
+ # 5. Seed Neo4j with demo data (IMPORTANT!)
96
+ docker-compose exec mcp python /app/ops/scripts/seed.py
97
+
98
+ # 6. Open the interface
99
+ # Frontend: http://localhost:3000
100
+ # Neo4j Browser: http://localhost:7474 (neo4j/password)
101
+ ```
102
+
103
+ ### Database Seeding Options
104
+
105
+ **Basic Seeding** (Quick demo):
106
+ ```bash
107
+ docker-compose exec mcp python /app/ops/scripts/seed.py
108
+ ```
109
+ Creates:
110
+ - **Demo Workflow**: A 3-step process (discover schema → generate SQL → review results)
111
+ - **Query Examples**: 3 basic SQL templates for testing
112
+ - **Graph Structure**: Proper relationships between components
113
+
114
+ **Comprehensive Seeding** (Full system):
115
+ ```bash
116
+ docker-compose exec mcp python /app/ops/scripts/seed_comprehensive.py
117
+ ```
118
+ Creates:
119
+ - **Workflow Templates**: Multiple workflow patterns (basic query, analysis, reporting)
120
+ - **Instruction Type Library**: 6 different instruction types with schemas
121
+ - **Query Library**: 6+ categorized SQL examples (basic, analytics, detailed)
122
+ - **Demo Workflows**: Ready-to-run and template workflows
123
+ - **System Configuration**: Default settings and supported features
124
+
125
+ **⚠️ Fresh Installation**: On a brand-new machine, Neo4j starts completely empty. You MUST run a seed script to have any workflows or instructions to interact with.
126
+
127
+ **💡 Recommendation**: Use comprehensive seeding for full system exploration, basic seeding for quick demos.
128
+
129
+ ### PowerShell Fresh Start (Windows)
130
+ ```powershell
131
+ # Fresh deployment with API key
132
+ powershell -ExecutionPolicy Bypass -File ops/scripts/fresh_start.ps1 -ApiKey "your-api-key-here"
133
+
134
+ # Or run the demo (assumes system is already running)
135
+ powershell -ExecutionPolicy Bypass -File ops/scripts/demo.ps1
136
+ ```
137
+
138
+ ### Manual Health Check
139
+ ```bash
140
+ # Check all services
141
+ docker-compose ps
142
+
143
+ # Validate system
144
+ docker-compose exec mcp python /app/ops/scripts/validate.py
145
+
146
+ # Monitor logs
147
+ docker-compose logs -f agent
148
+ ```
149
+
150
+ ### Test the System
151
+
152
+ 1. **Open http://localhost:3000**
153
+ 2. **Ask a question**: "How many customers do we have?"
154
+ 3. **Watch the workflow**:
155
+ - Graph visualization shows progress
156
+ - Agent pauses for 5 minutes
157
+ - You can edit instructions in Neo4j Browser
158
+ - Results appear in formatted table
159
+
160
+ ### Clean Reset
161
+ ```bash
162
+ # Stop and clean everything
163
+ docker-compose down
164
+ docker-compose up -d
165
+ docker-compose exec mcp python /app/ops/scripts/seed.py
166
+ ```
167
+
168
+ ## Key Features for Developers
169
+
170
+ ### Graph Database Schema
171
+ - **Workflow** nodes: High-level process containers
172
+ - **Instruction** nodes: Individual tasks with parameters and status
173
+ - **Execution** nodes: Results of instruction processing
174
+ - **Log** nodes: Audit trail of all MCP operations
175
+ - **Relationships**: `HAS_INSTRUCTION`, `EXECUTED_AS`, `NEXT_INSTRUCTION`
176
+
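+ To make the schema concrete, this is roughly how the seed scripts create one workflow with a single instruction through MCP (the node and relationship shapes follow `ops/scripts/seed_comprehensive.py`; the IDs and values here are just examples):
+
+ ```python
+ import json
+ import time
+
+ def seed_minimal_workflow(call_mcp):
+     now = time.strftime("%Y-%m-%dT%H:%M:%SZ")
+     call_mcp("write_graph", {
+         "action": "create_node", "label": "Workflow",
+         "properties": {"id": "wf-example", "name": "Example Workflow",
+                        "status": "active", "created_at": now},
+     })
+     call_mcp("write_graph", {
+         "action": "create_node", "label": "Instruction",
+         "properties": {"id": "wf-example-1", "type": "generate_sql", "sequence": 1,
+                        "status": "pending", "pause_duration": 300,
+                        "parameters": json.dumps({"question": "How many customers do we have?"}),
+                        "created_at": now},
+     })
+     # Wire up the HAS_INSTRUCTION relationship listed above
+     call_mcp("query_graph", {
+         "query": "MATCH (w:Workflow {id: 'wf-example'}), (i:Instruction {id: 'wf-example-1'}) "
+                  "CREATE (w)-[:HAS_INSTRUCTION]->(i)"
+     })
+ ```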
177
+ ### Configuration Options
178
+ - **Pause Duration**: `PAUSE_DURATION` in .env (default: 300 seconds)
179
+ - **Polling Interval**: `AGENT_POLL_INTERVAL` in .env (default: 30 seconds)
180
+ - **LLM Model**: `LLM_MODEL` in .env (gpt-4, claude-3-sonnet, etc.)
181
+
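+ These settings live in `.env`; a minimal sketch of how a service might read them (variable names from `.env.example`, defaults as documented above):
+
+ ```python
+ import os
+
+ PAUSE_DURATION = int(os.getenv("PAUSE_DURATION", "300"))           # human-review pause, seconds
+ AGENT_POLL_INTERVAL = int(os.getenv("AGENT_POLL_INTERVAL", "30"))  # agent polling interval, seconds
+ LLM_MODEL = os.getenv("LLM_MODEL", "gpt-4")                        # e.g. gpt-4 or claude-3-sonnet
+ LLM_API_KEY = os.environ["LLM_API_KEY"]                            # required, no default
+ ```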
182
+ ### Extension Points
183
+ - **New Instruction Types**: Add handlers in `agent/main.py`
184
+ - **Custom Data Sources**: Extend MCP server with new connectors
185
+ - **Frontend Customization**: Modify React components in `frontend/app/`
186
+ - **Workflow Templates**: Create reusable instruction sequences
187
+
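+ For the first of these extension points, a purely hypothetical sketch of how a new instruction type could be wired in (the actual dispatch code in `agent/main.py` may be organized differently; the handler registry and the `summarize_results` type below are invented for illustration):
+
+ ```python
+ import json
+
+ HANDLERS = {}
+
+ def handler(instruction_type):
+     """Register a function to run for a given Instruction.type value."""
+     def register(fn):
+         HANDLERS[instruction_type] = fn
+         return fn
+     return register
+
+ @handler("summarize_results")        # hypothetical new instruction type
+ def summarize_results(instruction):
+     params = json.loads(instruction.get("parameters") or "{}")
+     # ...summarize prior results here and return something storable as an Execution node
+     return {"summary": f"requested with params: {params}"}
+
+ def dispatch(instruction):
+     fn = HANDLERS.get(instruction["type"])
+     if fn is None:
+         raise ValueError(f"No handler for instruction type {instruction['type']!r}")
+     return fn(instruction)
+ ```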
188
+ ### Human Intervention Examples
189
+ ```cypher
190
+ // Find pending instructions
191
+ MATCH (i:Instruction {status: 'pending'}) RETURN i
192
+
193
+ // Change a question
194
+ MATCH (i:Instruction {type: 'generate_sql', status: 'pending'})
195
+ SET i.parameters = '{"question": "Show me top 10 customers by revenue"}'
196
+
197
+ // Stop a workflow
198
+ MATCH (w:Workflow {status: 'active'})
199
+ SET w.status = 'stopped'
200
+ ```
201
+
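+ The same interventions can also be scripted against the MCP gateway instead of typed into the Neo4j Browser; a sketch using a `call_mcp` helper like the one shown under Architecture Components (the replacement question is only an example):
+
+ ```python
+ import json
+
+ def change_pending_question(call_mcp, new_question):
+     """Rewrite the parameters of the pending generate_sql instruction, mirroring the Cypher above."""
+     return call_mcp("query_graph", {
+         "query": "MATCH (i:Instruction {type: 'generate_sql', status: 'pending'}) "
+                  "SET i.parameters = $params RETURN i.id",
+         "parameters": {"params": json.dumps({"question": new_question})},
+     })
+
+ def stop_active_workflows(call_mcp):
+     return call_mcp("query_graph", {
+         "query": "MATCH (w:Workflow {status: 'active'}) SET w.status = 'stopped' RETURN w.id"
+     })
+ ```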
202
+ ## Development Setup
203
+
204
+ ### Prerequisites
205
+ - Docker & Docker Compose
206
+ - OpenAI or Anthropic API key
207
+ - Modern web browser
208
+
209
+ ### Project Structure
210
+ ```
211
+ ├── agent/              # Python agent that executes instructions
+ ├── frontend/           # Next.js chat interface
+ ├── mcp/                # FastAPI server for Neo4j operations
+ ├── neo4j/              # Neo4j configuration
+ ├── postgres/           # PostgreSQL setup with sample data
+ ├── ops/scripts/        # Operational scripts (seed, validate, demo)
+ ├── docker-compose.yml
+ ├── Makefile            # Convenience commands
+ └── README.md           # Detailed documentation
220
+ ```
221
+
222
+ ### Available Commands
223
+ ```bash
224
+ # If you have make installed
225
+ make up # Start all services
226
+ make seed # Create demo data
227
+ make health # Check service health
228
+ make logs # View all logs
229
+ make clean # Reset everything
230
+
231
+ # Using docker-compose directly
232
+ docker-compose up -d
233
+ docker-compose exec mcp python /app/ops/scripts/seed.py
234
+ docker-compose ps
235
+ docker-compose logs -f
236
+ docker-compose down
237
+ ```
238
+
239
+ ## Use Cases
240
+
241
+ ### 🏢 **Enterprise AI Governance**
242
+ - Audit trails for compliance
243
+ - Human oversight for critical decisions
244
+ - Risk management in AI operations
245
+
246
+ ### 🔬 **Research & Development**
247
+ - Experiment with agentic workflows
248
+ - Study human-AI collaboration patterns
249
+ - Prototype autonomous systems with safety controls
250
+
251
+ ### 📚 **Educational Examples**
252
+ - Demonstrate production AI architecture
253
+ - Teach graph database concepts
254
+ - Show containerized deployment patterns
255
+
256
+ ### 🛠️ **Template for New Projects**
257
+ - Fork as starting point for agentic systems
258
+ - Adapt components for specific domains
259
+ - Scale architecture for production workloads
260
+
261
+ ---
262
+
263
+ **This system demonstrates that AI agents can be both autonomous and controllable, providing the benefits of automation while maintaining human oversight and complete transparency.**
app_requirements/6_feature_Streamlit.md ADDED
@@ -0,0 +1,64 @@
1
+ Feature 6: Lightweight Streamlit MCP Monitor & Query Tester
2
+ 6. Feature: Streamlit MCP Monitor & Query Tester
3
+ 6.1 Story: As a developer, I need a lightweight Streamlit app to monitor MCP connections and test the agentic query engine.
4
+
5
+ 6.1.1 Task: Create /streamlit/requirements.txt with streamlit==1.28.0, requests==2.31.0, pandas==2.1.0, python-dotenv==1.0.0 - NO neo4j driver as all Neo4j access MUST go through MCP server
6
+ 6.1.2 Task: Create /streamlit/app.py that ONLY communicates with MCP server at http://mcp:8000/mcp, never directly to Neo4j - use st.set_page_config(page_title="MCP Monitor", layout="wide"), create two tabs using st.tabs(["🔌 Connection Status", "🤖 Query Tester"])
7
+ 6.1.3 Task: Create /streamlit/Dockerfile with FROM python:3.11-slim, WORKDIR /app, install requirements, expose port 8501, CMD ["streamlit", "run", "app.py", "--server.address=0.0.0.0"]
8
+ 6.1.4 Task: Add streamlit service to docker-compose.yml with build: ./streamlit, ports: 8501:8501, depends_on: mcp, environment: MCP_URL=http://mcp:8000/mcp, MCP_API_KEY=dev-key-123, explicitly NOT including NEO4J_BOLT_URL as direct access is forbidden
9
+
10
+ 6.2 Story: As a user, I need to monitor the health and performance of all MCP connections in real-time.
11
+
12
+ 6.2.1 Task: Create connection test function that calls MCP tools ONLY - test Neo4j via call_mcp("get_schema"), test PostgreSQL via call_mcp("query_postgres", {"query": "SELECT 1"}), never use direct database connections
13
+ 6.2.2 Task: Display connection status in 3-column layout using st.columns(), show Neo4j (via MCP), PostgreSQL (via MCP), MCP Server status with st.metric(label, value="Online"/"Offline", delta=f"{response_ms}ms")
14
+ 6.2.3 Task: Implement auto-refresh using st.empty() placeholder with while True loop, time.sleep(5), st.rerun() to update every 5 seconds, show "Last checked: {timestamp}" with st.caption()
15
+ 6.2.4 Task: Add manual refresh button with st.button("🔄 Refresh Now") that immediately re-runs all MCP-based connection tests and updates metrics
16
+ 6.2.5 Task: Query performance stats via MCP call_mcp("query_graph", {"query": "MATCH (l:Log) WHERE l.timestamp > datetime() - duration('PT1H') RETURN count(l) as count"}), display with st.info() - emphasize this is the ONLY way to query Neo4j
17
+
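+ A minimal sketch of this connection panel, assuming a simplified call_mcp(tool, params) helper with the request shape used by ops/scripts/seed_comprehensive.py (the spec's own helper also returns timing information, which is folded into the probe below):
+
+ ```python
+ import os
+ import time
+ import requests
+ import streamlit as st
+
+ MCP_URL = os.getenv("MCP_URL", "http://mcp:8000/mcp")
+ MCP_API_KEY = os.getenv("MCP_API_KEY", "dev-key-123")
+
+ def call_mcp(tool, params=None):
+     r = requests.post(MCP_URL, headers={"X-API-Key": MCP_API_KEY},
+                       json={"tool": tool, "params": params or {}}, timeout=5)
+     r.raise_for_status()
+     return r.json()
+
+ def probe(tool, params=None):
+     """One MCP-based connection check: status string plus round-trip time in ms."""
+     start = time.time()
+     try:
+         call_mcp(tool, params)          # any successful call implies the MCP server is up
+         return "Online", int((time.time() - start) * 1000)
+     except Exception:
+         return "Offline", None
+
+ cols = st.columns(3)
+ checks = [("Neo4j (via MCP)", "get_schema", None),
+           ("PostgreSQL (via MCP)", "query_postgres", {"query": "SELECT 1"}),
+           ("MCP Server", "get_schema", None)]
+ for col, (label, tool, params) in zip(cols, checks):
+     status, ms = probe(tool, params)
+     col.metric(label, status, delta=f"{ms}ms" if ms is not None else None)
+ st.caption(f"Last checked: {time.strftime('%H:%M:%S')}")
+ ```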
18
+ 6.3 Story: As a user, I need to test natural language queries through the agentic engine without using the main chat interface.
19
+
20
+ 6.3.1 Task: Create query input section with st.text_area("Enter your question:", height=100), st.button("🚀 Execute Query", type="primary") to trigger workflow creation via MCP server only
21
+ 6.3.2 Task: On execute, create workflow via call_mcp("write_graph", {action: "create_node", label: "Workflow", properties: {...}}), then create instructions via MCP write_graph, store workflow_id in st.session_state - all graph writes MUST use MCP
22
+ 6.3.3 Task: Poll workflow status via call_mcp("query_graph", {"query": "MATCH (i:Instruction) WHERE i.workflow_id=$id RETURN i.status", "parameters": {"id": workflow_id}}) every 2 seconds, update st.progress() based on results
23
+ 6.3.4 Task: Fetch results via call_mcp("query_graph", {"query": "MATCH (e:Execution) WHERE e.workflow_id=$id RETURN e.result", "parameters": {"id": workflow_id}}), display SQL in st.code() and data in st.dataframe()
24
+ 6.3.5 Task: Add "Clear Results" button that resets st.session_state.workflow_id and clears displayed results, ready for next query
25
+
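+ A sketch of the status-polling loop from 6.3.3, reusing the simplified call_mcp helper from the previous sketch and assuming each Instruction node carries a workflow_id property as described above:
+
+ ```python
+ import time
+ import streamlit as st
+
+ def poll_workflow(workflow_id, max_polls=150):
+     """Poll instruction statuses through MCP every 2 seconds and drive a progress bar."""
+     bar = st.progress(0.0)
+     for _ in range(max_polls):
+         result = call_mcp("query_graph", {
+             "query": "MATCH (i:Instruction) WHERE i.workflow_id = $id RETURN i.status AS status",
+             "parameters": {"id": workflow_id},
+         })
+         statuses = [row["status"] for row in result.get("data", [])]
+         if statuses:
+             done = sum(1 for s in statuses if s in ("completed", "failed"))
+             bar.progress(done / len(statuses))
+             if done == len(statuses):
+                 return statuses
+         time.sleep(2)
+     return None
+ ```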
26
+ 6.4 Story: As a developer, I need to examine the agentic process flow to understand how answers are derived.
27
+
28
+ 6.4.1 Task: Fetch execution trace via call_mcp("query_graph", {"query": "MATCH (w:Workflow {id: $id})-[:HAS_INSTRUCTION]->(i)-[:EXECUTED_AS]->(e) RETURN i, e ORDER BY i.sequence", "parameters": {"id": workflow_id}}) - this is the ONLY way to get execution data
29
+ 6.4.2 Task: Display each step in expandable sections using st.expander(f"Step {i.sequence}: {i.type}"), show instruction parameters, execution times, and status from MCP query results
30
+ 6.4.3 Task: For SQL generation steps, query schema context via call_mcp("query_graph", {"query": "MATCH (t:Table)-[:HAS_COLUMN]->(c:Column) RETURN t.name, collect(c.name)"}), display in st.code() to show what LLM received
31
+ 6.4.4 Task: Show execution timeline by calculating time differences from execution nodes returned by MCP query_graph, display as: "Schema Discovery (5s) → [Pause 30s] → SQL Generation (3s) → Results (1s)"
32
+ 6.4.5 Task: Add "View Raw Execution Data" toggle that shows full JSON response from call_mcp("query_graph", {"query": "MATCH (e:Execution {workflow_id: $id}) RETURN e"}), displayed with st.json()
33
+
34
+ 6.5 Story: As a developer, I need the Streamlit app to handle errors gracefully and provide useful debugging information.
35
+
36
+ 6.5.1 Task: Wrap all call_mcp() invocations in try/except blocks, on exception show st.error(f"MCP Server Error: {str(e)}") emphasizing no direct database access is possible
37
+ 6.5.2 Task: Implement retry logic for failed MCP calls with 3 attempts and exponential backoff, show st.warning("Retrying MCP connection...") during retries, cache last successful response
38
+ 6.5.3 Task: Add debug panel with st.expander("🔧 Debug Information") showing last 5 MCP requests/responses from st.session_state.debug_log, emphasize all database operations go through MCP
39
+ 6.5.4 Task: On workflow execution failure, query error details via call_mcp("query_graph", {"query": "MATCH (e:Execution {status: 'failed'}) RETURN e.error"}), display with suggestions for common issues
40
+ 6.5.5 Task: Create MCP diagnostics on startup - if call_mcp("get_schema") fails, show st.error("Cannot reach Neo4j through MCP server. The app cannot directly connect to Neo4j - all access must go through MCP at {MCP_URL}")
41
+
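+ A sketch of the retry wrapper from 6.5.2 (again assuming the call_mcp helper above; the attempt count and warning text follow the task description, the backoff delays are an example):
+
+ ```python
+ import time
+ import streamlit as st
+
+ def call_mcp_with_retry(tool, params=None, attempts=3):
+     """Retry failed MCP calls with exponential backoff, caching the last good response."""
+     delay = 1.0
+     last_error = None
+     for attempt in range(attempts):
+         try:
+             result = call_mcp(tool, params)
+             st.session_state["last_good_response"] = result   # cache last successful response
+             return result
+         except Exception as exc:
+             last_error = exc
+             if attempt < attempts - 1:
+                 st.warning("Retrying MCP connection...")
+                 time.sleep(delay)
+                 delay *= 2                                     # 1s, then 2s between attempts
+     st.error(f"MCP Server Error: {last_error}")                # no direct database fallback exists
+     return st.session_state.get("last_good_response")
+ ```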
42
+
43
+ Critical Implementation Notes
44
+ The Streamlit app MUST:
45
+
46
+ - NEVER import or use neo4j Python driver
+ - NEVER import or use psycopg2 directly
+ - ONLY communicate with databases through MCP server endpoints
+ - ALWAYS use call_mcp() for any data retrieval or storage
+ - EXPLICITLY show in error messages that direct database access is not permitted
51
+
52
+ Example of CORRECT implementation:
+ ```python
+ # ✅ CORRECT - All Neo4j access through MCP
+ def get_workflow_status(workflow_id):
+     result, _ = call_mcp("query_graph", {
+         "query": "MATCH (w:Workflow {id: $id}) RETURN w.status",
+         "parameters": {"id": workflow_id}
+     })
+     return result['data'][0]['status'] if result else None
+ ```
+
+ Example of INCORRECT implementation:
+ ```python
+ # ❌ WRONG - Direct Neo4j access is forbidden
+ from neo4j import GraphDatabase
+ driver = GraphDatabase.driver("bolt://neo4j:7687")  # NEVER DO THIS
+ ```
+
+ This ensures the Streamlit app respects the architecture principle that all Neo4j access MUST go through the MCP server gateway, maintaining the single point of control and audit trail.
app_requirements/{6_feature_parking_lot_items.txt → 99_feature_parking_lot_items.txt} RENAMED
File without changes
ops/scripts/demo.ps1 CHANGED
@@ -60,7 +60,8 @@ if ($agentStatus) {
}
 
Write-Host ""
- Write-Host "Step 3: Seeding demo workflow..." -ForegroundColor Blue
+ Write-Host "Step 3: Seeding Neo4j database..." -ForegroundColor Blue
+ Write-Host " (This populates the empty graph with demo workflows)" -ForegroundColor Gray
docker-compose exec mcp python /app/ops/scripts/seed.py
 
Write-Host ""
ops/scripts/fresh_start.ps1 ADDED
@@ -0,0 +1,170 @@
1
+ # Fresh Start Script for Graph-Driven Agentic System (Windows PowerShell)
2
+ # Run this script to deploy the system from scratch
3
+
4
+ param(
5
+ [string]$ApiKey = "",
6
+ [string]$Model = "gpt-4"
7
+ )
8
+
9
+ Write-Host "πŸš€ Fresh Start Deployment" -ForegroundColor Green
10
+ Write-Host "=========================" -ForegroundColor Green
11
+ Write-Host ""
12
+
13
+ # Step 1: Check prerequisites
14
+ Write-Host "πŸ“‹ Step 1: Checking prerequisites..." -ForegroundColor Blue
15
+
16
+ try {
17
+ docker --version | Out-Null
18
+ Write-Host "βœ… Docker found" -ForegroundColor Green
19
+ } catch {
20
+ Write-Host "❌ Docker not found. Please install Docker Desktop" -ForegroundColor Red
21
+ exit 1
22
+ }
23
+
24
+ try {
25
+ docker-compose --version | Out-Null
26
+ Write-Host "βœ… Docker Compose found" -ForegroundColor Green
27
+ } catch {
28
+ Write-Host "❌ Docker Compose not found" -ForegroundColor Red
29
+ exit 1
30
+ }
31
+
32
+ # Step 2: Setup environment
33
+ Write-Host ""
34
+ Write-Host "βš™οΈ Step 2: Setting up environment..." -ForegroundColor Blue
35
+
36
+ if (Test-Path ".env") {
37
+ Write-Host "⚠️ .env file already exists, backing up to .env.backup" -ForegroundColor Yellow
38
+ Copy-Item ".env" ".env.backup"
39
+ }
40
+
41
+ Copy-Item ".env.example" ".env"
42
+ Write-Host "βœ… Created .env from template" -ForegroundColor Green
43
+
44
+ # Update API key if provided
45
+ if ($ApiKey -ne "") {
46
+ (Get-Content ".env") -replace "LLM_API_KEY=.*", "LLM_API_KEY=$ApiKey" | Set-Content ".env"
47
+ (Get-Content ".env") -replace "LLM_MODEL=.*", "LLM_MODEL=$Model" | Set-Content ".env"
48
+ Write-Host "βœ… Updated LLM configuration in .env" -ForegroundColor Green
49
+ } else {
50
+ Write-Host "⚠️ No API key provided. You'll need to edit .env manually" -ForegroundColor Yellow
51
+ Write-Host " Add your OpenAI or Anthropic API key to the LLM_API_KEY variable" -ForegroundColor Gray
52
+ }
53
+
54
+ # Step 3: Clean existing containers
55
+ Write-Host ""
56
+ Write-Host "🧹 Step 3: Cleaning existing containers..." -ForegroundColor Blue
57
+
58
+ docker-compose down 2>$null
59
+ docker system prune -f 2>$null
60
+ Write-Host "βœ… Cleaned existing containers" -ForegroundColor Green
61
+
62
+ # Step 4: Build services
63
+ Write-Host ""
64
+ Write-Host "πŸ”¨ Step 4: Building services..." -ForegroundColor Blue
65
+
66
+ docker-compose build
67
+ if ($LASTEXITCODE -ne 0) {
68
+ Write-Host "❌ Build failed" -ForegroundColor Red
69
+ exit 1
70
+ }
71
+ Write-Host "βœ… All services built successfully" -ForegroundColor Green
72
+
73
+ # Step 5: Start services
74
+ Write-Host ""
75
+ Write-Host "πŸš€ Step 5: Starting services..." -ForegroundColor Blue
76
+
77
+ docker-compose up -d
78
+ if ($LASTEXITCODE -ne 0) {
79
+ Write-Host "❌ Failed to start services" -ForegroundColor Red
80
+ exit 1
81
+ }
82
+ Write-Host "βœ… All services started" -ForegroundColor Green
83
+
84
+ # Step 6: Wait for health checks
85
+ Write-Host ""
86
+ Write-Host "⏳ Step 6: Waiting for services to be healthy (60 seconds)..." -ForegroundColor Blue
87
+
88
+ $healthyServices = 0
89
+ $maxWait = 60
90
+ $elapsed = 0
91
+
92
+ while ($elapsed -lt $maxWait -and $healthyServices -lt 3) {
93
+ Start-Sleep 5
94
+ $elapsed += 5
95
+
96
+ $healthyServices = 0
97
+
98
+ # Check Neo4j
99
+ try {
100
+ docker-compose exec neo4j cypher-shell -u neo4j -p password "MATCH (n) RETURN count(n) LIMIT 1" 2>$null | Out-Null
101
+ if ($LASTEXITCODE -eq 0) { $healthyServices++ }
102
+ } catch {}
103
+
104
+ # Check PostgreSQL
105
+ try {
106
+ docker-compose exec postgres pg_isready -U postgres 2>$null | Out-Null
107
+ if ($LASTEXITCODE -eq 0) { $healthyServices++ }
108
+ } catch {}
109
+
110
+ # Check MCP
111
+ try {
112
+ $response = Invoke-WebRequest -Uri "http://localhost:8000/health" -UseBasicParsing -TimeoutSec 2 2>$null
113
+ if ($response.StatusCode -eq 200) { $healthyServices++ }
114
+ } catch {}
115
+
116
+ Write-Host " Healthy services: $healthyServices/3 (${elapsed}s elapsed)" -ForegroundColor Gray
117
+ }
118
+
119
+ if ($healthyServices -eq 3) {
120
+ Write-Host "βœ… All core services are healthy" -ForegroundColor Green
121
+ } else {
122
+ Write-Host "⚠️ Some services may not be fully ready, but continuing..." -ForegroundColor Yellow
123
+ }
124
+
125
+ # Step 7: Seed database
126
+ Write-Host ""
127
+ Write-Host "🌱 Step 7: Seeding Neo4j database..." -ForegroundColor Blue
128
+
129
+ docker-compose exec mcp python /app/ops/scripts/seed.py
130
+ if ($LASTEXITCODE -eq 0) {
131
+ Write-Host "βœ… Database seeded successfully" -ForegroundColor Green
132
+ } else {
133
+ Write-Host "❌ Database seeding failed" -ForegroundColor Red
134
+ Write-Host " You can try manual seeding later with:" -ForegroundColor Gray
135
+ Write-Host " docker-compose exec mcp python /app/ops/scripts/seed.py" -ForegroundColor Gray
136
+ }
137
+
138
+ # Step 8: Final status check
139
+ Write-Host ""
140
+ Write-Host "πŸ“Š Step 8: Final status check..." -ForegroundColor Blue
141
+
142
+ docker-compose ps
143
+
144
+ # Step 9: Success message
145
+ Write-Host ""
146
+ Write-Host "🎉 DEPLOYMENT COMPLETE!" -ForegroundColor Green
+ Write-Host "======================" -ForegroundColor Green
+ Write-Host ""
+ Write-Host "📱 Access Points:" -ForegroundColor Yellow
+ Write-Host " • Frontend Interface: http://localhost:3000" -ForegroundColor White
+ Write-Host " • Neo4j Browser: http://localhost:7474" -ForegroundColor White
+ Write-Host " Login: neo4j / password" -ForegroundColor Gray
+ Write-Host ""
+ Write-Host "🔧 Management Commands:" -ForegroundColor Yellow
+ Write-Host " • View logs: docker-compose logs -f" -ForegroundColor White
+ Write-Host " • Stop system: docker-compose down" -ForegroundColor White
+ Write-Host " • Check health: docker-compose ps" -ForegroundColor White
+ Write-Host ""
+ Write-Host "🎯 Quick Test:" -ForegroundColor Yellow
+ Write-Host " 1. Open http://localhost:3000" -ForegroundColor White
+ Write-Host " 2. Ask: 'How many customers do we have?'" -ForegroundColor White
+ Write-Host " 3. Watch the agent process the workflow!" -ForegroundColor White
163
+ Write-Host ""
164
+
165
+ if ($ApiKey -eq "") {
166
+ Write-Host "⚠️ REMINDER: Update your LLM API key in .env before testing!" -ForegroundColor Yellow
167
+ Write-Host ""
168
+ }
169
+
170
+ Write-Host "System is ready for use! 🚀" -ForegroundColor Green
ops/scripts/fresh_start_check.py ADDED
@@ -0,0 +1,256 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ Fresh Start Validation Script
4
+ Checks all requirements for launching the system from scratch
5
+ """
6
+
7
+ import os
8
+ import sys
9
+ import subprocess
10
+ import json
11
+
12
+ def check_file_exists(filepath, description):
13
+ """Check if a critical file exists"""
14
+ if os.path.exists(filepath):
15
+ print(f"βœ… {description}: {filepath}")
16
+ return True
17
+ else:
18
+ print(f"❌ MISSING {description}: {filepath}")
19
+ return False
20
+
21
+ def check_docker_files():
22
+ """Check all Docker-related files"""
23
+ print("🐳 Checking Docker files...")
24
+ files = [
25
+ ("docker-compose.yml", "Main orchestration file"),
26
+ ("agent/Dockerfile", "Agent service Docker config"),
27
+ ("mcp/Dockerfile", "MCP service Docker config"),
28
+ ("frontend/Dockerfile", "Frontend service Docker config"),
29
+ ("neo4j/Dockerfile", "Neo4j service Docker config"),
30
+ (".env.example", "Environment template")
31
+ ]
32
+
33
+ all_good = True
34
+ for filepath, desc in files:
35
+ all_good &= check_file_exists(filepath, desc)
36
+
37
+ return all_good
38
+
39
+ def check_frontend_files():
40
+ """Check frontend critical files"""
41
+ print("\n🌐 Checking Frontend files...")
42
+ files = [
43
+ ("frontend/package.json", "Frontend dependencies"),
44
+ ("frontend/tsconfig.json", "TypeScript config"),
45
+ ("frontend/tailwind.config.js", "Tailwind CSS config"),
46
+ ("frontend/next.config.js", "Next.js config"),
47
+ ("frontend/app/page.tsx", "Main chat interface"),
48
+ ("frontend/app/layout.tsx", "Root layout"),
49
+ ("frontend/types/cytoscape-fcose.d.ts", "Cytoscape types")
50
+ ]
51
+
52
+ all_good = True
53
+ for filepath, desc in files:
54
+ all_good &= check_file_exists(filepath, desc)
55
+
56
+ return all_good
57
+
58
+ def check_backend_files():
59
+ """Check backend service files"""
60
+ print("\nπŸ”§ Checking Backend files...")
61
+ files = [
62
+ ("agent/main.py", "Agent service main file"),
63
+ ("agent/requirements.txt", "Agent dependencies"),
64
+ ("mcp/main.py", "MCP service main file"),
65
+ ("mcp/requirements.txt", "MCP dependencies"),
66
+ ("postgres/init.sql", "PostgreSQL initialization"),
67
+ ]
68
+
69
+ all_good = True
70
+ for filepath, desc in files:
71
+ all_good &= check_file_exists(filepath, desc)
72
+
73
+ return all_good
74
+
75
+ def check_operational_files():
76
+ """Check operational scripts"""
77
+ print("\nπŸ› οΈ Checking Operational files...")
78
+ files = [
79
+ ("ops/scripts/seed.py", "Basic seeding script"),
80
+ ("ops/scripts/seed_comprehensive.py", "Comprehensive seeding script"),
81
+ ("ops/scripts/validate.py", "System validation script"),
82
+ ("ops/scripts/demo.ps1", "PowerShell demo script"),
83
+ ("Makefile", "Build automation"),
84
+ ("README.md", "Main documentation"),
85
+ ("SYSTEM_OVERVIEW.md", "System overview")
86
+ ]
87
+
88
+ all_good = True
89
+ for filepath, desc in files:
90
+ all_good &= check_file_exists(filepath, desc)
91
+
92
+ return all_good
93
+
94
+ def check_env_variables():
95
+ """Check if .env.example has all required variables"""
96
+ print("\nβš™οΈ Checking Environment variables...")
97
+
98
+ if not os.path.exists(".env.example"):
99
+ print("❌ .env.example file missing")
100
+ return False
101
+
102
+ with open(".env.example", "r") as f:
103
+ env_content = f.read()
104
+
105
+ required_vars = [
106
+ "NEO4J_AUTH",
107
+ "NEO4J_BOLT_URL",
108
+ "POSTGRES_PASSWORD",
109
+ "POSTGRES_CONNECTION",
110
+ "MCP_API_KEYS",
111
+ "MCP_PORT",
112
+ "AGENT_POLL_INTERVAL",
113
+ "PAUSE_DURATION",
114
+ "LLM_API_KEY",
115
+ "LLM_MODEL"
116
+ ]
117
+
118
+ all_good = True
119
+ for var in required_vars:
120
+ if var in env_content:
121
+ print(f"βœ… Environment variable: {var}")
122
+ else:
123
+ print(f"❌ MISSING environment variable: {var}")
124
+ all_good = False
125
+
126
+ return all_good
127
+
128
+ def check_docker_compose_structure():
129
+ """Check docker-compose.yml structure"""
130
+ print("\nπŸ”— Checking Docker Compose structure...")
131
+
132
+ if not os.path.exists("docker-compose.yml"):
133
+ print("❌ docker-compose.yml missing")
134
+ return False
135
+
136
+ try:
137
+ import yaml
138
+ with open("docker-compose.yml", "r") as f:
139
+ compose = yaml.safe_load(f)
140
+
141
+ required_services = ["neo4j", "postgres", "mcp", "agent", "frontend"]
142
+ all_good = True
143
+
144
+ for service in required_services:
145
+ if service in compose.get("services", {}):
146
+ print(f"βœ… Service defined: {service}")
147
+ else:
148
+ print(f"❌ MISSING service: {service}")
149
+ all_good = False
150
+
151
+ return all_good
152
+
153
+ except ImportError:
154
+ print("⚠️ PyYAML not available, skipping structure check")
155
+ return True
156
+ except Exception as e:
157
+ print(f"❌ Error parsing docker-compose.yml: {e}")
158
+ return False
159
+
160
+ def check_package_json():
161
+ """Check frontend package.json for required dependencies"""
162
+ print("\nπŸ“¦ Checking Frontend dependencies...")
163
+
164
+ if not os.path.exists("frontend/package.json"):
165
+ print("❌ frontend/package.json missing")
166
+ return False
167
+
168
+ with open("frontend/package.json", "r") as f:
169
+ package = json.load(f)
170
+
171
+ required_deps = [
172
+ "next", "react", "react-dom", "typescript",
173
+ "cytoscape", "cytoscape-fcose", "tailwindcss"
174
+ ]
175
+
176
+ all_deps = {**package.get("dependencies", {}), **package.get("devDependencies", {})}
177
+ all_good = True
178
+
179
+ for dep in required_deps:
180
+ if dep in all_deps:
181
+ print(f"βœ… Frontend dependency: {dep}")
182
+ else:
183
+ print(f"❌ MISSING frontend dependency: {dep}")
184
+ all_good = False
185
+
186
+ return all_good
187
+
188
+ def generate_startup_commands():
189
+ """Generate the exact commands for fresh startup"""
190
+ print("\nπŸš€ Fresh Startup Commands:")
191
+ print("=" * 50)
192
+ print("# 1. Copy environment file")
193
+ print("cp .env.example .env")
194
+ print("")
195
+ print("# 2. Edit .env and add your LLM API key")
196
+ print("# LLM_API_KEY=your-openai-or-anthropic-key-here")
197
+ print("")
198
+ print("# 3. Clean any existing containers")
199
+ print("docker-compose down")
200
+ print("docker system prune -f")
201
+ print("")
202
+ print("# 4. Build and start services")
203
+ print("docker-compose build")
204
+ print("docker-compose up -d")
205
+ print("")
206
+ print("# 5. Wait for services to be healthy (30 seconds)")
207
+ print("Start-Sleep 30")
208
+ print("")
209
+ print("# 6. Seed the database")
210
+ print("docker-compose exec mcp python /app/ops/scripts/seed.py")
211
+ print("")
212
+ print("# 7. Open the interface")
213
+ print("# Frontend: http://localhost:3000")
214
+ print("# Neo4j Browser: http://localhost:7474 (neo4j/password)")
215
+ print("=" * 50)
216
+
217
+ def main():
218
+ print("πŸ” FRESH START VALIDATION")
219
+ print("========================")
220
+ print("")
221
+
222
+ checks = [
223
+ ("Docker Files", check_docker_files),
224
+ ("Frontend Files", check_frontend_files),
225
+ ("Backend Files", check_backend_files),
226
+ ("Operational Files", check_operational_files),
227
+ ("Environment Variables", check_env_variables),
228
+ ("Docker Compose Structure", check_docker_compose_structure),
229
+ ("Frontend Dependencies", check_package_json)
230
+ ]
231
+
232
+ all_passed = True
233
+
234
+ for check_name, check_func in checks:
235
+ try:
236
+ result = check_func()
237
+ all_passed &= result
238
+ except Exception as e:
239
+ print(f"❌ ERROR in {check_name}: {e}")
240
+ all_passed = False
241
+
242
+ print("\n" + "=" * 50)
243
+ if all_passed:
244
+ print("βœ… ALL CHECKS PASSED!")
245
+ print("System is ready for fresh deployment")
246
+ generate_startup_commands()
247
+ else:
248
+ print("❌ SOME CHECKS FAILED")
249
+ print("Please fix the missing files/configurations before deploying")
250
+
251
+ print("=" * 50)
252
+ return all_passed
253
+
254
+ if __name__ == "__main__":
255
+ success = main()
256
+ sys.exit(0 if success else 1)
ops/scripts/seed_comprehensive.py ADDED
@@ -0,0 +1,387 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ Comprehensive seed script for the agentic system.
4
+ Creates multiple workflow templates and instruction types for various scenarios.
5
+ """
6
+
7
+ import requests
8
+ import json
9
+ import time
10
+ import os
11
+
12
+ # Configuration
13
+ MCP_URL = os.getenv("MCP_URL", "http://localhost:8000/mcp")
14
+ API_KEY = os.getenv("MCP_API_KEY", "dev-key-123")
15
+
16
+ def call_mcp(tool, params=None):
17
+ """Call the MCP API"""
18
+ response = requests.post(
19
+ MCP_URL,
20
+ headers={"X-API-Key": API_KEY, "Content-Type": "application/json"},
21
+ json={"tool": tool, "params": params or {}}
22
+ )
23
+ return response.json()
24
+
25
+ def create_workflow_templates():
26
+ """Create different workflow templates for various use cases"""
27
+ print("🌱 Creating workflow templates...")
28
+
29
+ workflows = [
30
+ {
31
+ "id": "template-basic-query",
32
+ "name": "Basic Data Query",
33
+ "description": "Simple question-to-SQL workflow",
34
+ "status": "template"
35
+ },
36
+ {
37
+ "id": "template-analysis",
38
+ "name": "Data Analysis Workflow",
39
+ "description": "Multi-step analysis with validation",
40
+ "status": "template"
41
+ },
42
+ {
43
+ "id": "template-report",
44
+ "name": "Report Generation",
45
+ "description": "Generate formatted reports from data",
46
+ "status": "template"
47
+ }
48
+ ]
49
+
50
+ for workflow in workflows:
51
+ result = call_mcp("write_graph", {
52
+ "action": "create_node",
53
+ "label": "WorkflowTemplate",
54
+ "properties": {
55
+ **workflow,
56
+ "created_at": time.strftime("%Y-%m-%dT%H:%M:%SZ")
57
+ }
58
+ })
59
+ print(f"βœ… Created workflow template: {workflow['name']}")
60
+
61
+ def create_instruction_types():
62
+ """Create instruction type definitions"""
63
+ print("πŸ”§ Creating instruction type definitions...")
64
+
65
+ instruction_types = [
66
+ {
67
+ "type": "discover_schema",
68
+ "name": "Schema Discovery",
69
+ "description": "Discover and analyze database schema",
70
+ "default_pause": 60,
71
+ "parameters_schema": "{}"
72
+ },
73
+ {
74
+ "type": "generate_sql",
75
+ "name": "SQL Generation",
76
+ "description": "Convert natural language to SQL",
77
+ "default_pause": 300,
78
+ "parameters_schema": json.dumps({
79
+ "question": "string",
80
+ "context": "string (optional)"
81
+ })
82
+ },
83
+ {
84
+ "type": "execute_sql",
85
+ "name": "SQL Execution",
86
+ "description": "Execute SQL query against database",
87
+ "default_pause": 120,
88
+ "parameters_schema": json.dumps({
89
+ "query": "string",
90
+ "limit": "integer (optional)"
91
+ })
92
+ },
93
+ {
94
+ "type": "validate_results",
95
+ "name": "Result Validation",
96
+ "description": "Validate and check query results",
97
+ "default_pause": 60,
98
+ "parameters_schema": json.dumps({
99
+ "validation_rules": "array (optional)"
100
+ })
101
+ },
102
+ {
103
+ "type": "format_output",
104
+ "name": "Output Formatting",
105
+ "description": "Format results for presentation",
106
+ "default_pause": 30,
107
+ "parameters_schema": json.dumps({
108
+ "format": "string (table|chart|json)",
109
+ "title": "string (optional)"
110
+ })
111
+ },
112
+ {
113
+ "type": "review_results",
114
+ "name": "Human Review",
115
+ "description": "Human review checkpoint",
116
+ "default_pause": 300,
117
+ "parameters_schema": "{}"
118
+ }
119
+ ]
120
+
121
+ for inst_type in instruction_types:
122
+ result = call_mcp("write_graph", {
123
+ "action": "create_node",
124
+ "label": "InstructionType",
125
+ "properties": {
126
+ **inst_type,
127
+ "created_at": time.strftime("%Y-%m-%dT%H:%M:%SZ")
128
+ }
129
+ })
130
+ print(f"βœ… Created instruction type: {inst_type['name']}")
131
+
132
+ def create_query_library():
133
+ """Create a library of common queries"""
134
+ print("πŸ“š Creating query library...")
135
+
136
+ queries = [
137
+ {
138
+ "id": "query-customer-count",
139
+ "category": "basic",
140
+ "question": "How many customers do we have?",
141
+ "sql": "SELECT COUNT(*) as customer_count FROM customers",
142
+ "description": "Total customer count"
143
+ },
144
+ {
145
+ "id": "query-recent-orders",
146
+ "category": "basic",
147
+ "question": "Show me recent orders",
148
+ "sql": "SELECT o.id, o.order_date, c.name, o.total_amount FROM orders o JOIN customers c ON o.customer_id = c.id ORDER BY o.order_date DESC LIMIT 10",
149
+ "description": "Last 10 orders with customer info"
150
+ },
151
+ {
152
+ "id": "query-revenue-total",
153
+ "category": "analytics",
154
+ "question": "What's our total revenue?",
155
+ "sql": "SELECT SUM(total_amount) as total_revenue FROM orders",
156
+ "description": "Sum of all order amounts"
157
+ },
158
+ {
159
+ "id": "query-top-customers",
160
+ "category": "analytics",
161
+ "question": "Who are our top customers by revenue?",
162
+ "sql": "SELECT c.name, c.email, SUM(o.total_amount) as total_spent FROM customers c JOIN orders o ON c.id = o.customer_id GROUP BY c.id, c.name, c.email ORDER BY total_spent DESC LIMIT 5",
163
+ "description": "Top 5 customers by total spending"
164
+ },
165
+ {
166
+ "id": "query-monthly-trend",
167
+ "category": "analytics",
168
+ "question": "Show monthly revenue trend",
169
+ "sql": "SELECT DATE_TRUNC('month', order_date) as month, SUM(total_amount) as monthly_revenue FROM orders GROUP BY DATE_TRUNC('month', order_date) ORDER BY month",
170
+ "description": "Revenue by month"
171
+ },
172
+ {
173
+ "id": "query-customer-orders",
174
+ "category": "detailed",
175
+ "question": "Show customers with their order details",
176
+ "sql": "SELECT c.name, c.email, o.order_date, o.total_amount, o.status FROM customers c LEFT JOIN orders o ON c.id = o.customer_id ORDER BY c.name, o.order_date DESC",
177
+ "description": "Customer and order details"
178
+ }
179
+ ]
180
+
181
+ for query in queries:
182
+ result = call_mcp("write_graph", {
183
+ "action": "create_node",
184
+ "label": "QueryTemplate",
185
+ "properties": {
186
+ **query,
187
+ "created_at": time.strftime("%Y-%m-%dT%H:%M:%SZ")
188
+ }
189
+ })
190
+ print(f"βœ… Created query: {query['description']}")
191
+
192
+ def create_demo_workflows():
193
+ """Create ready-to-run demo workflows"""
194
+ print("🎯 Creating demo workflows...")
195
+
196
+ # Demo Workflow 1: Simple Query
197
+ workflow1 = call_mcp("write_graph", {
198
+ "action": "create_node",
199
+ "label": "Workflow",
200
+ "properties": {
201
+ "id": "demo-simple-query",
202
+ "name": "Simple Customer Count",
203
+ "description": "Demo: Count total customers",
204
+ "status": "active",
205
+ "created_at": time.strftime("%Y-%m-%dT%H:%M:%SZ")
206
+ }
207
+ })
208
+
209
+ # Instructions for workflow 1
210
+ inst1 = call_mcp("write_graph", {
211
+ "action": "create_node",
212
+ "label": "Instruction",
213
+ "properties": {
214
+ "id": "demo-simple-1",
215
+ "type": "generate_sql",
216
+ "sequence": 1,
217
+ "status": "pending",
218
+ "pause_duration": 60, # 1 minute for demo
219
+ "parameters": json.dumps({"question": "How many customers do we have?"}),
220
+ "created_at": time.strftime("%Y-%m-%dT%H:%M:%SZ")
221
+ }
222
+ })
223
+
224
+ # Link instruction to workflow
225
+ call_mcp("query_graph", {
226
+ "query": "MATCH (w:Workflow {id: 'demo-simple-query'}), (i:Instruction {id: 'demo-simple-1'}) CREATE (w)-[:HAS_INSTRUCTION]->(i)"
227
+ })
228
+
229
+ print("βœ… Created simple demo workflow")
230
+
231
+ # Demo Workflow 2: Multi-step Analysis
232
+ workflow2 = call_mcp("write_graph", {
233
+ "action": "create_node",
234
+ "label": "Workflow",
235
+ "properties": {
236
+ "id": "demo-analysis",
237
+ "name": "Customer Revenue Analysis",
238
+ "description": "Demo: Multi-step customer analysis",
239
+ "status": "template", # Not active by default
240
+ "created_at": time.strftime("%Y-%m-%dT%H:%M:%SZ")
241
+ }
242
+ })
243
+
244
+ # Multi-step instructions
245
+ analysis_instructions = [
246
+ {
247
+ "id": "demo-analysis-1",
248
+ "type": "discover_schema",
249
+ "sequence": 1,
250
+ "description": "Discover customer and order tables",
251
+ "parameters": "{}"
252
+ },
253
+ {
254
+ "id": "demo-analysis-2",
255
+ "type": "generate_sql",
256
+ "sequence": 2,
257
+ "description": "Generate customer revenue query",
258
+ "parameters": json.dumps({"question": "Show me top customers by total revenue"})
259
+ },
260
+ {
261
+ "id": "demo-analysis-3",
262
+ "type": "review_results",
263
+ "sequence": 3,
264
+ "description": "Review results before final output",
265
+ "parameters": "{}"
266
+ }
267
+ ]
268
+
269
+ for inst in analysis_instructions:
270
+ call_mcp("write_graph", {
271
+ "action": "create_node",
272
+ "label": "Instruction",
273
+ "properties": {
274
+ **inst,
275
+ "status": "template",
276
+ "pause_duration": 120,
277
+ "created_at": time.strftime("%Y-%m-%dT%H:%M:%SZ")
278
+ }
279
+ })
280
+
281
+ # Link to workflow
282
+ call_mcp("query_graph", {
283
+ "query": "MATCH (w:Workflow {id: 'demo-analysis'}), (i:Instruction {id: $iid}) CREATE (w)-[:HAS_INSTRUCTION]->(i)",
284
+ "parameters": {"iid": inst["id"]}
285
+ })
286
+
287
+ # Create instruction chain
288
+ for i in range(len(analysis_instructions) - 1):
289
+ current = analysis_instructions[i]["id"]
290
+ next_inst = analysis_instructions[i + 1]["id"]
291
+ call_mcp("query_graph", {
292
+ "query": "MATCH (i1:Instruction {id: $id1}), (i2:Instruction {id: $id2}) CREATE (i1)-[:NEXT_INSTRUCTION]->(i2)",
293
+ "parameters": {"id1": current, "id2": next_inst}
294
+ })
295
+
296
+ print("βœ… Created multi-step analysis workflow")
297
+
298
+ def create_system_config():
299
+ """Create system configuration nodes"""
300
+ print("βš™οΈ Creating system configuration...")
301
+
302
+ config = {
303
+ "system_version": "1.0.0",
304
+ "default_pause_duration": 300,
305
+ "max_retry_attempts": 3,
306
+ "default_polling_interval": 30,
307
+ "supported_instruction_types": json.dumps([
308
+ "discover_schema", "generate_sql", "execute_sql",
309
+ "validate_results", "format_output", "review_results"
310
+ ]),
311
+ "created_at": time.strftime("%Y-%m-%dT%H:%M:%SZ")
312
+ }
313
+
314
+ result = call_mcp("write_graph", {
315
+ "action": "create_node",
316
+ "label": "SystemConfig",
317
+ "properties": config
318
+ })
319
+
320
+ print("βœ… Created system configuration")
321
+
322
+ def verify_seeding():
323
+ """Verify all seeded data"""
324
+ print("\nπŸ” Verifying seeded data...")
325
+
326
+ # Count nodes by type
327
+ counts = call_mcp("query_graph", {
328
+ "query": """
329
+ MATCH (n)
330
+ RETURN labels(n)[0] as label, count(n) as count
331
+ ORDER BY count DESC
332
+ """
333
+ })
334
+
335
+ print("\nπŸ“Š Node Statistics:")
336
+ for item in counts.get("data", []):
337
+ print(f" - {item['label']}: {item['count']} nodes")
338
+
339
+ # Check active workflows
340
+ active_workflows = call_mcp("query_graph", {
341
+ "query": "MATCH (w:Workflow {status: 'active'}) RETURN w.name as name"
342
+ })
343
+
344
+ if active_workflows.get("data"):
345
+ print(f"\n🎯 Active Workflows:")
346
+ for wf in active_workflows["data"]:
347
+ print(f" - {wf['name']}")
348
+
349
+ print(f"\nβœ… Comprehensive seeding completed successfully!")
350
+
351
+ def main():
352
+ print("πŸš€ Starting comprehensive seed process...")
353
+
354
+ # Check services first
355
+ try:
356
+ health_response = requests.get(f"{MCP_URL.replace('/mcp', '/health')}", timeout=5)
357
+ if health_response.status_code != 200:
358
+ print("❌ MCP service not available")
359
+ return False
360
+ except Exception as e:
361
+ print(f"❌ Service check failed: {e}")
362
+ return False
363
+
364
+ print("βœ… Services are available\n")
365
+
366
+ # Run all seeding functions
367
+ create_workflow_templates()
368
+ create_instruction_types()
369
+ create_query_library()
370
+ create_demo_workflows()
371
+ create_system_config()
372
+ verify_seeding()
373
+
374
+ print("\nπŸ“‹ What's Available:")
375
+ print("1. Open http://localhost:3000 - Frontend interface")
376
+ print("2. Open http://localhost:7474 - Neo4j Browser (neo4j/password)")
377
+ print("3. Try asking: 'How many customers do we have?'")
378
+ print("4. Check the 'Customer Revenue Analysis' workflow template")
379
+ print("5. Explore the query library for more examples")
380
+
381
+ return True
382
+
383
+ if __name__ == "__main__":
384
+ if main():
385
+ exit(0)
386
+ else:
387
+ exit(1)