# Cryptocurrency API Monitor & Resource Aggregator
## Complete End-to-End Project Analysis

**Status**: Production Ready
**Version**: 1.0.0
**Last Updated**: 2025-11-10
**Repository**: https://github.com/nimazasinich/crypto-dt-source

---

## 1. Executive Summary

### Problem Solved

This project provides a **unified monitoring and aggregation system** for cryptocurrency data sources. It solves two critical problems:

1. **API Reliability Monitoring**: Tracks the health, uptime, and performance of 50+ cryptocurrency APIs including blockchain explorers, market data providers, RPC nodes, and news feeds
2. **Centralized API Aggregation**: Provides a single FastAPI/Gradio interface to access multiple cryptocurrency data sources with automatic failover and history tracking

### Main Features

- ✅ Real-time health monitoring of 50+ cryptocurrency APIs
- ✅ Automatic failover chain management with multi-tier prioritization
- ✅ Historical metrics tracking with SQLite persistence
- ✅ Interactive Gradio web dashboard with 5 tabs
- ✅ RESTful API aggregator with FastAPI backend
- ✅ Background scheduling for continuous monitoring (APScheduler)
- ✅ Incident detection and alerting for critical services
- ✅ Response time analytics and uptime percentage tracking
- ✅ CORS proxy support for browser-based applications
- ✅ Export functionality (JSON, CSV)

### Target Users

- **Cryptocurrency Developers**: Need reliable access to multiple data sources
- **DApp Developers**: Require failover mechanisms for critical APIs
- **Data Analysts**: Monitor API availability and performance trends
- **DevOps Engineers**: Track service health and uptime metrics
- **Research Teams**: Need historical data on API reliability

### Current Status

**Production Ready** - All components implemented and tested:

- ✅ Node.js monitoring system (api-monitor.js, failover-manager.js)
- ✅ Python FastAPI aggregator (app.py)
- ✅ Python Gradio dashboard (app_gradio.py)
- ✅ SQLite database with full schema
- ✅ Background scheduler
- ✅ Interactive HTML dashboard
- ✅ Docker containerization
- ✅ Deployment guides for Hugging Face Spaces

---

## 2. Repository Map (Tree)

```
crypto-dt-source/
│
├── Core Application Files
│   ├── api-monitor.js              # Node.js health check engine (580 lines)
│   ├── failover-manager.js         # Automatic failover chain builder (350 lines)
│   ├── app.py                      # FastAPI resource aggregator (592 lines)
│   ├── app_gradio.py               # Gradio monitoring dashboard (1250+ lines)
│   ├── config.py                   # Configuration & resource loader (192 lines)
│   ├── monitor.py                  # Async health check engine (350+ lines)
│   ├── database.py                 # SQLite persistence layer (481 lines)
│   └── scheduler.py                # Background APScheduler (132 lines)
│
├── Frontend & UI
│   └── dashboard.html              # Interactive web dashboard with CSS/JS
│
├── Configuration Files
│   ├── all_apis_merged_2025.json   # Master API registry (92KB, 162+ endpoints)
│   ├── ultimate_crypto_pipeline_2025_NZasinich.json  # Pipeline config (18KB)
│   ├── package.json                # Node.js dependencies
│   ├── requirements.txt            # Python dependencies
│   ├── .env.example                # Environment variable template
│   └── .gitignore                  # Git ignore patterns
│
├── Deployment & Infrastructure
│   ├── Dockerfile                  # Docker container config for FastAPI
│   ├── DEPLOYMENT_GUIDE.md         # Multi-platform deployment instructions
│   ├── README.md                   # Main documentation (1110 lines)
│   ├── README_HF_SPACES.md         # Hugging Face Spaces guide
│   └── PROJECT_SUMMARY.md          # Implementation summary
│
├── Testing
│   └── test_aggregator.py          # API endpoint test suite (50+ lines)
│
└── Data & Outputs (Generated at Runtime)
    ├── data/
    │   └── health_metrics.db       # SQLite database (created on first run)
    ├── history.db                  # Query history database
    ├── api-monitor-report.json     # Latest health check results
    └── failover-config.json        # Failover chain configuration
```

### Key Files by Purpose

**Health Monitoring (Node.js)**
- `api-monitor.js`: Main monitoring engine, checks 50+ endpoints
- `failover-manager.js`: Builds failover chains, detects single points of failure (SPOFs)

**API Aggregation (Python FastAPI)**
- `app.py`: RESTful API server on port 7860
- `test_aggregator.py`: Integration tests for all endpoints

**Interactive Dashboard (Python Gradio)**
- `app_gradio.py`: 5-tab dashboard with real-time monitoring
- `config.py`: Loads resources from JSON registry
- `monitor.py`: Async health checks with aiohttp
- `database.py`: SQLite persistence layer with 5 tables
- `scheduler.py`: Background monitoring every 5 minutes

**Frontend**
- `dashboard.html`: Standalone HTML dashboard for Node.js monitor

**Configuration**
- `all_apis_merged_2025.json`: Master registry with discovered API keys
- `.env.example`: Template for 40+ environment variables

---

## 3. Architecture & Data Flow

### System Overview

The project consists of **three independent but complementary systems**:

```
┌─────────────────────────────────────────────────────────────────┐
│                  CRYPTOCURRENCY API ECOSYSTEM                   │
│     (External: Etherscan, CoinGecko, Infura, NewsAPI, etc.)     │
└─────────────────────┬───────────────────────────────────────────┘
                      │
        ┌─────────────┼─────────────┐
        │             │             │
        ▼             ▼             ▼
┌───────────────┐ ┌──────────────┐ ┌─────────────────────┐
│   Node.js     │ │   FastAPI    │ │  Gradio Dashboard   │
│   Monitor     │ │  Aggregator  │ │  (Production UI)    │
│               │ │              │ │                     │
│ • Health      │ │ • Query APIs │ │ • Real-time         │
│   Checks      │ │ • History    │ │   Monitoring        │
│ • Failover    │ │ • Failover   │ │ • 5 Tabs            │
│ • Reports     │ │ • CORS       │ │ • SQLite            │
│               │ │              │ │ • APScheduler       │
└───────┬───────┘ └──────┬───────┘ └──────────┬──────────┘
        │                │                    │
        ▼                ▼                    ▼
  api-monitor-       history.db        health_metrics.db
  report.json        (SQLite)          (SQLite)
  failover-
  config.json
```

### Component Interaction

**1. Node.js Health Monitor** (Standalone)

```
User/Cron → api-monitor.js → HTTPS Requests → APIs
                  ↓
          Status Classification
                  ↓
          JSON Report Export
                  ↓
          failover-manager.js → Failover Chains
                  ↓
          dashboard.html (Live View)
```

**2. FastAPI Aggregator** (Port 7860)

```
Client → POST /query → Resource Lookup → API Call → Response
                  ↓
           SQLite Logging
                  ↓
          History Tracking
```

**3. 
Gradio Dashboard** (Port 7860, HF Spaces)

```
User → Gradio UI → Tab Selection → Action
                        ↓
      ┌─────────────────┼─────────────────┐
      ▼                 ▼                 ▼
 Health Check      View History      Export Data
      ↓                 ↓                 ▼
 Database Save     Query SQLite       CSV/JSON
      ↓
 Update Visualizations
      ↓
 Real-time Charts
```

### Data Flow Examples

**Example 1: Health Check Flow**

```
1. User clicks "Run Health Check" in Gradio
2. monitor.check_all() → async tasks spawned
3. aiohttp.ClientSession → 50+ concurrent HTTPS requests
4. Responses collected → classified (ONLINE/DEGRADED/OFFLINE)
5. database.save_health_checks() → SQLite INSERT
6. Pandas DataFrame → Plotly charts
7. UI updates with status badges and response times
```

**Example 2: API Query Flow (FastAPI)**

```
1. POST /query {"resource_type": "market_data", "resource_name": "coingecko"}
2. Load resource config from all_apis_merged_2025.json
3. Build URL: https://api.coingecko.com/api/v3/simple/price?...
4. aiohttp GET request with timeout (10s)
5. Response received → log_query() to SQLite
6. Return JSON: {"success": true, "data": {...}, "response_time": 0.234}
```

**Example 3: Background Scheduler**

```
1. app_gradio.py startup → scheduler.start()
2. APScheduler triggers every 5 minutes
3. asyncio.run(monitor.check_all())
4. Results → database.save_health_checks()
5. Tier-1 offline? → database.create_incident()
6. database.cleanup_old_data() → delete records >7 days
```

### Real-Time Flows

**WebSocket-like Updates** (Gradio auto-refresh)

```
Gradio Tab → Auto-refresh enabled (30s interval)
           → re-runs refresh_dashboard()
           → fetches latest from SQLite
           → re-renders Plotly charts
```

**Continuous Monitoring** (Node.js)

```
node api-monitor.js --continuous
  → setInterval(checkAll, 5 * 60 * 1000)
  → Updates JSON files every 5 minutes
  → dashboard.html polls api-monitor-report.json
```

---

## 4. Local Development Runbook

### Prerequisites

**Operating System**
- ✅ Linux (Ubuntu 20.04+, Debian, RHEL)
- ✅ macOS (11.0+)
- ✅ Windows 10/11 (WSL2 recommended)

**Required Runtimes**
- **Node.js**: 14.0.0 or higher (for api-monitor.js)
  - Check: `node --version`
  - Install: https://nodejs.org/
- **Python**: 3.8 - 3.11 (tested on 3.11)
  - Check: `python3 --version`
  - Install: https://www.python.org/downloads/

**Optional Tools**
- **Docker**: 20.10+ (for containerized deployment)
- **Git**: 2.30+ (for version control)

### Installation Steps

**Step 1: Clone Repository**

```bash
git clone https://github.com/nimazasinich/crypto-dt-source.git
cd crypto-dt-source
```

**Step 2: Set Up Node.js Monitor (Optional)**

```bash
# No npm install needed - uses only Node.js built-in modules!
# Verify Node.js is available
node --version  # Should show v14.0.0 or higher
```

**Step 3: Set Up Python Environment**

```bash
# Create virtual environment (recommended)
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install Python dependencies
pip install -r requirements.txt
```

**Step 4: Configure Environment Variables**

```bash
# Copy the example file
cp .env.example .env

# Edit with your API keys (optional - most APIs work without keys)
nano .env  # or use your preferred editor
```

**Minimal .env for Testing** (all optional):

```env
# Block Explorers (optional - fallback keys included in code)
ETHERSCAN_KEY=your_key_here
BSCSCAN_KEY=your_key_here

# Market Data (CoinGecko is free, no key needed)
CMC_KEY=your_coinmarketcap_key

# Database
DATABASE_PATH=data/health_metrics.db
SCHEDULER_INTERVAL_MINUTES=5
```

**Step 5: Initialize Database** (automatic on first run)

```bash
# Database is created automatically when you first run the app
# No manual initialization needed
```

### Running the Applications

**Option 1: Node.js Health Monitor**

```bash
# Single health check
node api-monitor.js

# Continuous monitoring (every 5 minutes)
node api-monitor.js --continuous

# View results
cat api-monitor-report.json | jq .

# Run failover analysis
node failover-manager.js

# Start web dashboard (serves dashboard.html)
npm run dashboard
# Open: http://localhost:8080/dashboard.html
```

**Option 2: FastAPI Aggregator**

```bash
# Start the FastAPI server
python app.py

# Server runs on: http://localhost:7860
# Interactive API docs (Swagger UI): http://localhost:7860/docs
# Alternative reference docs (ReDoc): http://localhost:7860/redoc
```

**Option 3: Gradio Dashboard (Production UI)**

```bash
# Start Gradio interface
python app_gradio.py

# Access at: http://localhost:7860
# Public URL generated automatically (if enabled)
```

**Option 4: Docker Deployment**

```bash
# Build Docker image
docker build -t crypto-api-monitor .

# Run container
docker run -p 7860:7860 \
  -v "$(pwd)/data:/app/data" \
  -e ETHERSCAN_KEY=your_key \
  crypto-api-monitor

# Access at: http://localhost:7860
```

### URLs to Open

After starting each service:

| Service | URL | Purpose |
|---------|-----|---------|
| Node.js Dashboard | http://localhost:8080/dashboard.html | HTML monitoring dashboard |
| FastAPI Docs | http://localhost:7860/docs | Interactive API documentation |
| FastAPI ReDoc | http://localhost:7860/redoc | Alternative API docs |
| Gradio Interface | http://localhost:7860 | Full monitoring dashboard |
| Health Check | http://localhost:7860/health | System health endpoint |

The FastAPI aggregator and the Gradio dashboard both default to port 7860, so run one at a time or move one of them to a different port.

### Common Errors and Fixes

**Error 1: "Module not found"**

```bash
# Solution: Install Python dependencies
pip install -r requirements.txt
```

**Error 2: "Port 7860 already in use"**

```bash
# Solution: Kill existing process
lsof -ti:7860 | xargs kill -9

# Or change port in app.py:
# uvicorn.run(app, host="0.0.0.0", port=8080)
```

**Error 3: "Database locked"**

```bash
# Solution: Close other connections to SQLite
# (for example, stop a second app instance or an open sqlite3 shell)

# Last resort: delete the file. This discards all stored history.
rm data/health_metrics.db
# Database will be recreated on next run
```

**Error 4: "Failed to load resources"**

```bash
# Solution: Ensure JSON files exist
ls -lh all_apis_merged_2025.json
# Should show 92K file
```

**Error 5: "Connection timeout" during health checks**

```bash
# Solution: Increase timeout in config
# In monitor.py, change: timeout=10 to timeout=30
```

**Error 6: Node.js "Cannot find module 'https'"**

```bash
# Solution: Use Node.js 14+ (https is built-in)
node --version
# If < 14, upgrade Node.js
```

### Seed Data

No seed data required - the system uses:
- **all_apis_merged_2025.json**: Pre-configured with 162+ API endpoints
- Real-time data fetched from live APIs
- Database auto-creates on first run

### Verification Commands

```bash
# Verify Python installation
python3 --version && pip list | grep -E "(gradio|fastapi|aiohttp)"

# Verify Node.js installation
node --version && node -e "console.log('Node.js OK')"

# Test FastAPI endpoints
curl http://localhost:7860/health
curl http://localhost:7860/resources

# Test Gradio is running
curl http://localhost:7860 | grep "gradio"

# Check database
sqlite3 data/health_metrics.db ".tables"
# Should show: alerts, configuration, incidents, response_times, status_log

# Verify monitoring output
ls -lh api-monitor-report.json failover-config.json
```

---

## 5. Configuration & Secrets

### Environment Variables Table

| NAME | Required? | Default | Example | Used by | Purpose | Security Notes |
|------|-----------|---------|---------|---------|---------|----------------|
| **ETHERSCAN_KEY** | No | Hardcoded fallback | `SZHYFZK...` | api-monitor.js, config.py | Ethereum blockchain API access | Public tier OK, mask in logs |
| **ETHERSCAN_BACKUP_KEY** | No | Hardcoded fallback | `T6IR8VJ...` | api-monitor.js, config.py | Failover Etherscan key | Provides redundancy |
| **BSCSCAN_KEY** | No | Hardcoded fallback | `K62RKHG...` | api-monitor.js, config.py | BSC blockchain API | Free tier available |
| **TRONSCAN_KEY** | No | Hardcoded fallback | `7ae7272...` | api-monitor.js, config.py | Tron blockchain API | UUID format |
| **CMC_KEY** | No | Hardcoded fallback | `04cf4b5...` | app.py, config.py | CoinMarketCap API (333 calls/day free) | **Keep private**, has rate limits |
| **CMC_BACKUP_KEY** | No | Hardcoded fallback | `b54bcf4...` | config.py | Backup CMC key | Rotate when primary exhausted |
| **CRYPTOCOMPARE_KEY** | No | Hardcoded fallback | `e79c8e6...` | config.py | CryptoCompare API (100K/month free) | Free tier generous |
| **NEWSAPI_KEY** | No | Hardcoded fallback | `pub_346...` | api-monitor.js, config.py | News aggregation | Public data OK |
| **INFURA_KEY** | No | None | `9aa3d95...` | .env.example | Ethereum RPC node (100K/day free) | **Keep private** |
| **ALCHEMY_KEY** | No | None | `demo_key` | .env.example | Ethereum RPC (300M compute units/month) | **Keep private** |
| **DATABASE_PATH** | No | `data/health_metrics.db` | `data/health_metrics.db` | database.py | SQLite file location | Ensure write permissions |
| **DATABASE_RETENTION_DAYS** | No | `7` | `7` | database.py | Auto-cleanup threshold | Balance storage vs history |
| **SCHEDULER_INTERVAL_MINUTES** | No | `5` | `5` | scheduler.py | Health check frequency | Lower = more API calls |
| **SCHEDULER_MAX_CONCURRENT** | No | `10` | `10` | monitor.py | Parallel request limit | Prevent rate limiting |
| **SCHEDULER_TIMEOUT_SECONDS** | No | `10` | `10` | monitor.py | HTTP request timeout | Increase on slow networks |
| **CACHE_TTL_SECONDS** | No | `60` | `60` | monitor.py | Result cache duration | Reduce API calls |
| **CACHE_ENABLED** | No | `true` | `true` | monitor.py | Enable caching | Set to `false` for real-time |
| **LOG_LEVEL** | No | `INFO` | `INFO` / `DEBUG` | All Python modules | Logging verbosity | DEBUG for troubleshooting |
| **LOG_FORMAT** | No | Standard | `%(asctime)s - %(message)s` | All Python modules | Log message format | Customize as needed |
| **HF_SPACE_NAME** | No | None | `crypto-api-monitor` | .env.example | Hugging Face Space identifier | For HF deployment only |
| **HF_USERNAME** | No | None | `your_username` | .env.example | Hugging Face username | For HF deployment only |
| **HF_AUTO_REFRESH_SECONDS** | No | `30` | `30` | .env.example | Dashboard auto-refresh | Balance UX vs load |
| **ENABLE_BACKGROUND_SCHEDULER** | No | `true` | `true` | app_gradio.py | Enable APScheduler | Disable for manual checks |
| **ENABLE_INCIDENT_DETECTION** | No | `true` | `true` | scheduler.py | Auto-create incidents | Tier-1 outage alerts |
| **ENABLE_ALERT_SYSTEM** | No | `true` | `true` | scheduler.py | Alert notifications | For critical failures |
| **ENABLE_DATA_EXPORT** | No | `true` | `true` | app_gradio.py | CSV/JSON export | For data analysis |

### Where to Put Variables

**Option 1: .env File (Local Development)**

```bash
# Copy template
cp .env.example .env

# Edit with your keys
nano .env
```

**Option 2: Environment Export (CLI)**

```bash
export ETHERSCAN_KEY="your_key_here"
export CMC_KEY="your_cmc_key"
python app_gradio.py
```

**Option 3: Docker Environment**

```bash
docker run -p 7860:7860 \
  -e ETHERSCAN_KEY="your_key" \
  -e CMC_KEY="your_cmc_key" \
  crypto-api-monitor
```

**Option 4: Hugging Face Secrets (Production)**

1. Go to your Space Settings
2. Navigate to "Repository Secrets"
3. Add each key individually:
   - Name: `ETHERSCAN_KEY`
   - Value: `your_actual_key`
   - Save

### How to Generate Values Safely

**Etherscan API Key** (Free)

```
1. Visit: https://etherscan.io/register
2. Verify email
3. Go to: https://etherscan.io/myapikey
4. Create new API key
5. Free tier: 5 calls/second, 100K calls/day
```

**CoinMarketCap API Key** (Free tier)

```
1. Visit: https://pro.coinmarketcap.com/signup
2. Select "Basic" plan (free)
3. Verify email
4. Dashboard → API Key → Copy
5. Free tier: 333 calls/day, 10K calls/month
```

**Infura Project ID** (Free)

```
1. Visit: https://infura.io/register
2. Create account
3. Create new project → Ethereum
4. Copy "Project ID" (32 hex chars)
5. Free tier: 100K requests/day
```

**NewsAPI Key** (Free)

```
1. Visit: https://newsapi.org/register
2. Fill form and verify email
3. Copy API key from dashboard
4. Free tier: 100 requests/day
```

### Security Notes

**API Key Handling**
- ✅ Keys are **masked in logs**: First 4 + last 4 chars only
- ✅ Never commit `.env` to git (in `.gitignore`)
- ✅ Use environment variables in production
- ⚠️ Hardcoded fallback keys in code are **public tier** - safe to use but limited

**Rate Limiting**
- Monitor enforces delays between requests
- Scheduler respects `MAX_CONCURRENT` setting
- CORS proxies have their own limits (documented in code)

**Best Practices**
1. Rotate keys every 90 days
2. Use separate keys for dev/staging/prod
3. Enable key usage alerts in provider dashboards
4. Monitor rate limit consumption via `/history/stats`
5. Use backup keys for critical APIs (CMC, Etherscan)

---

## 6. APIs & Contracts (REST/GraphQL/WS)

### API Endpoints Table

#### Node.js Health Monitor (No HTTP Server)

The Node.js monitor is a CLI tool that outputs JSON files.
Access via:

```bash
# Run and read output
node api-monitor.js
cat api-monitor-report.json

# Serve via Python HTTP server
python3 -m http.server 8080
# GET http://localhost:8080/api-monitor-report.json
```

#### FastAPI Aggregator (Port 7860)

| Method | Path | Parameters | Sample Request | Sample Response | Error Shapes |
|--------|------|------------|----------------|-----------------|--------------|
| **GET** | `/` | None | `curl http://localhost:7860/` | `{"name": "Crypto Resource Aggregator", "version": "1.0.0", "endpoints": {...}}` | N/A |
| **GET** | `/health` | None | `curl http://localhost:7860/health` | `{"status": "healthy", "timestamp": "2025-11-10T...", "resources_loaded": true}` | N/A |
| **GET** | `/resources` | None | `curl http://localhost:7860/resources` | `{"total_categories": 7, "resources": {"block_explorers": ["etherscan", "bscscan"], ...}}` | N/A |
| **GET** | `/resources/{category}` | `category` (path) | `curl http://localhost:7860/resources/market_data` | `{"category": "market_data", "resources": {...}, "count": 5}` | `404: Category not found` |
| **POST** | `/query` | JSON body | See below | See below | `404: Resource not found` |
| **GET** | `/status` | None | `curl http://localhost:7860/status` | `{"total_resources": 15, "online": 13, "offline": 2, "resources": [...]}` | N/A |
| **GET** | `/status/{category}/{name}` | `category`, `name` (path) | `curl http://localhost:7860/status/market_data/coingecko` | `{"resource": "market_data.coingecko", "status": "online", "response_time": 0.123}` | `404: Resource not found` |
| **GET** | `/history` | `limit` (query, int), `resource_type` (query, optional) | `curl "http://localhost:7860/history?limit=50"` | `{"count": 50, "history": [{...}]}` | N/A |
| **GET** | `/history/stats` | None | `curl http://localhost:7860/history/stats` | `{"total_queries": 1523, "success_rate": 97.6, "most_queried_resources": [...]}` | N/A |

**POST /query - Detailed Example**

Request:

```bash
curl -X POST http://localhost:7860/query \
  -H "Content-Type: application/json" \
  -d '{
    "resource_type": "market_data",
    "resource_name": "coingecko",
    "endpoint": "/simple/price",
    "params": {
      "ids": "bitcoin,ethereum",
      "vs_currencies": "usd,eur"
    }
  }'
```

Response (Success):

```json
{
  "success": true,
  "resource_type": "market_data",
  "resource_name": "coingecko",
  "data": {
    "bitcoin": { "usd": 45000, "eur": 42000 },
    "ethereum": { "usd": 3000, "eur": 2800 }
  },
  "response_time": 0.234,
  "timestamp": "2025-11-10T14:30:00.000Z"
}
```

Response (Error):

```json
{
  "success": false,
  "resource_type": "market_data",
  "resource_name": "coinmarketcap",
  "error": "HTTP 429 - Rate limit exceeded",
  "response_time": 0.156,
  "timestamp": "2025-11-10T14:30:00.000Z"
}
```

#### Gradio Interface (Port 7860)

Gradio provides a web UI, not a RESTful API. Accessible via:
- **Direct access**: http://localhost:7860
- **Tabs**: Dashboard, Analytics, History, Incidents, Settings
- **Actions**: Button clicks, dropdowns, sliders (not HTTP endpoints)

### Event/Message Schemas

**N/A** - This project does not use queues or WebSockets. All communication is HTTP request/response.

### Error Response Format

**Standard Error Shape (FastAPI)**

```json
{
  "detail": "Category 'invalid_category' not found"
}
```

**HTTP Status Codes Used**
- `200 OK`: Successful request
- `404 Not Found`: Resource/category not found
- `422 Unprocessable Entity`: Invalid request body (Pydantic validation)
- `500 Internal Server Error`: Unexpected server error

---

## 7. Data Storage & Migrations

### Database Engines

**SQLite 3**
- Used for both `history.db` (FastAPI) and `health_metrics.db` (Gradio)
- File-based, no separate server needed
- Concurrent reads, sequential writes
- ACID compliant

### Connection Strings

**FastAPI (history.db)**

```python
import sqlite3

conn = sqlite3.connect('history.db')
# No password, local file
```

**Gradio (health_metrics.db)**

```python
import sqlite3
from pathlib import Path

db_path = Path("data/health_metrics.db")
conn = sqlite3.connect(db_path)
# Configured via DATABASE_PATH env var
```

### Schema Overview

#### Database: `history.db` (FastAPI)

**Table: query_history**

```sql
CREATE TABLE IF NOT EXISTS query_history (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp DATETIME DEFAULT CURRENT_TIMESTAMP,
    resource_type TEXT NOT NULL,
    resource_name TEXT NOT NULL,
    endpoint TEXT NOT NULL,
    status TEXT NOT NULL,        -- 'success' or 'error'
    response_time REAL,          -- in seconds
    error_message TEXT
);
```

Purpose: Logs every API query made through the aggregator

**Table: resource_status**

```sql
CREATE TABLE IF NOT EXISTS resource_status (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    resource_name TEXT NOT NULL UNIQUE,
    last_check DATETIME DEFAULT CURRENT_TIMESTAMP,
    status TEXT NOT NULL,        -- 'online' or 'offline'
    consecutive_failures INTEGER DEFAULT 0,
    last_success DATETIME,
    last_error TEXT
);
```

Purpose: Tracks current status of each resource

#### Database: `health_metrics.db` (Gradio)

**Table: status_log**

```sql
CREATE TABLE IF NOT EXISTS status_log (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    provider_name TEXT NOT NULL,
    category TEXT NOT NULL,
    status TEXT NOT NULL,        -- 'online', 'degraded', 'offline'
    response_time REAL,          -- in milliseconds
    status_code INTEGER,
    error_message TEXT,
    endpoint_tested TEXT,
    timestamp REAL NOT NULL,     -- Unix epoch
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX idx_status_log_provider ON status_log(provider_name, timestamp);
CREATE INDEX idx_status_log_timestamp ON status_log(timestamp);
```

Purpose: Historical log of all health checks
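
As an illustration of how the `status_log` table supports the uptime percentage tracking mentioned earlier, the sketch below computes per-provider uptime over a trailing time window. It is not taken from `database.py`: the function names `uptime_by_provider` and `build_demo_db` are hypothetical, the demo table carries only a subset of the documented columns, and counting only `'online'` rows as "up" (so `'degraded'` counts against uptime) is an assumption.

```python
import sqlite3
import time


def uptime_by_provider(conn, window_seconds=24 * 3600, now=None):
    """Return {provider_name: uptime_percent} for checks inside the trailing window.

    Assumption: only status = 'online' counts as up; 'degraded' and 'offline' do not.
    """
    now = time.time() if now is None else now
    rows = conn.execute(
        """
        SELECT provider_name,
               100.0 * SUM(CASE WHEN status = 'online' THEN 1 ELSE 0 END) / COUNT(*)
        FROM status_log
        WHERE timestamp >= ?          -- timestamp is a Unix epoch (REAL)
        GROUP BY provider_name
        ORDER BY provider_name
        """,
        (now - window_seconds,),
    ).fetchall()
    return {name: round(pct, 2) for name, pct in rows}


def build_demo_db(now):
    """Hypothetical in-memory database with a trimmed status_log table and sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE status_log (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            provider_name TEXT NOT NULL,
            category TEXT NOT NULL,
            status TEXT NOT NULL,
            response_time REAL,
            timestamp REAL NOT NULL
        )
        """
    )
    samples = [
        ("coingecko", "market_data", "online", 120.0, now - 60),
        ("coingecko", "market_data", "online", 140.0, now - 360),
        ("coingecko", "market_data", "offline", None, now - 660),
        ("coingecko", "market_data", "online", 130.0, now - 960),
        ("etherscan", "block_explorers", "degraded", 2500.0, now - 60),
        ("etherscan", "block_explorers", "online", 300.0, now - 360),
        # Two days old: falls outside the default 24-hour window
        ("etherscan", "block_explorers", "online", 310.0, now - 2 * 86400),
    ]
    conn.executemany(
        "INSERT INTO status_log (provider_name, category, status, response_time, timestamp) "
        "VALUES (?, ?, ?, ?, ?)",
        samples,
    )
    return conn


if __name__ == "__main__":
    demo_now = time.time()
    demo_conn = build_demo_db(demo_now)
    print(uptime_by_provider(demo_conn, now=demo_now))
```

Against the real database, the same function could be called with `sqlite3.connect("data/health_metrics.db")`. The `WHERE timestamp >= ?` filter can use the documented `idx_status_log_timestamp` index.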
**Table: response_times**

```sql
CREATE TABLE IF NOT EXISTS response_times (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    provider_name TEXT NOT NULL,
    avg_response_time REAL NOT NULL,
    min_response_time REAL NOT NULL,
    max_response_time REAL NOT NULL,
    sample_count INTEGER NOT NULL,
    period_start TIMESTAMP NOT NULL,
    period_end TIMESTAMP NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
```

Purpose: Aggregated response time statistics (1-hour periods)

**Table: incidents**

```sql
CREATE TABLE IF NOT EXISTS incidents (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    provider_name TEXT NOT NULL,
    category TEXT NOT NULL,
    incident_type TEXT NOT NULL,  -- 'service_offline', 'degraded', etc.
    description TEXT,
    severity TEXT,                -- 'low', 'medium', 'high'
    start_time TIMESTAMP NOT NULL,
    end_time TIMESTAMP,
    duration_seconds INTEGER,
    resolved BOOLEAN DEFAULT 0,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX idx_incidents_provider ON incidents(provider_name, start_time);
```

Purpose: Tracks service outages and incidents

**Table: alerts**

```sql
CREATE TABLE IF NOT EXISTS alerts (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    provider_name TEXT NOT NULL,
    alert_type TEXT NOT NULL,     -- 'tier1_offline', 'high_latency', etc.
    message TEXT,
    threshold_value REAL,
    actual_value REAL,
    triggered_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    acknowledged BOOLEAN DEFAULT 0
);
```

Purpose: Alert notifications for critical issues

**Table: configuration**

```sql
CREATE TABLE IF NOT EXISTS configuration (
    key TEXT PRIMARY KEY,
    value TEXT NOT NULL,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
```

Purpose: Store runtime configuration settings

### Migrations

**No Migration System** - Tables are created automatically on first run via:

```python
# database.py - _init_database() method
def _init_database(self):
    with self.get_connection() as conn:
        cursor = conn.cursor()
        cursor.execute("CREATE TABLE IF NOT EXISTS status_log (...)")
        # ... creates all tables
```

**How to Apply**
- Automatic on first app startup
- Database file created if it does not exist
- `CREATE TABLE IF NOT EXISTS` only creates missing tables; it does not modify tables that already exist, so column changes need an explicit `ALTER TABLE` (see Schema Changes)

**How to Rollback**

```bash
# Delete database file (this discards all stored data)
rm data/health_metrics.db
# App will recreate on next run
```

**Schema Changes**

To add columns:

```python
# In database.py _init_database(); guarded so reruns don't fail with "duplicate column name"
columns = [row[1] for row in cursor.execute("PRAGMA table_info(status_log)")]
if "new_field" not in columns:
    cursor.execute("ALTER TABLE status_log ADD COLUMN new_field TEXT")
```

### Data Retention

**Automatic Cleanup** (scheduler.py)

```python
# Runs every scheduler cycle
self.database.cleanup_old_data(days=7)
```

- Deletes `status_log` records older than 7 days
- Deletes resolved incidents older than 7 days
- Deletes acknowledged alerts older than 7 days
- Configurable via `DATABASE_RETENTION_DAYS` env var

**Manual Cleanup**

```bash
sqlite3 data/health_metrics.db \
  "DELETE FROM status_log WHERE created_at < datetime('now', '-30 days'); VACUUM;"
```

---

## 8. Frontend Structure & Conventions

### Build System

**Node.js Monitor Dashboard**
- **Framework**: None (vanilla HTML/CSS/JavaScript)
- **Build**: Not required - `dashboard.html` is served directly
- **Server**: `python3 -m http.server 8080` or `npm run dashboard`

**Gradio Interface**
- **Framework**: Gradio 4.14.0
- **Build**: None (Gradio handles compilation internally)
- **Components**: Pre-built Gradio components (gr.DataFrame, gr.Plot, gr.Button, etc.)

### Routing

**dashboard.html** (No routing - single page)
- All content in one HTML file
- JavaScript handles dynamic updates
- Fetches `api-monitor-report.json` via AJAX

**Gradio** (Tab-based navigation)

```python
with gr.Blocks(theme=gr.themes.Soft()) as app:
    with gr.Tab("Dashboard"):
        ...  # Dashboard components
    with gr.Tab("Analytics"):
        ...  # Analytics components
    # ... 5 tabs total
```

### State Management

**dashboard.html**
- No formal state management
- DOM updates via vanilla JavaScript
- Global variables for current report

**Gradio**
- Component state managed by Gradio framework
- Global variables for shared state:
  ```python
  current_results = []  # Latest health check results
  last_check_time = None
  ```
- Database serves as persistent state store

### Theming

**dashboard.html**

```css
/* Gradient background */
background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);

/* Card shadows */
box-shadow: 0 10px 30px rgba(0,0,0,0.2);

/* Status colors */
.online { color: #10b981; }
.degraded { color: #f59e0b; }
.offline { color: #ef4444; }
```

**Gradio**

```python
gr.Blocks(theme=gr.themes.Soft())
# Uses Gradio's Soft theme
# Custom CSS can be added via css= parameter
```

### Component Conventions

**dashboard.html**
- BEM-like naming: `.stat-card`, `.category-section`
- Status badges: 🟢 🟡 🔴 (emoji)
- Responsive grid: `grid-template-columns: repeat(auto-fit, minmax(250px, 1fr))`

**Gradio Components**

```python
# Naming convention: {purpose}_{type}
status_df = gr.DataFrame(label="Resource Status")
refresh_btn = gr.Button("Refresh", variant="primary")
category_dropdown = gr.Dropdown(choices=["All", ...], value="All")
```

### Where to Add Features

**New Tab in Gradio**

```python
# In app_gradio.py, after existing tabs
with gr.Tab("Your New Tab"):
    with gr.Column():
        gr.Markdown("## Your Feature")
        # Add components
        your_output = gr.Textbox()
        your_button = gr.Button("Action")

    # Wire up event handler
    your_button.click(
        fn=your_function,
        inputs=[],
        outputs=[your_output]
    )
```

**New Chart in Analytics**

```python
# In app_gradio.py, create_analytics_charts() function
def create_analytics_charts():
    # ... existing charts ...

    # Add new chart
    fig_new = px.bar(data, x='category', y='value', title="New Metric")
    return fig_uptime, fig_response, fig_new  # Add to return tuple

# Update outputs in analytics_tab
analytics_btn.click(
    fn=create_analytics_charts,
    outputs=[uptime_chart, response_chart, new_chart]  # Add new output
)
```

**New Section in dashboard.html**

```html