| # Charts Validation & Hardening Documentation | |
| ## Overview | |
| This document provides comprehensive documentation for the newly implemented chart endpoints with validation and security hardening. | |
| ## New Endpoints | |
| ### 1. `/api/charts/rate-limit-history` | |
| **Purpose:** Retrieve hourly rate limit usage history for visualization in charts. | |
| **Method:** `GET` | |
| **Parameters:** | |
| | Parameter | Type | Required | Default | Constraints | Description | | |
| |-----------|------|----------|---------|-------------|-------------| | |
| | `hours` | integer | No | 24 | 1-168 | Hours of history to retrieve (clamped server-side) | | |
| | `providers` | string | No | top 5 | max 5, comma-separated | Provider names to include | | |
| **Response Schema:** | |
| ```json | |
| [ | |
| { | |
| "provider": "coingecko", | |
| "hours": 24, | |
| "series": [ | |
| { | |
| "t": "2025-11-10T13:00:00Z", | |
| "pct": 42.5 | |
| }, | |
| { | |
| "t": "2025-11-10T14:00:00Z", | |
| "pct": 38.2 | |
| } | |
| ], | |
| "meta": { | |
| "limit_type": "per_minute", | |
| "limit_value": 30 | |
| } | |
| } | |
| ] | |
| ``` | |
| **Response Fields:** | |
| - `provider` (string): Provider name | |
| - `hours` (integer): Number of hours covered | |
| - `series` (array): Time series data points | |
| - `t` (string): ISO 8601 timestamp with 'Z' suffix | |
| - `pct` (number): Rate limit usage percentage [0-100] | |
| - `meta` (object): Rate limit metadata | |
| - `limit_type` (string): Type of limit (per_second, per_minute, per_hour, per_day) | |
| - `limit_value` (integer|null): Limit value, null if no limit configured | |
| **Behavior:** | |
| - Returns one series object per provider | |
| - Each series contains exactly `hours` data points (one per hour) | |
| - Hours without data are filled with `pct: 0.0` | |
| - If provider has no rate limit configured, returns `meta.limit_value: null` and `pct: 0` | |
| - Default: Returns up to 5 providers with configured rate limits | |
| - Series ordered chronologically (oldest to newest) | |
| **Examples:** | |
| ```bash | |
| # Default: Last 24 hours, top 5 providers | |
| curl "http://localhost:7860/api/charts/rate-limit-history" | |
| # Custom: 48 hours, specific providers | |
| curl "http://localhost:7860/api/charts/rate-limit-history?hours=48&providers=coingecko,cmc,etherscan" | |
| # Single provider, 1 week | |
| curl "http://localhost:7860/api/charts/rate-limit-history?hours=168&providers=binance" | |
| ``` | |
| **Error Responses:** | |
| - `400 Bad Request`: Invalid provider name | |
| ```json | |
| { | |
| "detail": "Invalid provider name: invalid_xyz. Must be one of: ..." | |
| } | |
| ``` | |
| - `422 Unprocessable Entity`: Invalid parameter type | |
| - `500 Internal Server Error`: Database or processing error | |
| --- | |
| ### 2. `/api/charts/freshness-history` | |
| **Purpose:** Retrieve hourly data freshness/staleness history for visualization. | |
| **Method:** `GET` | |
| **Parameters:** | |
| | Parameter | Type | Required | Default | Constraints | Description | | |
| |-----------|------|----------|---------|-------------|-------------| | |
| | `hours` | integer | No | 24 | 1-168 | Hours of history to retrieve (clamped server-side) | | |
| | `providers` | string | No | top 5 | max 5, comma-separated | Provider names to include | | |
| **Response Schema:** | |
| ```json | |
| [ | |
| { | |
| "provider": "coingecko", | |
| "hours": 24, | |
| "series": [ | |
| { | |
| "t": "2025-11-10T13:00:00Z", | |
| "staleness_min": 7.2, | |
| "ttl_min": 15, | |
| "status": "fresh" | |
| }, | |
| { | |
| "t": "2025-11-10T14:00:00Z", | |
| "staleness_min": 999.0, | |
| "ttl_min": 15, | |
| "status": "stale" | |
| } | |
| ], | |
| "meta": { | |
| "category": "market_data", | |
| "default_ttl": 1 | |
| } | |
| } | |
| ] | |
| ``` | |
| **Response Fields:** | |
| - `provider` (string): Provider name | |
| - `hours` (integer): Number of hours covered | |
| - `series` (array): Time series data points | |
| - `t` (string): ISO 8601 timestamp with 'Z' suffix | |
| - `staleness_min` (number): Data staleness in minutes (999.0 indicates no data) | |
| - `ttl_min` (integer): TTL threshold for this provider's category | |
| - `status` (string): Derived status: "fresh", "aging", or "stale" | |
| - `meta` (object): Provider metadata | |
| - `category` (string): Provider category | |
| - `default_ttl` (integer): Default TTL for category (minutes) | |
| **Status Derivation:** | |
| ``` | |
| fresh: staleness_min <= ttl_min | |
| aging: ttl_min < staleness_min <= ttl_min * 2 | |
| stale: staleness_min > ttl_min * 2 OR no data (999.0) | |
| ``` | |
| **TTL by Category:** | |
| | Category | TTL (minutes) | | |
| |----------|---------------| | |
| | market_data | 1 | | |
| | blockchain_explorers | 5 | | |
| | defi | 10 | | |
| | news | 15 | | |
| | default | 5 | | |
| **Behavior:** | |
| - Returns one series object per provider | |
| - Each series contains exactly `hours` data points (one per hour) | |
| - Hours without data are marked with `staleness_min: 999.0` and `status: "stale"` | |
| - Default: Returns up to 5 most active providers | |
| - Series ordered chronologically (oldest to newest) | |
| **Examples:** | |
| ```bash | |
| # Default: Last 24 hours, top 5 providers | |
| curl "http://localhost:7860/api/charts/freshness-history" | |
| # Custom: 72 hours, specific providers | |
| curl "http://localhost:7860/api/charts/freshness-history?hours=72&providers=coingecko,binance" | |
| # Single provider, 3 days | |
| curl "http://localhost:7860/api/charts/freshness-history?hours=72&providers=etherscan" | |
| ``` | |
| **Error Responses:** | |
| - `400 Bad Request`: Invalid provider name | |
| - `422 Unprocessable Entity`: Invalid parameter type | |
| - `500 Internal Server Error`: Database or processing error | |
| --- | |
| ## Security & Validation | |
| ### Input Validation | |
| 1. **Hours Parameter:** | |
| - Server-side clamping: `1 <= hours <= 168` | |
| - Invalid types rejected with `422 Unprocessable Entity` | |
| - Out-of-range values automatically clamped (no error) | |
| 2. **Providers Parameter:** | |
| - Allow-list enforcement: Only valid provider names accepted | |
| - Max 5 providers enforced (excess silently truncated) | |
| - Invalid names trigger `400 Bad Request` with detailed error | |
| - SQL injection prevention: No raw SQL, parameterized queries only | |
| - XSS prevention: Input sanitized (strip whitespace) | |
| 3. **Rate Limiting (Recommended):** | |
| - Implement: 60 requests/minute per IP for chart routes | |
| - Use middleware or reverse proxy (nginx/cloudflare) | |
| ### Security Measures Implemented | |
| ✓ Allow-list validation for provider names | |
| ✓ Parameter clamping (hours: 1-168) | |
| ✓ Max provider limit (5) | |
| ✓ SQL injection prevention (ORM with parameterized queries) | |
| ✓ XSS prevention (input sanitization) | |
| ✓ Comprehensive error handling with safe error messages | |
| ✓ Logging of all chart requests for monitoring | |
| ✓ No sensitive data exposure in responses | |
| ### Edge Cases Handled | |
| - Empty provider list → Returns default providers | |
| - Unknown provider → 400 with valid options listed | |
| - Hours out of bounds → Clamped to [1, 168] | |
| - No data available → Returns empty series or 999.0 staleness | |
| - Provider with no rate limit → Returns null limit_value | |
| - Whitespace in provider names → Trimmed automatically | |
| - Mixed valid/invalid providers → Rejects entire request | |
| --- | |
| ## Testing | |
| ### Automated Tests | |
| Run the comprehensive test suite: | |
| ```bash | |
| # Run all chart tests | |
| pytest tests/test_charts.py -v | |
| # Run specific test class | |
| pytest tests/test_charts.py::TestRateLimitHistory -v | |
| # Run with coverage | |
| pytest tests/test_charts.py --cov=api --cov-report=html | |
| ``` | |
| **Test Coverage:** | |
| - ✓ Default parameter behavior | |
| - ✓ Custom time ranges (48h, 72h) | |
| - ✓ Provider selection and filtering | |
| - ✓ Response schema validation | |
| - ✓ Percentage range validation [0-100] | |
| - ✓ Timestamp format validation | |
| - ✓ Status derivation logic | |
| - ✓ Edge cases (invalid providers, hours clamping) | |
| - ✓ Security (SQL injection, XSS prevention) | |
| - ✓ Performance (response time < 500ms) | |
| - ✓ Concurrent request handling | |
| ### Manual Sanity Checks | |
| Run the CLI sanity check script: | |
| ```bash | |
| # Ensure backend is running | |
| python app.py & | |
| # Run sanity checks | |
| ./tests/sanity_checks.sh | |
| ``` | |
| **Checks performed:** | |
| 1. Rate limit history (default params) | |
| 2. Freshness history (default params) | |
| 3. Custom time ranges | |
| 4. Response schema validation | |
| 5. Invalid provider rejection | |
| 6. Hours parameter clamping | |
| 7. Performance measurement | |
| 8. Edge case handling | |
| --- | |
| ## Performance Targets | |
| ### Response Time (P95) | |
| | Environment | Target | Conditions | | |
| |-------------|--------|------------| | |
| | Production | < 200ms | 24h / 5 providers | | |
| | Development | < 500ms | 24h / 5 providers | | |
| ### Optimization Strategies | |
| 1. **Database Indexing:** | |
| - Indexed: `timestamp`, `provider_id` columns | |
| - Composite indexes on frequently queried combinations | |
| 2. **Query Optimization:** | |
| - Hourly bucketing done in-memory (fast) | |
| - Limited to 168 hours max (1 week) | |
| - Provider limit enforced early (max 5) | |
| 3. **Caching (Future Enhancement):** | |
| - Consider Redis cache for 1-minute TTL | |
| - Cache key: `chart:type:hours:providers` | |
| - Invalidate on new data ingestion | |
| 4. **Connection Pooling:** | |
| - SQLAlchemy pool size: 10 | |
| - Max overflow: 20 | |
| - Recycle connections every 3600s | |
| --- | |
| ## Observability & Monitoring | |
| ### Logging | |
| All chart requests are logged with: | |
| ```json | |
| { | |
| "timestamp": "2025-11-11T01:00:00Z", | |
| "level": "INFO", | |
| "logger": "api_endpoints", | |
| "message": "Rate limit history: 3 providers, 48h" | |
| } | |
| ``` | |
| ### Recommended Metrics (Prometheus/Grafana) | |
| ```python | |
| # Counter: Total requests per endpoint | |
| chart_requests_total{endpoint="rate_limit_history"} 1523 | |
| # Histogram: Response time distribution | |
| chart_response_time_seconds{endpoint="rate_limit_history", le="0.1"} 1450 | |
| chart_response_time_seconds{endpoint="rate_limit_history", le="0.2"} 1510 | |
| # Gauge: Current rate limit usage per provider | |
| ratelimit_usage_pct{provider="coingecko"} 87.5 | |
| # Gauge: Freshness staleness per provider | |
| freshness_staleness_min{provider="binance"} 3.2 | |
| # Counter: Invalid request count | |
| chart_invalid_requests_total{endpoint="rate_limit_history", reason="invalid_provider"} 23 | |
| ``` | |
| ### Recommended Alerts | |
| ```yaml | |
| # Critical: Rate limit exhaustion | |
| - alert: RateLimitExhaustion | |
| expr: ratelimit_usage_pct > 90 | |
| for: 3h | |
| annotations: | |
| summary: "Provider {{ $labels.provider }} at {{ $value }}% rate limit" | |
| action: "Add API keys or reduce request frequency" | |
| # Critical: Data staleness | |
| - alert: DataStale | |
| expr: freshness_staleness_min > ttl_min | |
| for: 15m | |
| annotations: | |
| summary: "Provider {{ $labels.provider }} data is stale ({{ $value }}m old)" | |
| action: "Check scheduler, verify API connectivity" | |
| # Warning: Chart endpoint slow | |
| - alert: ChartEndpointSlow | |
| expr: histogram_quantile(0.95, chart_response_time_seconds) > 0.2 | |
| for: 10m | |
| annotations: | |
| summary: "Chart endpoint P95 latency above 200ms" | |
| action: "Check database query performance" | |
| ``` | |
| --- | |
| ## Database Schema | |
| ### Tables Used | |
| **RateLimitUsage** | |
| ```sql | |
| CREATE TABLE rate_limit_usage ( | |
| id INTEGER PRIMARY KEY, | |
| timestamp DATETIME NOT NULL, -- INDEXED | |
| provider_id INTEGER NOT NULL, -- FOREIGN KEY, INDEXED | |
| limit_type VARCHAR(20), | |
| limit_value INTEGER, | |
| current_usage INTEGER, | |
| percentage REAL, | |
| reset_time DATETIME | |
| ); | |
| ``` | |
| **DataCollection** | |
| ```sql | |
| CREATE TABLE data_collection ( | |
| id INTEGER PRIMARY KEY, | |
| provider_id INTEGER NOT NULL, -- FOREIGN KEY, INDEXED | |
| actual_fetch_time DATETIME NOT NULL, | |
| data_timestamp DATETIME, | |
| staleness_minutes REAL, | |
| record_count INTEGER, | |
| on_schedule BOOLEAN | |
| ); | |
| ``` | |
| --- | |
| ## Frontend Integration | |
| ### Chart.js Example (Rate Limit) | |
| ```javascript | |
| // Fetch rate limit history | |
| const response = await fetch('/api/charts/rate-limit-history?hours=48&providers=coingecko,cmc'); | |
| const data = await response.json(); | |
| // Build Chart.js dataset | |
| const datasets = data.map(series => ({ | |
| label: series.provider, | |
| data: series.series.map(p => ({ | |
| x: new Date(p.t), | |
| y: p.pct | |
| })), | |
| borderColor: getColorForProvider(series.provider), | |
| tension: 0.3 | |
| })); | |
| // Create chart | |
| new Chart(ctx, { | |
| type: 'line', | |
| data: { datasets }, | |
| options: { | |
| scales: { | |
| x: { type: 'time', time: { unit: 'hour' } }, | |
| y: { min: 0, max: 100, title: { text: 'Usage %' } } | |
| }, | |
| interaction: { mode: 'index', intersect: false }, | |
| plugins: { | |
| legend: { display: true, position: 'bottom' }, | |
| tooltip: { | |
| callbacks: { | |
| label: ctx => `${ctx.dataset.label}: ${ctx.parsed.y.toFixed(1)}%` | |
| } | |
| } | |
| } | |
| } | |
| }); | |
| ``` | |
| ### Chart.js Example (Freshness) | |
| ```javascript | |
| // Fetch freshness history | |
| const response = await fetch('/api/charts/freshness-history?hours=72&providers=binance'); | |
| const data = await response.json(); | |
| // Build datasets with status-based colors | |
| const datasets = data.map(series => ({ | |
| label: series.provider, | |
| data: series.series.map(p => ({ | |
| x: new Date(p.t), | |
| y: p.staleness_min, | |
| status: p.status | |
| })), | |
| borderColor: getColorForProvider(series.provider), | |
| segment: { | |
| borderColor: ctx => { | |
| const point = ctx.p1.$context.raw; | |
| return point.status === 'fresh' ? 'green' | |
| : point.status === 'aging' ? 'orange' | |
| : 'red'; | |
| } | |
| } | |
| })); | |
| // Create chart with TTL reference line | |
| new Chart(ctx, { | |
| type: 'line', | |
| data: { datasets }, | |
| options: { | |
| scales: { | |
| x: { type: 'time' }, | |
| y: { title: { text: 'Staleness (min)' } } | |
| }, | |
| plugins: { | |
| annotation: { | |
| annotations: { | |
| ttl: { | |
| type: 'line', | |
| yMin: data[0].meta.default_ttl, | |
| yMax: data[0].meta.default_ttl, | |
| borderColor: 'rgba(255, 99, 132, 0.5)', | |
| borderWidth: 2, | |
| label: { content: 'TTL Threshold', enabled: true } | |
| } | |
| } | |
| } | |
| } | |
| } | |
| }); | |
| ``` | |
| --- | |
| ## Troubleshooting | |
| ### Common Issues | |
| **1. Empty series returned** | |
| - Check if providers have data in the time range | |
| - Verify provider names are correct (case-sensitive) | |
| - Ensure database has historical data | |
| **2. Response time > 500ms** | |
| - Check database indexes exist | |
| - Reduce `hours` parameter | |
| - Limit number of providers | |
| - Consider adding caching layer | |
| **3. 400 Bad Request on valid provider** | |
| - Verify provider is in database: `SELECT name FROM providers` | |
| - Check for typos or case mismatch | |
| - Ensure provider has not been renamed | |
| **4. Missing data points (gaps in series)** | |
| - Normal behavior: gaps filled with zeros/999.0 | |
| - Check data collection scheduler is running | |
| - Review logs for collection failures | |
| --- | |
| ## Changelog | |
| ### v1.0.0 - 2025-11-11 | |
| **Added:** | |
| - `/api/charts/rate-limit-history` endpoint | |
| - `/api/charts/freshness-history` endpoint | |
| - Comprehensive input validation | |
| - Security hardening (allow-list, clamping, sanitization) | |
| - Automated test suite (pytest) | |
| - CLI sanity check script | |
| - Full API documentation | |
| **Security:** | |
| - SQL injection prevention | |
| - XSS prevention | |
| - Parameter validation and clamping | |
| - Allow-list enforcement for providers | |
| - Max provider limit (5) | |
| **Testing:** | |
| - 20+ automated tests | |
| - Schema validation tests | |
| - Security tests | |
| - Performance tests | |
| - Edge case coverage | |
| --- | |
| ## Future Enhancements | |
| ### Phase 2 (Optional) | |
| 1. **Provider Picker UI Component** | |
| - Dropdown with multi-select (max 5) | |
| - Persist selection in localStorage | |
| - Auto-refresh on selection change | |
| 2. **Advanced Filtering** | |
| - Filter by category | |
| - Filter by rate limit status (ok/warning/critical) | |
| - Filter by freshness status (fresh/aging/stale) | |
| 3. **Aggregation Options** | |
| - Category-level aggregation | |
| - System-wide average/percentile | |
| - Compare providers side-by-side | |
| 4. **Export Functionality** | |
| - CSV export | |
| - JSON export | |
| - PNG/SVG chart export | |
| 5. **Real-time Updates** | |
| - WebSocket streaming for live updates | |
| - Auto-refresh without flicker | |
| - Smooth transitions on new data | |
| 6. **Historical Analysis** | |
| - Trend detection (improving/degrading) | |
| - Anomaly detection | |
| - Predictive alerts | |
| --- | |
| ## Support & Maintenance | |
| ### Code Location | |
| - Endpoints: `api/endpoints.py` (lines 947-1250) | |
| - Tests: `tests/test_charts.py` | |
| - Sanity checks: `tests/sanity_checks.sh` | |
| - Documentation: `CHARTS_VALIDATION_DOCUMENTATION.md` | |
| ### Contact | |
| For issues or questions: | |
| - Create GitHub issue with `[charts]` prefix | |
| - Tag: `enhancement`, `bug`, or `documentation` | |
| - Provide: Request details, expected vs actual behavior, logs | |
| --- | |
| ## License | |
| Same as parent project. | |