File size: 16,313 Bytes
48ae4e0
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
# Charts Validation & Hardening Documentation

## Overview

This document provides comprehensive documentation for the newly implemented chart endpoints with validation and security hardening.

## New Endpoints

### 1. `/api/charts/rate-limit-history`

**Purpose:** Retrieve hourly rate limit usage history for visualization in charts.

**Method:** `GET`

**Parameters:**

| Parameter | Type | Required | Default | Constraints | Description |
|-----------|------|----------|---------|-------------|-------------|
| `hours` | integer | No | 24 | 1-168 | Hours of history to retrieve (clamped server-side) |
| `providers` | string | No | top 5 | max 5, comma-separated | Provider names to include |

**Response Schema:**

```json
[
  {
    "provider": "coingecko",
    "hours": 24,
    "series": [
      {
        "t": "2025-11-10T13:00:00Z",
        "pct": 42.5
      },
      {
        "t": "2025-11-10T14:00:00Z",
        "pct": 38.2
      }
    ],
    "meta": {
      "limit_type": "per_minute",
      "limit_value": 30
    }
  }
]
```

**Response Fields:**

- `provider` (string): Provider name
- `hours` (integer): Number of hours covered
- `series` (array): Time series data points
  - `t` (string): ISO 8601 timestamp with 'Z' suffix
  - `pct` (number): Rate limit usage percentage [0-100]
- `meta` (object): Rate limit metadata
  - `limit_type` (string): Type of limit (per_second, per_minute, per_hour, per_day)
  - `limit_value` (integer|null): Limit value, null if no limit configured

**Behavior:**

- Returns one series object per provider
- Each series contains exactly `hours` data points (one per hour)
- Hours without data are filled with `pct: 0.0`
- If provider has no rate limit configured, returns `meta.limit_value: null` and `pct: 0`
- Default: Returns up to 5 providers with configured rate limits
- Series ordered chronologically (oldest to newest)

**Examples:**

```bash
# Default: Last 24 hours, top 5 providers
curl "http://localhost:7860/api/charts/rate-limit-history"

# Custom: 48 hours, specific providers
curl "http://localhost:7860/api/charts/rate-limit-history?hours=48&providers=coingecko,cmc,etherscan"

# Single provider, 1 week
curl "http://localhost:7860/api/charts/rate-limit-history?hours=168&providers=binance"
```

**Error Responses:**

- `400 Bad Request`: Invalid provider name
  ```json
  {
    "detail": "Invalid provider name: invalid_xyz. Must be one of: ..."
  }
  ```
- `422 Unprocessable Entity`: Invalid parameter type
- `500 Internal Server Error`: Database or processing error

---

### 2. `/api/charts/freshness-history`

**Purpose:** Retrieve hourly data freshness/staleness history for visualization.

**Method:** `GET`

**Parameters:**

| Parameter | Type | Required | Default | Constraints | Description |
|-----------|------|----------|---------|-------------|-------------|
| `hours` | integer | No | 24 | 1-168 | Hours of history to retrieve (clamped server-side) |
| `providers` | string | No | top 5 | max 5, comma-separated | Provider names to include |

**Response Schema:**

```json
[
  {
    "provider": "coingecko",
    "hours": 24,
    "series": [
      {
        "t": "2025-11-10T13:00:00Z",
        "staleness_min": 7.2,
        "ttl_min": 15,
        "status": "fresh"
      },
      {
        "t": "2025-11-10T14:00:00Z",
        "staleness_min": 999.0,
        "ttl_min": 15,
        "status": "stale"
      }
    ],
    "meta": {
      "category": "market_data",
      "default_ttl": 1
    }
  }
]
```

**Response Fields:**

- `provider` (string): Provider name
- `hours` (integer): Number of hours covered
- `series` (array): Time series data points
  - `t` (string): ISO 8601 timestamp with 'Z' suffix
  - `staleness_min` (number): Data staleness in minutes (999.0 indicates no data)
  - `ttl_min` (integer): TTL threshold for this provider's category
  - `status` (string): Derived status: "fresh", "aging", or "stale"
- `meta` (object): Provider metadata
  - `category` (string): Provider category
  - `default_ttl` (integer): Default TTL for category (minutes)

**Status Derivation:**

```
fresh:  staleness_min <= ttl_min
aging:  ttl_min < staleness_min <= ttl_min * 2
stale:  staleness_min > ttl_min * 2  OR  no data (999.0)
```

**TTL by Category:**

| Category | TTL (minutes) |
|----------|---------------|
| market_data | 1 |
| blockchain_explorers | 5 |
| defi | 10 |
| news | 15 |
| default | 5 |

**Behavior:**

- Returns one series object per provider
- Each series contains exactly `hours` data points (one per hour)
- Hours without data are marked with `staleness_min: 999.0` and `status: "stale"`
- Default: Returns up to 5 most active providers
- Series ordered chronologically (oldest to newest)

**Examples:**

```bash
# Default: Last 24 hours, top 5 providers
curl "http://localhost:7860/api/charts/freshness-history"

# Custom: 72 hours, specific providers
curl "http://localhost:7860/api/charts/freshness-history?hours=72&providers=coingecko,binance"

# Single provider, 3 days
curl "http://localhost:7860/api/charts/freshness-history?hours=72&providers=etherscan"
```

**Error Responses:**

- `400 Bad Request`: Invalid provider name
- `422 Unprocessable Entity`: Invalid parameter type
- `500 Internal Server Error`: Database or processing error

---

## Security & Validation

### Input Validation

1. **Hours Parameter:**
   - Server-side clamping: `1 <= hours <= 168`
   - Invalid types rejected with `422 Unprocessable Entity`
   - Out-of-range values automatically clamped (no error)

2. **Providers Parameter:**
   - Allow-list enforcement: Only valid provider names accepted
   - Max 5 providers enforced (excess silently truncated)
   - Invalid names trigger `400 Bad Request` with detailed error
   - SQL injection prevention: No raw SQL, parameterized queries only
   - XSS prevention: Input sanitized (strip whitespace)

3. **Rate Limiting (Recommended):**
   - Implement: 60 requests/minute per IP for chart routes
   - Use middleware or reverse proxy (nginx/cloudflare)

### Security Measures Implemented

βœ“ Allow-list validation for provider names
βœ“ Parameter clamping (hours: 1-168)
βœ“ Max provider limit (5)
βœ“ SQL injection prevention (ORM with parameterized queries)
βœ“ XSS prevention (input sanitization)
βœ“ Comprehensive error handling with safe error messages
βœ“ Logging of all chart requests for monitoring
βœ“ No sensitive data exposure in responses

### Edge Cases Handled

- Empty provider list β†’ Returns default providers
- Unknown provider β†’ 400 with valid options listed
- Hours out of bounds β†’ Clamped to [1, 168]
- No data available β†’ Returns empty series or 999.0 staleness
- Provider with no rate limit β†’ Returns null limit_value
- Whitespace in provider names β†’ Trimmed automatically
- Mixed valid/invalid providers β†’ Rejects entire request

---

## Testing

### Automated Tests

Run the comprehensive test suite:

```bash
# Run all chart tests
pytest tests/test_charts.py -v

# Run specific test class
pytest tests/test_charts.py::TestRateLimitHistory -v

# Run with coverage
pytest tests/test_charts.py --cov=api --cov-report=html
```

**Test Coverage:**

- βœ“ Default parameter behavior
- βœ“ Custom time ranges (48h, 72h)
- βœ“ Provider selection and filtering
- βœ“ Response schema validation
- βœ“ Percentage range validation [0-100]
- βœ“ Timestamp format validation
- βœ“ Status derivation logic
- βœ“ Edge cases (invalid providers, hours clamping)
- βœ“ Security (SQL injection, XSS prevention)
- βœ“ Performance (response time < 500ms)
- βœ“ Concurrent request handling

### Manual Sanity Checks

Run the CLI sanity check script:

```bash
# Ensure backend is running
python app.py &

# Run sanity checks
./tests/sanity_checks.sh
```

**Checks performed:**

1. Rate limit history (default params)
2. Freshness history (default params)
3. Custom time ranges
4. Response schema validation
5. Invalid provider rejection
6. Hours parameter clamping
7. Performance measurement
8. Edge case handling

---

## Performance Targets

### Response Time (P95)

| Environment | Target | Conditions |
|-------------|--------|------------|
| Production | < 200ms | 24h / 5 providers |
| Development | < 500ms | 24h / 5 providers |

### Optimization Strategies

1. **Database Indexing:**
   - Indexed: `timestamp`, `provider_id` columns
   - Composite indexes on frequently queried combinations

2. **Query Optimization:**
   - Hourly bucketing done in-memory (fast)
   - Limited to 168 hours max (1 week)
   - Provider limit enforced early (max 5)

3. **Caching (Future Enhancement):**
   - Consider Redis cache for 1-minute TTL
   - Cache key: `chart:type:hours:providers`
   - Invalidate on new data ingestion

4. **Connection Pooling:**
   - SQLAlchemy pool size: 10
   - Max overflow: 20
   - Recycle connections every 3600s

---

## Observability & Monitoring

### Logging

All chart requests are logged with:

```json
{
  "timestamp": "2025-11-11T01:00:00Z",
  "level": "INFO",
  "logger": "api_endpoints",
  "message": "Rate limit history: 3 providers, 48h"
}
```

### Recommended Metrics (Prometheus/Grafana)

```python
# Counter: Total requests per endpoint
chart_requests_total{endpoint="rate_limit_history"} 1523

# Histogram: Response time distribution
chart_response_time_seconds{endpoint="rate_limit_history", le="0.1"} 1450
chart_response_time_seconds{endpoint="rate_limit_history", le="0.2"} 1510

# Gauge: Current rate limit usage per provider
ratelimit_usage_pct{provider="coingecko"} 87.5

# Gauge: Freshness staleness per provider
freshness_staleness_min{provider="binance"} 3.2

# Counter: Invalid request count
chart_invalid_requests_total{endpoint="rate_limit_history", reason="invalid_provider"} 23
```

### Recommended Alerts

```yaml
# Critical: Rate limit exhaustion
- alert: RateLimitExhaustion
  expr: ratelimit_usage_pct > 90
  for: 3h
  annotations:
    summary: "Provider {{ $labels.provider }} at {{ $value }}% rate limit"
    action: "Add API keys or reduce request frequency"

# Critical: Data staleness
- alert: DataStale
  expr: freshness_staleness_min > ttl_min
  for: 15m
  annotations:
    summary: "Provider {{ $labels.provider }} data is stale ({{ $value }}m old)"
    action: "Check scheduler, verify API connectivity"

# Warning: Chart endpoint slow
- alert: ChartEndpointSlow
  expr: histogram_quantile(0.95, chart_response_time_seconds) > 0.2
  for: 10m
  annotations:
    summary: "Chart endpoint P95 latency above 200ms"
    action: "Check database query performance"
```

---

## Database Schema

### Tables Used

**RateLimitUsage**
```sql
CREATE TABLE rate_limit_usage (
    id INTEGER PRIMARY KEY,
    timestamp DATETIME NOT NULL,  -- INDEXED
    provider_id INTEGER NOT NULL,  -- FOREIGN KEY, INDEXED
    limit_type VARCHAR(20),
    limit_value INTEGER,
    current_usage INTEGER,
    percentage REAL,
    reset_time DATETIME
);
```

**DataCollection**
```sql
CREATE TABLE data_collection (
    id INTEGER PRIMARY KEY,
    provider_id INTEGER NOT NULL,  -- FOREIGN KEY, INDEXED
    actual_fetch_time DATETIME NOT NULL,
    data_timestamp DATETIME,
    staleness_minutes REAL,
    record_count INTEGER,
    on_schedule BOOLEAN
);
```

---

## Frontend Integration

### Chart.js Example (Rate Limit)

```javascript
// Fetch rate limit history
const response = await fetch('/api/charts/rate-limit-history?hours=48&providers=coingecko,cmc');
const data = await response.json();

// Build Chart.js dataset
const datasets = data.map(series => ({
    label: series.provider,
    data: series.series.map(p => ({
        x: new Date(p.t),
        y: p.pct
    })),
    borderColor: getColorForProvider(series.provider),
    tension: 0.3
}));

// Create chart
new Chart(ctx, {
    type: 'line',
    data: { datasets },
    options: {
        scales: {
            x: { type: 'time', time: { unit: 'hour' } },
            y: { min: 0, max: 100, title: { text: 'Usage %' } }
        },
        interaction: { mode: 'index', intersect: false },
        plugins: {
            legend: { display: true, position: 'bottom' },
            tooltip: {
                callbacks: {
                    label: ctx => `${ctx.dataset.label}: ${ctx.parsed.y.toFixed(1)}%`
                }
            }
        }
    }
});
```

### Chart.js Example (Freshness)

```javascript
// Fetch freshness history
const response = await fetch('/api/charts/freshness-history?hours=72&providers=binance');
const data = await response.json();

// Build datasets with status-based colors
const datasets = data.map(series => ({
    label: series.provider,
    data: series.series.map(p => ({
        x: new Date(p.t),
        y: p.staleness_min,
        status: p.status
    })),
    borderColor: getColorForProvider(series.provider),
    segment: {
        borderColor: ctx => {
            const point = ctx.p1.$context.raw;
            return point.status === 'fresh' ? 'green'
                 : point.status === 'aging' ? 'orange'
                 : 'red';
        }
    }
}));

// Create chart with TTL reference line
new Chart(ctx, {
    type: 'line',
    data: { datasets },
    options: {
        scales: {
            x: { type: 'time' },
            y: { title: { text: 'Staleness (min)' } }
        },
        plugins: {
            annotation: {
                annotations: {
                    ttl: {
                        type: 'line',
                        yMin: data[0].meta.default_ttl,
                        yMax: data[0].meta.default_ttl,
                        borderColor: 'rgba(255, 99, 132, 0.5)',
                        borderWidth: 2,
                        label: { content: 'TTL Threshold', enabled: true }
                    }
                }
            }
        }
    }
});
```

---

## Troubleshooting

### Common Issues

**1. Empty series returned**

- Check if providers have data in the time range
- Verify provider names are correct (case-sensitive)
- Ensure database has historical data

**2. Response time > 500ms**

- Check database indexes exist
- Reduce `hours` parameter
- Limit number of providers
- Consider adding caching layer

**3. 400 Bad Request on valid provider**

- Verify provider is in database: `SELECT name FROM providers`
- Check for typos or case mismatch
- Ensure provider has not been renamed

**4. Missing data points (gaps in series)**

- Normal behavior: gaps filled with zeros/999.0
- Check data collection scheduler is running
- Review logs for collection failures

---

## Changelog

### v1.0.0 - 2025-11-11

**Added:**
- `/api/charts/rate-limit-history` endpoint
- `/api/charts/freshness-history` endpoint
- Comprehensive input validation
- Security hardening (allow-list, clamping, sanitization)
- Automated test suite (pytest)
- CLI sanity check script
- Full API documentation

**Security:**
- SQL injection prevention
- XSS prevention
- Parameter validation and clamping
- Allow-list enforcement for providers
- Max provider limit (5)

**Testing:**
- 20+ automated tests
- Schema validation tests
- Security tests
- Performance tests
- Edge case coverage

---

## Future Enhancements

### Phase 2 (Optional)

1. **Provider Picker UI Component**
   - Dropdown with multi-select (max 5)
   - Persist selection in localStorage
   - Auto-refresh on selection change

2. **Advanced Filtering**
   - Filter by category
   - Filter by rate limit status (ok/warning/critical)
   - Filter by freshness status (fresh/aging/stale)

3. **Aggregation Options**
   - Category-level aggregation
   - System-wide average/percentile
   - Compare providers side-by-side

4. **Export Functionality**
   - CSV export
   - JSON export
   - PNG/SVG chart export

5. **Real-time Updates**
   - WebSocket streaming for live updates
   - Auto-refresh without flicker
   - Smooth transitions on new data

6. **Historical Analysis**
   - Trend detection (improving/degrading)
   - Anomaly detection
   - Predictive alerts

---

## Support & Maintenance

### Code Location

- Endpoints: `api/endpoints.py` (lines 947-1250)
- Tests: `tests/test_charts.py`
- Sanity checks: `tests/sanity_checks.sh`
- Documentation: `CHARTS_VALIDATION_DOCUMENTATION.md`

### Contact

For issues or questions:
- Create GitHub issue with `[charts]` prefix
- Tag: `enhancement`, `bug`, or `documentation`
- Provide: Request details, expected vs actual behavior, logs

---

## License

Same as parent project.