Auto Provider Loader (APL) - Usage Guide

Version: 1.0
Last Updated: 2025-11-16
Status: PRODUCTION READY ✅


Overview

The Auto Provider Loader (APL) is a real-data-only system that automatically discovers, validates, and integrates cryptocurrency data providers (both HTTP APIs and Hugging Face models) into your application.

Key Features

  • 🔍 Automatic Discovery - Scans JSON resources for provider definitions
  • ✅ Real Validation - Tests each provider with actual API calls (NO MOCKS)
  • 🔧 Smart Integration - Automatically adds valid providers to config
  • 📊 Comprehensive Reports - Generates detailed validation reports
  • ⚡ Performance Optimized - Parallel validation with configurable timeouts
  • 🛡️ Auth Handling - Detects and handles API key requirements

Architecture

Components

  1. provider_validator.py - Core validation engine

    • Validates HTTP JSON APIs
    • Validates HTTP RPC endpoints
    • Validates Hugging Face models
    • Handles authentication requirements
  2. auto_provider_loader.py - Discovery and orchestration

    • Scans resource files
    • Coordinates validation
    • Integrates valid providers
    • Generates reports

Provider Types Supported

| Type | Description | Example |
|------|-------------|---------|
| HTTP_JSON | REST APIs returning JSON | CoinGecko, CoinPaprika |
| HTTP_RPC | JSON-RPC endpoints | Ethereum nodes, BSC RPC |
| WEBSOCKET | WebSocket connections | Alchemy WS, real-time feeds |
| HF_MODEL | Hugging Face models | Sentiment analysis models |

Quick Start

1. Basic Usage

Run the APL to discover and validate all providers:

cd /workspace
python3 auto_provider_loader.py

This will:

  • Scan api-resources/*.json for provider definitions
  • Scan providers_config*.json for existing providers
  • Discover HF models from backend/services/
  • Validate each provider with real API calls
  • Generate comprehensive reports
  • Update providers_config_extended.json with valid providers

2. Understanding Output

================================================================================
🚀 AUTO PROVIDER LOADER (APL) - REAL DATA ONLY
================================================================================

📡 PHASE 1: DISCOVERY
  Found 339 HTTP provider candidates
  Found 4 HF model candidates

🔬 PHASE 2: VALIDATION
  ✅ Valid providers
  ❌ Invalid providers
  ⚠️  Conditionally available (requires auth)

📊 PHASE 3: COMPUTING STATISTICS
🔧 PHASE 4: INTEGRATION
📝 PHASE 5: GENERATING REPORTS

3. Generated Files

After running APL, you'll find:

  • PROVIDER_AUTO_DISCOVERY_REPORT.md - Human-readable report
  • PROVIDER_AUTO_DISCOVERY_REPORT.json - Machine-readable detailed results
  • providers_config_extended.backup.{timestamp}.json - Config backup
  • providers_config_extended.json - Updated with new valid providers

Validation Logic

HTTP Providers

For each HTTP provider, APL:

  1. Checks URL structure

    • Detects placeholder variables ({API_KEY}, {PROJECT_ID})
    • Identifies WebSocket endpoints (ws://, wss://)
  2. Determines endpoint type

    • JSON REST API → GET request to test endpoint
    • JSON-RPC → POST request with eth_blockNumber method
  3. Makes real test call

    • 8-second timeout
    • Handles redirects
    • Validates response format
  4. Classifies result

    • ✅ VALID - Responds with 200 OK and valid data
    • ❌ INVALID - Connection fails, timeout, or error response
    • ⚠️ CONDITIONALLY_AVAILABLE - Requires API key (401/403)
    • ⏭️ SKIPPED - WebSocket (requires separate validation)
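The classification steps above can be sketched as a few pure functions. This is a minimal illustration, not APL's actual code: the function names are hypothetical, and treating an unfilled placeholder as CONDITIONALLY_AVAILABLE before any request is made is an assumption. The JSON-RPC body is the standard JSON-RPC 2.0 form of an eth_blockNumber call.

```python
import re
from typing import Optional

# Matches unfilled template variables such as {API_KEY} or {PROJECT_ID}.
PLACEHOLDER_RE = re.compile(r"\{[A-Za-z_][A-Za-z0-9_]*\}")


def precheck_url(url: str) -> Optional[str]:
    """Return a status decided from the URL alone, or None if a live call is needed."""
    if url.startswith(("ws://", "wss://")):
        return "SKIPPED"  # WebSocket endpoints need separate validation
    if PLACEHOLDER_RE.search(url):
        return "CONDITIONALLY_AVAILABLE"  # assumption: a placeholder means a key is missing
    return None


def rpc_probe_payload() -> dict:
    """Build a JSON-RPC 2.0 request body for eth_blockNumber."""
    return {"jsonrpc": "2.0", "method": "eth_blockNumber", "params": [], "id": 1}


def classify_response(status_code: Optional[int], body_is_valid: bool) -> str:
    """Map a test call's outcome to a status; status_code None means no response."""
    if status_code is None:
        return "INVALID"  # connection failure or timeout
    if status_code in (401, 403):
        return "CONDITIONALLY_AVAILABLE"  # the endpoint asks for credentials
    if status_code == 200 and body_is_valid:
        return "VALID"
    return "INVALID"
```

Keeping the decision logic separate from the network call makes it easy to check each rule in isolation.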

Hugging Face Models

For each HF model, APL:

  1. Queries HF Hub API

    • Checks if model exists: GET https://huggingface.co/api/models/{model_id}
    • Does NOT download or load the full model (saves time/resources)
  2. Validates accessibility

    • ✅ VALID - Model found and publicly accessible
    • ⚠️ CONDITIONALLY_AVAILABLE - Requires HF_TOKEN
    • ❌ INVALID - Model not found (404) or other error
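A lightweight existence check of this kind might look like the sketch below, which uses only the standard library. The helper names are hypothetical, and mapping HTTP 401/403 to CONDITIONALLY_AVAILABLE is an assumption carried over from the HTTP provider rules; APL's own implementation may differ.

```python
import os
import urllib.error
import urllib.request
from typing import Optional

HF_MODEL_API = "https://huggingface.co/api/models/{model_id}"


def model_url(model_id: str) -> str:
    """Build the Hub metadata URL for a model ID such as 'org/name'."""
    return HF_MODEL_API.format(model_id=model_id)


def classify_hf_status(status_code: Optional[int]) -> str:
    """Map the Hub's HTTP status to a status; None means no response."""
    if status_code == 200:
        return "VALID"
    if status_code in (401, 403):
        return "CONDITIONALLY_AVAILABLE"  # assumption: access needs HF_TOKEN
    return "INVALID"  # 404, other errors, or no response


def check_model(model_id: str, timeout: float = 8.0) -> str:
    """Query model metadata only; the model weights are never downloaded."""
    request = urllib.request.Request(model_url(model_id))
    token = os.environ.get("HF_TOKEN")
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return classify_hf_status(response.status)
    except urllib.error.HTTPError as exc:
        return classify_hf_status(exc.code)
    except OSError:
        # Covers URLError, socket timeouts, and DNS failures.
        return classify_hf_status(None)
```

Calling check_model with a model ID performs one small metadata request, which is what keeps this kind of validation fast.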

Configuration

Environment Variables

APL respects these environment variables:

| Variable | Purpose | Default |
|----------|---------|---------|
| HF_TOKEN | Hugging Face API token | None |
| ETHERSCAN_API_KEY | Etherscan API key | None |
| BSCSCAN_API_KEY | BSCScan API key | None |
| INFURA_PROJECT_ID | Infura project ID | None |
| ALCHEMY_API_KEY | Alchemy API key | None |
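Before a run, it can help to see which of these variables are actually set. The following is a small sketch with a hypothetical helper name; it only inspects the environment and does not reflect how APL reads credentials internally.

```python
import os
from typing import Dict, Mapping

# Variable names taken from the table above.
KNOWN_KEYS = (
    "HF_TOKEN",
    "ETHERSCAN_API_KEY",
    "BSCSCAN_API_KEY",
    "INFURA_PROJECT_ID",
    "ALCHEMY_API_KEY",
)


def credential_status(env: Mapping[str, str] = os.environ) -> Dict[str, bool]:
    """Report, per variable, whether it is set to a non-blank value."""
    return {name: bool(env.get(name, "").strip()) for name in KNOWN_KEYS}


if __name__ == "__main__":
    for name, is_set in credential_status().items():
        print(f"{name}: {'set' if is_set else 'missing'}")
```

Passing a plain dict instead of os.environ makes the helper easy to exercise without touching the real environment.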

Validation Timeout

Default timeout is 8 seconds. To customize:

import asyncio

from auto_provider_loader import AutoProviderLoader

apl = AutoProviderLoader()
apl.validator.timeout = 15.0  # 15 seconds
asyncio.run(apl.run())  # run() is a coroutine, so it needs an event loop

Adding New Provider Sources

1. Add to JSON Resources

Create or update a JSON file in api-resources/:

{
  "registry": {
    "my_providers": [
      {
        "id": "my_api",
        "name": "My API",
        "category": "market_data",
        "base_url": "https://api.example.com/v1",
        "endpoints": {
          "prices": "/prices"
        },
        "auth": {
          "type": "none"
        }
      }
    ]
  }
}

2. Re-run APL

python3 auto_provider_loader.py

APL will automatically discover and validate your new provider.
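Before re-running, a resource file in the shape shown in step 1 can be sanity-checked by listing the URLs it describes. This is a sketch only: the helper name is hypothetical, and joining base_url with each endpoint path is an assumption about how the definitions are meant to be read, not a description of APL's discovery code.

```python
import json
from typing import Any, Dict, Iterator, Tuple


def iter_endpoints(resource: Dict[str, Any]) -> Iterator[Tuple[str, str, str]]:
    """Yield (provider_id, endpoint_name, full_url) for each endpoint in a registry."""
    for group in resource.get("registry", {}).values():
        if not isinstance(group, list):
            continue  # ignore anything that is not a list of providers
        for provider in group:
            base = provider.get("base_url", "").rstrip("/")
            for name, path in provider.get("endpoints", {}).items():
                yield provider.get("id", "?"), name, f"{base}/{path.lstrip('/')}"


# Same structure as the JSON example in step 1.
SAMPLE = json.loads("""
{
  "registry": {
    "my_providers": [
      {
        "id": "my_api",
        "name": "My API",
        "category": "market_data",
        "base_url": "https://api.example.com/v1",
        "endpoints": {"prices": "/prices"},
        "auth": {"type": "none"}
      }
    ]
  }
}
""")

if __name__ == "__main__":
    for provider_id, endpoint, url in iter_endpoints(SAMPLE):
        print(provider_id, endpoint, url)
```

A malformed file fails at json.loads, and a missing base_url shows up as a URL starting with a bare slash, both of which are cheaper to catch here than in a full validation run.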


Integration with Existing Code

Using Validated Providers

After APL runs, valid providers are in providers_config_extended.json:

import json

# Load validated providers
with open('providers_config_extended.json', 'r') as f:
    config = json.load(f)

# Get all valid providers
valid_providers = config['providers']

# Use a specific provider
coingecko = valid_providers['coingecko']
print(f"Provider: {coingecko['name']}")
print(f"Category: {coingecko['category']}")
print(f"Response time: {coingecko['response_time_ms']}ms")

Filtering by Category

# Get all market data providers
market_providers = {
    pid: data for pid, data in valid_providers.items()
    if data.get('category') == 'market_data'
}
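The same dictionary can be used to rank providers. The sketch below picks the provider with the lowest measured response time in a category; it assumes each entry carries the category and response_time_ms fields used in the examples above, and the function name and sample IDs are hypothetical.

```python
from typing import Any, Dict, Optional


def fastest_provider(providers: Dict[str, Dict[str, Any]], category: str) -> Optional[str]:
    """Return the ID of the fastest provider in a category, or None if there is none."""
    timings = {
        pid: data["response_time_ms"]
        for pid, data in providers.items()
        if data.get("category") == category
        and isinstance(data.get("response_time_ms"), (int, float))
    }
    return min(timings, key=timings.get) if timings else None


# Hypothetical entries, shaped like providers_config_extended.json['providers'].
example = {
    "alpha": {"category": "market_data", "response_time_ms": 120.0},
    "beta": {"category": "market_data", "response_time_ms": 85.5},
    "gamma": {"category": "news", "response_time_ms": 10.0},
    "delta": {"category": "market_data"},  # no timing recorded, so it is skipped
}
```

Response times are a snapshot from the last APL run, so a ranking like this is only as fresh as the most recent validation.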

Conditional Providers

Providers marked as CONDITIONALLY_AVAILABLE require API keys:

1. Check Requirements

See PROVIDER_AUTO_DISCOVERY_REPORT.md for required env vars:

### Conditionally Available Providers (90)

- **Etherscan** (`etherscan_primary`)
  - Required: `ETHERSCAN_PRIMARY_API_KEY` environment variable
  - Reason: HTTP 401 - Requires authentication

2. Set Environment Variables

export ETHERSCAN_API_KEY="your_key_here"
export BSCSCAN_API_KEY="your_key_here"

3. Re-run Validation

python3 auto_provider_loader.py

Previously conditional providers will now validate as VALID if keys are correct.


Performance Tuning

Parallel Validation

HTTP providers are validated in batches of 10 to balance speed and resource usage:

# In auto_provider_loader.py
batch_size = 10  # Adjust based on your needs

Larger batches finish sooner but put more concurrent load on your network and on the providers.
Smaller batches are slower but more conservative.
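Batched parallel validation can be sketched with asyncio.gather. This is an illustration of the technique under the assumption that batches run one after another; the function names are hypothetical, and demo_worker is only a stand-in for a real validation coroutine.

```python
import asyncio
import math
from typing import Awaitable, Callable, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


async def run_in_batches(
    items: Sequence[T],
    worker: Callable[[T], Awaitable[R]],
    batch_size: int = 10,
) -> List[R]:
    """Run worker over items, at most batch_size at a time, preserving input order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    results: List[R] = []
    for start in range(0, len(items), batch_size):
        batch = items[start:start + batch_size]
        # gather() runs the whole batch concurrently and returns results in order.
        results.extend(await asyncio.gather(*(worker(item) for item in batch)))
    return results


def worst_case_seconds(count: int, batch_size: int, timeout: float) -> float:
    """Upper bound on run time if every batch waits for the full timeout."""
    return math.ceil(count / batch_size) * timeout


async def demo_worker(n: int) -> int:
    """Stand-in for a real validation call; it only yields control and doubles n."""
    await asyncio.sleep(0)
    return n * 2
```

Under these assumptions, 339 candidates in batches of 10 with an 8-second timeout give worst_case_seconds(339, 10, 8.0) = 272.0 seconds, which is a useful ceiling when choosing a batch size.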

Timeout Adjustment

For slow or distant APIs:

from provider_validator import ProviderValidator

validator = ProviderValidator(timeout=15.0)  # 15 seconds

Troubleshooting

Issue: Many providers marked INVALID

Possible causes:

  • Network connectivity issues
  • Rate limiting (try again later)
  • Providers genuinely down

Solution: Check the individual error reasons in PROVIDER_AUTO_DISCOVERY_REPORT.md

Issue: All providers CONDITIONALLY_AVAILABLE

Cause: Most providers require API keys

Solution: Set the required environment variables, then re-run APL

Issue: HF models all INVALID

Causes:

  • No internet connection to HuggingFace
  • Models moved or renamed
  • Rate limiting from HF Hub

Solution: Check HF Hub status, verify model IDs

Issue: Validation takes too long

Solutions:

  • Increase the batch size (larger batches validate more providers in parallel)
  • Decrease the timeout
  • Filter providers before validation

Advanced Usage

Validating Specific Providers

from provider_validator import ProviderValidator
import asyncio

async def validate_one():
    validator = ProviderValidator()
    
    result = await validator.validate_http_provider(
        "coingecko",
        {
            "name": "CoinGecko",
            "category": "market_data",
            "base_url": "https://api.coingecko.com/api/v3",
            "endpoints": {"ping": "/ping"}
        }
    )
    
    print(f"Status: {result.status}")
    print(f"Response time: {result.response_time_ms}ms")

asyncio.run(validate_one())

Custom Discovery Logic

import asyncio

from auto_provider_loader import AutoProviderLoader

class CustomAPL(AutoProviderLoader):
    def discover_http_providers(self):
        # Start from the default discovery results
        providers = super().discover_http_providers()
        # Keep only providers explicitly flagged as free
        return [p for p in providers if p['data'].get('free') is True]

apl = CustomAPL()
asyncio.run(apl.run())  # run() is a coroutine, so it needs an event loop

API Reference

ProviderValidator

class ProviderValidator:
    def __init__(self, timeout: float = 10.0)
    
    async def validate_http_provider(
        provider_id: str,
        provider_data: Dict[str, Any]
    ) -> ValidationResult
    
    async def validate_hf_model(
        model_id: str,
        model_name: str,
        pipeline_tag: str = "sentiment-analysis"
    ) -> ValidationResult
    
    def get_summary() -> Dict[str, Any]

AutoProviderLoader

class AutoProviderLoader:
    def __init__(self, workspace_root: str = "/workspace")
    
    def discover_http_providers() -> List[Dict[str, Any]]
    def discover_hf_models() -> List[Dict[str, Any]]
    
    async def validate_all_http_providers(providers: List)
    async def validate_all_hf_models(models: List)
    
    def integrate_valid_providers() -> Dict[str, Any]
    def generate_reports()
    
    async def run()  # Main entry point

Best Practices

  1. Regular Re-validation

    • Run APL weekly to catch provider changes
    • Providers can go offline or change endpoints
  2. Monitor Conditional Providers

    • Set up API keys for high-value providers
    • Track which providers need auth
  3. Review Reports

    • Check invalid providers for patterns
    • Update configs based on error reasons
  4. Backup Configs

    • APL creates automatic backups
    • Keep manual backups before major changes
  5. Test Integration

    • After APL runs, test your application
    • Verify new providers work in your context
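For the manual backups mentioned in item 4, a timestamped copy is usually enough. The sketch below is one way to do it; the function name, the manual_backup infix, and the timestamp format are all assumptions chosen so the copy is easy to tell apart from APL's automatic backups.

```python
import shutil
import time
from pathlib import Path


def backup_config(config_path: str) -> Path:
    """Copy a config file next to itself with a timestamped name and return the copy's path."""
    source = Path(config_path)
    stamp = time.strftime("%Y%m%d_%H%M%S")  # assumed format, e.g. 20251116_093000
    target = source.with_name(f"{source.stem}.manual_backup.{stamp}{source.suffix}")
    shutil.copy2(source, target)  # copy2 also preserves file timestamps
    return target
```

Calling backup_config("providers_config_extended.json") before a major change leaves the original untouched and returns the path of the copy.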

Zero Mock/Fake Data Guarantee

APL NEVER uses mock or fake data.

  • All validations are REAL API calls
  • All response times are ACTUAL measurements
  • All status classifications are based on REAL responses
  • Invalid providers GENUINELY failed a live check (connection failure, timeout, or error response)
  • Valid providers GENUINELY returned a successful response at validation time

This guarantee ensures:

  • Production-ready validation results
  • Accurate performance metrics
  • Trustworthy provider recommendations
  • No surprises in production

Support

Documentation

  • PROVIDER_AUTO_DISCOVERY_REPORT.md - Latest validation results
  • APL_FINAL_SUMMARY.md - Implementation summary
  • This guide - Usage instructions

Common Questions

Q: Can I use APL in CI/CD?
A: Yes! Run python3 auto_provider_loader.py in your pipeline.

Q: How often should I run APL?
A: Weekly for production, daily for development.

Q: Can I add custom provider types?
A: Yes, extend the ProviderValidator class with new validation methods.

Q: Does APL support GraphQL APIs?
A: Not yet, but you can extend it by adding GraphQL validation logic.


Version History

v1.0 (2025-11-16)

  • Initial release
  • HTTP JSON validation
  • HTTP RPC validation
  • HF model validation (API-based, lightweight)
  • Automatic discovery from JSON resources
  • Comprehensive reporting
  • Zero mock data guarantee

Auto Provider Loader - Real Data Only, Always.