Spaces:

Really-amin
/

Datasourceforcryptocurrency

Running

App Files Files Community

Datasourceforcryptocurrency / APL_USAGE_GUIDE.md

Really-amin

Upload 325 files

9d92c17 verified 7 days ago

preview code

raw

history blame contribute delete

12.4 kB

Auto Provider Loader (APL) - Usage Guide

Version: 1.0
Last Updated: 2025-11-16
Status: PRODUCTION READY ✅

Overview

The Auto Provider Loader (APL) is a real-data-only system that automatically discovers, validates, and integrates cryptocurrency data providers (both HTTP APIs and Hugging Face models) into your application.

Key Features

🔍 Automatic Discovery - Scans JSON resources for provider definitions
✅ Real Validation - Tests each provider with actual API calls (NO MOCKS)
🔧 Smart Integration - Automatically adds valid providers to config
📊 Comprehensive Reports - Generates detailed validation reports
⚡ Performance Optimized - Parallel validation with configurable timeouts
🛡️ Auth Handling - Detects and handles API key requirements

Architecture

Components

provider_validator.py - Core validation engine
- Validates HTTP JSON APIs
- Validates HTTP RPC endpoints
- Validates Hugging Face models
- Handles authentication requirements
auto_provider_loader.py - Discovery and orchestration
- Scans resource files
- Coordinates validation
- Integrates valid providers
- Generates reports

Provider Types Supported

Type	Description	Example
`HTTP_JSON`	REST APIs returning JSON	CoinGecko, CoinPaprika
`HTTP_RPC`	JSON-RPC endpoints	Ethereum nodes, BSC RPC
`WEBSOCKET`	WebSocket connections	Alchemy WS, real-time feeds
`HF_MODEL`	Hugging Face models	Sentiment analysis models

Quick Start

1. Basic Usage

Run the APL to discover and validate all providers:

cd /workspace
python3 auto_provider_loader.py

This will:

Scan api-resources/*.json for provider definitions
Scan providers_config*.json for existing providers
Discover HF models from backend/services/
Validate each provider with real API calls
Generate comprehensive reports
Update providers_config_extended.json with valid providers

2. Understanding Output

================================================================================
🚀 AUTO PROVIDER LOADER (APL) - REAL DATA ONLY
================================================================================

📡 PHASE 1: DISCOVERY
  Found 339 HTTP provider candidates
  Found 4 HF model candidates

🔬 PHASE 2: VALIDATION
  ✅ Valid providers
  ❌ Invalid providers
  ⚠️  Conditionally available (requires auth)

📊 PHASE 3: COMPUTING STATISTICS
🔧 PHASE 4: INTEGRATION
📝 PHASE 5: GENERATING REPORTS

3. Generated Files

After running APL, you'll find:

PROVIDER_AUTO_DISCOVERY_REPORT.md - Human-readable report
PROVIDER_AUTO_DISCOVERY_REPORT.json - Machine-readable detailed results
providers_config_extended.backup.{timestamp}.json - Config backup
providers_config_extended.json - Updated with new valid providers

Validation Logic

HTTP Providers

For each HTTP provider, APL:

Checks URL structure
- Detects placeholder variables ({API_KEY}, {PROJECT_ID})
- Identifies WebSocket endpoints (ws://, wss://)
Determines endpoint type
- JSON REST API → GET request to test endpoint
- JSON-RPC → POST request with eth_blockNumber method
Makes real test call
- 8-second timeout
- Handles redirects
- Validates response format
Classifies result
- ✅ VALID - Responds with 200 OK and valid data
- ❌ INVALID - Connection fails, timeout, or error response
- ⚠️ CONDITIONALLY_AVAILABLE - Requires API key (401/403)
- ⏭️ SKIPPED - WebSocket (requires separate validation)

Hugging Face Models

For each HF model, APL:

Queries HF Hub API
- Checks if model exists: GET https://huggingface.co/api/models/{model_id}
- Does NOT download or load the full model (saves time/resources)
Validates accessibility
- ✅ VALID - Model found and publicly accessible
- ⚠️ CONDITIONALLY_AVAILABLE - Requires HF_TOKEN
- ❌ INVALID - Model not found (404) or other error

Configuration

Environment Variables

APL respects these environment variables:

Variable	Purpose	Default
`HF_TOKEN`	Hugging Face API token	None
`ETHERSCAN_API_KEY`	Etherscan API key	None
`BSCSCAN_API_KEY`	BSCScan API key	None
`INFURA_PROJECT_ID`	Infura project ID	None
`ALCHEMY_API_KEY`	Alchemy API key	None

Validation Timeout

Default timeout is 8 seconds. To customize:

from auto_provider_loader import AutoProviderLoader

apl = AutoProviderLoader()
apl.validator.timeout = 15.0  # 15 seconds
await apl.run()

Adding New Provider Sources

1. Add to JSON Resources

Create or update a JSON file in api-resources/:

{
  "registry": {
    "my_providers": [
      {
        "id": "my_api",
        "name": "My API",
        "category": "market_data",
        "base_url": "https://api.example.com/v1",
        "endpoints": {
          "prices": "/prices"
        },
        "auth": {
          "type": "none"
        }
      }
    ]
  }
}

2. Re-run APL

python3 auto_provider_loader.py

APL will automatically discover and validate your new provider.

Integration with Existing Code

Using Validated Providers

After APL runs, valid providers are in providers_config_extended.json:

import json

# Load validated providers
with open('providers_config_extended.json', 'r') as f:
    config = json.load(f)

# Get all valid providers
valid_providers = config['providers']

# Use a specific provider
coingecko = valid_providers['coingecko']
print(f"Provider: {coingecko['name']}")
print(f"Category: {coingecko['category']}")
print(f"Response time: {coingecko['response_time_ms']}ms")

Filtering by Category

# Get all market data providers
market_providers = {
    pid: data for pid, data in valid_providers.items()
    if data.get('category') == 'market_data'
}

Conditional Providers

Providers marked as CONDITIONALLY_AVAILABLE require API keys:

1. Check Requirements

See PROVIDER_AUTO_DISCOVERY_REPORT.md for required env vars:

### Conditionally Available Providers (90)

- **Etherscan** (`etherscan_primary`)
  - Required: `ETHERSCAN_PRIMARY_API_KEY` environment variable
  - Reason: HTTP 401 - Requires authentication

2. Set Environment Variables

export ETHERSCAN_API_KEY="your_key_here"
export BSCSCAN_API_KEY="your_key_here"

3. Re-run Validation

python3 auto_provider_loader.py

Previously conditional providers will now validate as VALID if keys are correct.

Performance Tuning

Parallel Validation

HTTP providers are validated in batches of 10 to balance speed and resource usage:

# In auto_provider_loader.py
batch_size = 10  # Adjust based on your needs

Larger batches = faster but more network load
Smaller batches = slower but more conservative

Timeout Adjustment

For slow or distant APIs:

validator = ProviderValidator(timeout=15.0)  # 15 seconds

Troubleshooting

Issue: Many providers marked INVALID

Possible causes:

Network connectivity issues
Rate limiting (try again later)
Providers genuinely down

Solution: Check individual error reasons in report

Issue: All providers CONDITIONALLY_AVAILABLE

Cause: Most providers require API keys

Solution: Set required environment variables

Issue: HF models all INVALID

Causes:

No internet connection to HuggingFace
Models moved or renamed
Rate limiting from HF Hub

Solution: Check HF Hub status, verify model IDs

Issue: Validation takes too long

Solutions:

Reduce batch size
Decrease timeout
Filter providers before validation

Advanced Usage

Validating Specific Providers

from provider_validator import ProviderValidator
import asyncio

async def validate_one():
    validator = ProviderValidator()
    
    result = await validator.validate_http_provider(
        "coingecko",
        {
            "name": "CoinGecko",
            "category": "market_data",
            "base_url": "https://api.coingecko.com/api/v3",
            "endpoints": {"ping": "/ping"}
        }
    )
    
    print(f"Status: {result.status}")
    print(f"Response time: {result.response_time_ms}ms")

asyncio.run(validate_one())

Custom Discovery Logic

from auto_provider_loader import AutoProviderLoader

class CustomAPL(AutoProviderLoader):
    def discover_http_providers(self):
        # Your custom logic
        providers = super().discover_http_providers()
        # Filter or augment
        return [p for p in providers if p['data'].get('free') == True]

apl = CustomAPL()
await apl.run()

API Reference

ProviderValidator

class ProviderValidator:
    def __init__(self, timeout: float = 10.0)
    
    async def validate_http_provider(
        provider_id: str,
        provider_data: Dict[str, Any]
    ) -> ValidationResult
    
    async def validate_hf_model(
        model_id: str,
        model_name: str,
        pipeline_tag: str = "sentiment-analysis"
    ) -> ValidationResult
    
    def get_summary() -> Dict[str, Any]

AutoProviderLoader

class AutoProviderLoader:
    def __init__(self, workspace_root: str = "/workspace")
    
    def discover_http_providers() -> List[Dict[str, Any]]
    def discover_hf_models() -> List[Dict[str, Any]]
    
    async def validate_all_http_providers(providers: List)
    async def validate_all_hf_models(models: List)
    
    def integrate_valid_providers() -> Dict[str, Any]
    def generate_reports()
    
    async def run()  # Main entry point

Best Practices

Regular Re-validation
- Run APL weekly to catch provider changes
- Providers can go offline or change endpoints
Monitor Conditional Providers
- Set up API keys for high-value providers
- Track which providers need auth
Review Reports
- Check invalid providers for patterns
- Update configs based on error reasons
Backup Configs
- APL creates automatic backups
- Keep manual backups before major changes
Test Integration
- After APL runs, test your application
- Verify new providers work in your context

Zero Mock/Fake Data Guarantee

APL NEVER uses mock or fake data.

All validations are REAL API calls
All response times are ACTUAL measurements
All status classifications based on REAL responses
Invalid providers are GENUINELY unreachable
Valid providers are GENUINELY functional

This guarantee ensures:

Production-ready validation results
Accurate performance metrics
Trustworthy provider recommendations
No surprises in production

Support

Documentation

PROVIDER_AUTO_DISCOVERY_REPORT.md - Latest validation results
APL_FINAL_SUMMARY.md - Implementation summary
This guide - Usage instructions

Common Questions

Q: Can I use APL in CI/CD?
A: Yes! Run python3 auto_provider_loader.py in your pipeline.

Q: How often should I run APL?
A: Weekly for production, daily for development.

Q: Can I add custom provider types?
A: Yes, extend ProviderValidator class with new validation methods.

Q: Does APL support GraphQL APIs?
A: Not yet, but you can extend it by adding GraphQL validation logic.

Version History

v1.0 (2025-11-16)

Initial release
HTTP JSON validation
HTTP RPC validation
HF model validation (API-based, lightweight)
Automatic discovery from JSON resources
Comprehensive reporting
Zero mock data guarantee

Auto Provider Loader - Real Data Only, Always.