Auto Provider Loader (APL) - Usage Guide
Version: 1.0
Last Updated: 2025-11-16
Status: PRODUCTION READY
Overview
The Auto Provider Loader (APL) is a real-data-only system that automatically discovers, validates, and integrates cryptocurrency data providers (both HTTP APIs and Hugging Face models) into your application.
Key Features
- Automatic Discovery - Scans JSON resources for provider definitions
- Real Validation - Tests each provider with actual API calls (NO MOCKS)
- Smart Integration - Automatically adds valid providers to config
- Comprehensive Reports - Generates detailed validation reports
- Performance Optimized - Parallel validation with configurable timeouts
- Auth Handling - Detects and handles API key requirements
Architecture
Components
provider_validator.py - Core validation engine
- Validates HTTP JSON APIs
- Validates HTTP RPC endpoints
- Validates Hugging Face models
- Handles authentication requirements
auto_provider_loader.py - Discovery and orchestration
- Scans resource files
- Coordinates validation
- Integrates valid providers
- Generates reports
Provider Types Supported
| Type | Description | Example |
|---|---|---|
| HTTP_JSON | REST APIs returning JSON | CoinGecko, CoinPaprika |
| HTTP_RPC | JSON-RPC endpoints | Ethereum nodes, BSC RPC |
| WEBSOCKET | WebSocket connections | Alchemy WS, real-time feeds |
| HF_MODEL | Hugging Face models | Sentiment analysis models |
Quick Start
1. Basic Usage
Run the APL to discover and validate all providers:
    cd /workspace
    python3 auto_provider_loader.py
This will:
- Scan api-resources/*.json for provider definitions
- Scan providers_config*.json for existing providers
- Discover HF models from backend/services/
- Validate each provider with real API calls
- Generate comprehensive reports
- Update providers_config_extended.json with valid providers
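The scanning step above can be pictured with a minimal sketch. It is not APL's actual discovery code: the helper names `iter_provider_entries` and `discover_candidates` are hypothetical, and the rule "any object with a string `base_url` is a candidate" is an assumption made for illustration.

```python
import json
from pathlib import Path
from typing import Any, Dict, Iterator, List


def iter_provider_entries(node: Any) -> Iterator[Dict[str, Any]]:
    """Yield every dict that looks like a provider definition.

    Assumption: a provider definition is any JSON object with a string
    "base_url" field, wherever it sits in the file.
    """
    if isinstance(node, dict):
        if isinstance(node.get("base_url"), str):
            yield node
        for value in node.values():
            yield from iter_provider_entries(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_provider_entries(item)


def discover_candidates(workspace_root: str) -> List[Dict[str, Any]]:
    """Collect candidates from api-resources/*.json under workspace_root."""
    candidates: List[Dict[str, Any]] = []
    for path in sorted(Path(workspace_root).glob("api-resources/*.json")):
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            continue  # unreadable or malformed file: skip it
        for entry in iter_provider_entries(data):
            candidates.append({"source": path.name, "data": entry})
    return candidates
```

Calling `discover_candidates("/workspace")` would return one dictionary per candidate, with the raw definition under the `data` key.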
2. Understanding Output
    ================================================================================
    AUTO PROVIDER LOADER (APL) - REAL DATA ONLY
    ================================================================================
    PHASE 1: DISCOVERY
    Found 339 HTTP provider candidates
    Found 4 HF model candidates
    PHASE 2: VALIDATION
    ✅ Valid providers
    ❌ Invalid providers
    ⚠️ Conditionally available (requires auth)
    PHASE 3: COMPUTING STATISTICS
    PHASE 4: INTEGRATION
    PHASE 5: GENERATING REPORTS
3. Generated Files
After running APL, you'll find:
- PROVIDER_AUTO_DISCOVERY_REPORT.md - Human-readable report
- PROVIDER_AUTO_DISCOVERY_REPORT.json - Machine-readable detailed results
- providers_config_extended.backup.{timestamp}.json - Config backup
- providers_config_extended.json - Updated with new valid providers
Validation Logic
HTTP Providers
For each HTTP provider, APL:
1. Checks URL structure
   - Detects placeholder variables ({API_KEY}, {PROJECT_ID})
   - Identifies WebSocket endpoints (ws://, wss://)
2. Determines endpoint type
   - JSON REST API → GET request to test endpoint
   - JSON-RPC → POST request with eth_blockNumber method
3. Makes real test call
   - 8-second timeout
   - Handles redirects
   - Validates response format
4. Classifies result
   - ✅ VALID - Responds with 200 OK and valid data
   - ❌ INVALID - Connection fails, times out, or returns an error response
   - ⚠️ CONDITIONALLY_AVAILABLE - Requires API key (401/403)
   - ⏭️ SKIPPED - WebSocket (requires separate validation)
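The URL checks and the final classification can be expressed as a few pure functions. This is a sketch under stated assumptions, not the code in provider_validator.py: the function names are hypothetical, and `status_code=None` is used here to stand for a connection failure or timeout. The JSON-RPC body follows the JSON-RPC 2.0 request shape for `eth_blockNumber`.

```python
import re
from typing import Any, Dict, List, Optional

# Matches {API_KEY}, {PROJECT_ID} and similar placeholders in a URL.
PLACEHOLDER_RE = re.compile(r"\{([A-Za-z_][A-Za-z0-9_]*)\}")


def find_placeholders(url: str) -> List[str]:
    """Return the placeholder names that still need a value."""
    return PLACEHOLDER_RE.findall(url)


def is_websocket(url: str) -> bool:
    """True for ws:// and wss:// endpoints."""
    return url.lower().startswith(("ws://", "wss://"))


def build_rpc_probe() -> Dict[str, Any]:
    """JSON-RPC 2.0 request body asking for the latest block number."""
    return {"jsonrpc": "2.0", "method": "eth_blockNumber", "params": [], "id": 1}


def classify(url: str, status_code: Optional[int], body_ok: bool = False) -> str:
    """Map a test-call outcome to one of the four statuses.

    status_code is None when the call never produced an HTTP response
    (assumption: connection error or timeout).
    """
    if is_websocket(url):
        return "SKIPPED"
    if status_code in (401, 403):
        return "CONDITIONALLY_AVAILABLE"
    if status_code == 200 and body_ok:
        return "VALID"
    return "INVALID"
```

Keeping classification separate from the network call makes the decision table easy to review: everything that is not a WebSocket, an auth failure, or a clean 200 with usable data falls through to INVALID.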
Hugging Face Models
For each HF model, APL:
1. Queries HF Hub API
   - Checks if model exists: GET https://huggingface.co/api/models/{model_id}
   - Does NOT download or load the full model (saves time/resources)
2. Validates accessibility
   - ✅ VALID - Model found and publicly accessible
   - ⚠️ CONDITIONALLY_AVAILABLE - Requires HF_TOKEN
   - ❌ INVALID - Model not found (404) or other error
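A lightweight existence check of this kind might look like the sketch below, using only the standard library. The function names are hypothetical, and treating HTTP 401/403 as "requires HF_TOKEN" is an assumption carried over from the HTTP provider rules; the guide does not say which status codes APL maps to CONDITIONALLY_AVAILABLE for models.

```python
import urllib.error
import urllib.request
from typing import Optional

HF_API_BASE = "https://huggingface.co/api/models"


def hf_model_url(model_id: str) -> str:
    """Build the Hub metadata URL for a model such as 'org/model'."""
    return f"{HF_API_BASE}/{model_id}"


def classify_hf_status(status_code: Optional[int]) -> str:
    """Map an HTTP status (None = no response) to an APL-style status."""
    if status_code == 200:
        return "VALID"
    if status_code in (401, 403):  # assumption: auth failure means a token is needed
        return "CONDITIONALLY_AVAILABLE"
    return "INVALID"


def probe_hf_model(model_id: str, token: Optional[str] = None,
                   timeout: float = 8.0) -> str:
    """Fetch model metadata only; the model weights are never downloaded."""
    request = urllib.request.Request(hf_model_url(model_id))
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return classify_hf_status(response.status)
    except urllib.error.HTTPError as exc:
        return classify_hf_status(exc.code)
    except OSError:
        # URLError and socket timeouts are OSError subclasses.
        return classify_hf_status(None)
```

`probe_hf_model` performs a real network request, so it is best called from the same environment APL runs in.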
Configuration
Environment Variables
APL respects these environment variables:
| Variable | Purpose | Default |
|---|---|---|
| HF_TOKEN | Hugging Face API token | None |
| ETHERSCAN_API_KEY | Etherscan API key | None |
| BSCSCAN_API_KEY | BSCScan API key | None |
| INFURA_PROJECT_ID | Infura project ID | None |
| ALCHEMY_API_KEY | Alchemy API key | None |
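Before a run it can be useful to see which of these variables are unset. The helper below is a hypothetical pre-flight check, not part of APL; it only reads the variable names listed in the table.

```python
import os
from typing import List, Mapping, Optional

# Variable names taken from the table above.
KNOWN_VARS = (
    "HF_TOKEN",
    "ETHERSCAN_API_KEY",
    "BSCSCAN_API_KEY",
    "INFURA_PROJECT_ID",
    "ALCHEMY_API_KEY",
)


def missing_credentials(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the known variables that are unset or blank.

    Pass a mapping to check something other than the real environment.
    """
    source = os.environ if env is None else env
    return [name for name in KNOWN_VARS if not source.get(name, "").strip()]
```

Providers that depend on a missing variable are the ones most likely to come back as CONDITIONALLY_AVAILABLE.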
Validation Timeout
Default timeout is 8 seconds. To customize:
    import asyncio

    from auto_provider_loader import AutoProviderLoader

    apl = AutoProviderLoader()
    apl.validator.timeout = 15.0  # 15 seconds
    asyncio.run(apl.run())  # run() is a coroutine, so it needs an event loop
Adding New Provider Sources
1. Add to JSON Resources
Create or update a JSON file in api-resources/:
    {
      "registry": {
        "my_providers": [
          {
            "id": "my_api",
            "name": "My API",
            "category": "market_data",
            "base_url": "https://api.example.com/v1",
            "endpoints": {
              "prices": "/prices"
            },
            "auth": {
              "type": "none"
            }
          }
        ]
      }
    }
2. Re-run APL
    python3 auto_provider_loader.py
APL will automatically discover and validate your new provider.
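A quick shape check on a new entry can catch typos before spending a validation run on it. The sketch below is hypothetical: the guide does not state which fields APL requires, so the required-field list and the endpoint-path rule are assumptions based on the example entry.

```python
from typing import Any, Dict, List
from urllib.parse import urlparse

# Assumption: these fields, seen in the example entry, are treated as required.
REQUIRED_FIELDS = ("id", "name", "category", "base_url")
ALLOWED_SCHEMES = ("http", "https", "ws", "wss")


def check_provider_entry(entry: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; empty means the entry looks fine."""
    problems: List[str] = []
    for field in REQUIRED_FIELDS:
        value = entry.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty field: {field}")

    base_url = entry.get("base_url")
    if isinstance(base_url, str) and base_url.strip():
        parsed = urlparse(base_url)
        if parsed.scheme not in ALLOWED_SCHEMES or not parsed.netloc:
            problems.append(f"base_url is not an absolute URL: {base_url}")

    endpoints = entry.get("endpoints", {})
    if not isinstance(endpoints, dict):
        problems.append("endpoints must be an object")
    else:
        for name, path in endpoints.items():
            if not isinstance(path, str) or not path.startswith("/"):
                problems.append(f"endpoint {name!r} should be a path starting with '/'")
    return problems
```

An empty list only means the entry is well-formed; whether the provider actually responds is still decided by APL's real test call.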
Integration with Existing Code
Using Validated Providers
After APL runs, valid providers are in providers_config_extended.json:
    import json

    # Load validated providers
    with open('providers_config_extended.json', 'r') as f:
        config = json.load(f)

    # Get all valid providers
    valid_providers = config['providers']

    # Use a specific provider
    coingecko = valid_providers['coingecko']
    print(f"Provider: {coingecko['name']}")
    print(f"Category: {coingecko['category']}")
    print(f"Response time: {coingecko['response_time_ms']}ms")
Filtering by Category
    # Get all market data providers
    market_providers = {
        pid: data for pid, data in valid_providers.items()
        if data.get('category') == 'market_data'
    }
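Category filtering combines naturally with the recorded response times, for example to pick the quickest providers in a category. This is a sketch with a hypothetical helper name; it assumes each provider entry carries the `category` and `response_time_ms` fields used in the examples above and ignores entries without a numeric timing.

```python
from typing import Any, Dict, List, Tuple


def fastest_in_category(
    providers: Dict[str, Dict[str, Any]],
    category: str,
    limit: int = 3,
) -> List[Tuple[str, float]]:
    """Return up to `limit` (provider_id, response_time_ms) pairs, fastest first."""
    timed = [
        (pid, float(data["response_time_ms"]))
        for pid, data in providers.items()
        if data.get("category") == category
        and isinstance(data.get("response_time_ms"), (int, float))
    ]
    return sorted(timed, key=lambda item: item[1])[:limit]
```

With the `valid_providers` dictionary loaded earlier, `fastest_in_category(valid_providers, "market_data")` would give a short ranked list to try first.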
Conditional Providers
Providers marked as CONDITIONALLY_AVAILABLE require API keys:
1. Check Requirements
See PROVIDER_AUTO_DISCOVERY_REPORT.md for required env vars:
    ### Conditionally Available Providers (90)
    - **Etherscan** (`etherscan_primary`)
      - Required: `ETHERSCAN_PRIMARY_API_KEY` environment variable
      - Reason: HTTP 401 - Requires authentication
2. Set Environment Variables
    export ETHERSCAN_API_KEY="your_key_here"
    export BSCSCAN_API_KEY="your_key_here"
3. Re-run Validation
    python3 auto_provider_loader.py
Providers that were previously CONDITIONALLY_AVAILABLE will now validate as VALID if the keys are correct.
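For URLs that carry placeholders such as {API_KEY}, supplying a key amounts to substituting a value into the URL before the test call. The guide does not describe how APL maps environment variables to placeholders, so the sketch below takes an explicit mapping and is purely illustrative; `fill_placeholders` is a hypothetical name.

```python
import re
from typing import List, Mapping, Tuple

# Upper-case placeholders such as {API_KEY} or {PROJECT_ID}.
PLACEHOLDER_RE = re.compile(r"\{([A-Z][A-Z0-9_]*)\}")


def fill_placeholders(url: str, values: Mapping[str, str]) -> Tuple[str, List[str]]:
    """Substitute known placeholders and report the ones left unfilled."""
    missing: List[str] = []

    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        value = values.get(name)
        if not value:
            missing.append(name)
            return match.group(0)  # leave the placeholder untouched
        return value

    return PLACEHOLDER_RE.sub(substitute, url), missing
```

A non-empty `missing` list is a hint that the provider will stay conditional until the corresponding key is supplied.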
Performance Tuning
Parallel Validation
HTTP providers are validated in batches of 10 to balance speed and resource usage:
    # In auto_provider_loader.py
    batch_size = 10  # Adjust based on your needs
- Larger batches finish sooner but put more concurrent load on the network.
- Smaller batches are slower but more conservative.
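Batched validation of this kind is commonly written with asyncio.gather, one batch at a time. The sketch below shows that general pattern; it is not APL's implementation, and `run_in_batches` and its `worker` parameter are hypothetical names.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


async def run_in_batches(
    items: Sequence[T],
    worker: Callable[[T], Awaitable[R]],
    batch_size: int = 10,
) -> List[R]:
    """Run `worker` over `items`, at most `batch_size` at a time.

    Each batch runs concurrently; the next batch starts only after the
    previous one has finished. Results keep the order of `items`.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    results: List[R] = []
    for start in range(0, len(items), batch_size):
        batch = items[start:start + batch_size]
        results.extend(await asyncio.gather(*(worker(item) for item in batch)))
    return results
```

Because a batch waits for its slowest member, one provider that runs into the timeout holds up the whole batch, which is why timeout and batch size are best tuned together.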
Timeout Adjustment
For slow or distant APIs:
    from provider_validator import ProviderValidator

    validator = ProviderValidator(timeout=15.0)  # 15 seconds
Troubleshooting
Issue: Many providers marked INVALID
Possible causes:
- Network connectivity issues
- Rate limiting (try again later)
- Providers genuinely down
Solution: Check the individual error reasons in PROVIDER_AUTO_DISCOVERY_REPORT.md
Issue: All providers CONDITIONALLY_AVAILABLE
Cause: Most providers require API keys
Solution: Set required environment variables
Issue: HF models all INVALID
Causes:
- No internet connection to HuggingFace
- Models moved or renamed
- Rate limiting from HF Hub
Solution: Check HF Hub status, verify model IDs
Issue: Validation takes too long
Solutions:
- Reduce batch size
- Decrease timeout
- Filter providers before validation
Advanced Usage
Validating Specific Providers
    import asyncio

    from provider_validator import ProviderValidator


    async def validate_one():
        validator = ProviderValidator()
        result = await validator.validate_http_provider(
            "coingecko",
            {
                "name": "CoinGecko",
                "category": "market_data",
                "base_url": "https://api.coingecko.com/api/v3",
                "endpoints": {"ping": "/ping"}
            }
        )
        print(f"Status: {result.status}")
        print(f"Response time: {result.response_time_ms}ms")


    asyncio.run(validate_one())
Custom Discovery Logic
    import asyncio

    from auto_provider_loader import AutoProviderLoader


    class CustomAPL(AutoProviderLoader):
        def discover_http_providers(self):
            # Start from the default discovery results
            providers = super().discover_http_providers()
            # Filter or augment: keep only providers explicitly flagged as free
            return [p for p in providers if p['data'].get('free') is True]


    apl = CustomAPL()
    asyncio.run(apl.run())  # run() is a coroutine, so it needs an event loop
API Reference
ProviderValidator
    class ProviderValidator:
        def __init__(self, timeout: float = 10.0): ...

        async def validate_http_provider(
            self,
            provider_id: str,
            provider_data: Dict[str, Any]
        ) -> ValidationResult: ...

        async def validate_hf_model(
            self,
            model_id: str,
            model_name: str,
            pipeline_tag: str = "sentiment-analysis"
        ) -> ValidationResult: ...

        def get_summary(self) -> Dict[str, Any]: ...
AutoProviderLoader
    class AutoProviderLoader:
        def __init__(self, workspace_root: str = "/workspace"): ...
        def discover_http_providers(self) -> List[Dict[str, Any]]: ...
        def discover_hf_models(self) -> List[Dict[str, Any]]: ...
        async def validate_all_http_providers(self, providers: List): ...
        async def validate_all_hf_models(self, models: List): ...
        def integrate_valid_providers(self) -> Dict[str, Any]: ...
        def generate_reports(self): ...
        async def run(self): ...  # Main entry point
Best Practices
Regular Re-validation
- Run APL weekly to catch provider changes
- Providers can go offline or change endpoints
Monitor Conditional Providers
- Set up API keys for high-value providers
- Track which providers need auth
Review Reports
- Check invalid providers for patterns
- Update configs based on error reasons
Backup Configs
- APL creates automatic backups
- Keep manual backups before major changes
Test Integration
- After APL runs, test your application
- Verify new providers work in your context
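For the manual backups mentioned under Backup Configs, a timestamped copy next to the original is usually enough. The sketch below mirrors the providers_config_extended.backup.{timestamp}.json naming from Generated Files; the timestamp format and the helper name `backup_config` are assumptions, since the guide does not specify them.

```python
import shutil
import time
from pathlib import Path


def backup_config(config_path: str) -> Path:
    """Copy a config file to <stem>.backup.<timestamp><suffix> in the same folder."""
    source = Path(config_path)
    stamp = time.strftime("%Y%m%d_%H%M%S")  # assumed format, e.g. 20251116_093000
    target = source.with_name(f"{source.stem}.backup.{stamp}{source.suffix}")
    shutil.copy2(source, target)  # copy2 also preserves file timestamps
    return target
```

Running `backup_config("providers_config_extended.json")` before a major change leaves a copy that can simply be renamed back if the new configuration misbehaves.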
Zero Mock/Fake Data Guarantee
APL NEVER uses mock or fake data.
- All validations are REAL API calls
- All response times are ACTUAL measurements
- All status classifications are based on REAL responses
- Invalid providers are GENUINELY unreachable
- Valid providers are GENUINELY functional
This guarantee ensures:
- Production-ready validation results
- Accurate performance metrics
- Trustworthy provider recommendations
- No surprises in production
Support
Documentation
- PROVIDER_AUTO_DISCOVERY_REPORT.md - Latest validation results
- APL_FINAL_SUMMARY.md - Implementation summary
- This guide - Usage instructions
Common Questions
Q: Can I use APL in CI/CD?
A: Yes! Run python3 auto_provider_loader.py in your pipeline.
Q: How often should I run APL?
A: Weekly for production, daily for development.
Q: Can I add custom provider types?
A: Yes, extend the ProviderValidator class with new validation methods.
Q: Does APL support GraphQL APIs?
A: Not yet, but you can extend it by adding GraphQL validation logic.
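As a starting point for such an extension, a GraphQL endpoint can be probed with the smallest valid query, `{ __typename }`, sent as a JSON POST body. The two helpers below are a hypothetical sketch of the request body and the response check only; they are not part of APL, and wiring them into ProviderValidator is left open.

```python
from typing import Any, Dict


def build_graphql_probe() -> Dict[str, str]:
    """Minimal GraphQL request body: ask for the name of the root query type."""
    return {"query": "{ __typename }"}


def graphql_response_ok(payload: Any) -> bool:
    """True when the decoded JSON response carries data and no errors."""
    if not isinstance(payload, dict):
        return False
    data = payload.get("data")
    return isinstance(data, dict) and "__typename" in data and not payload.get("errors")
```

A response that passes `graphql_response_ok` could then be mapped to VALID in the same way a 200 response with valid data is for REST endpoints.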
Version History
v1.0 (2025-11-16)
- Initial release
- HTTP JSON validation
- HTTP RPC validation
- HF model validation (API-based, lightweight)
- Automatic discovery from JSON resources
- Comprehensive reporting
- Zero mock data guarantee
Auto Provider Loader - Real Data Only, Always.