Datasourceforcryptocurrency / docs /components /CRYPTO_DATA_BANK_README.md
Really-amin's picture
Upload 301 files
e4e4574 verified

🏦 Crypto Data Bank - بانک اطلاعاتی قدرتمند رمزارز

📋 Overview | نمای کلی

Crypto Data Bank is a powerful cryptocurrency data aggregation system running on HuggingFace Spaces that acts as an intelligent gateway between data consumers and 200+ free data sources.

بانک اطلاعاتی رمزارز یک سیستم قدرتمند جمع‌آوری داده که روی HuggingFace Spaces اجرا می‌شود و به عنوان دروازه‌ای هوشمند بین مصرف‌کنندگان داده و بیش از 200 منبع رایگان عمل می‌کند.

🎯 Key Features | ویژگی‌های کلیدی

100% FREE Data Sources - No API keys required for basic functionality ✅ Real-time Price Data - From 5+ free providers (CoinCap, CoinGecko, Binance, Kraken, CryptoCompare) ✅ News Aggregation - 8+ RSS feeds (CoinTelegraph, CoinDesk, Bitcoin Magazine, etc.) ✅ Market Sentiment - Fear & Greed Index, BTC Dominance, Global Stats ✅ HuggingFace AI Models - Sentiment analysis with FinBERT, categorization with BART ✅ Intelligent Caching - Database-backed caching for fast responses ✅ Background Collection - Continuous data gathering in the background ✅ REST API Gateway - FastAPI-based API with automatic documentation


🏗️ Architecture | معماری

┌─────────────────────────────────────────────────────────────────┐
│                    API Gateway (FastAPI)                        │
│                  http://localhost:8888                          │
│                                                                 │
│  Endpoints:                                                     │
│  • /api/prices          - Real-time cryptocurrency prices      │
│  • /api/news            - Aggregated crypto news               │
│  • /api/sentiment       - Market sentiment analysis            │
│  • /api/market/overview - Complete market overview             │
│  • /api/trending        - Trending coins from news             │
│  • /api/ai/analysis     - AI-powered analysis                  │
└─────────────────────────────────────────────────────────────────┘
                               ↕
┌─────────────────────────────────────────────────────────────────┐
│                    Orchestrator Layer                           │
│            (Background Data Collection)                         │
│                                                                 │
│  • Prices: Collected every 60 seconds                          │
│  • News: Collected every 5 minutes                             │
│  • Sentiment: Collected every 3 minutes                        │
└─────────────────────────────────────────────────────────────────┘
                               ↕
┌─────────────────────────────────────────────────────────────────┐
│                    Collector Layer                              │
│                                                                 │
│  ┌─────────────────┐  ┌─────────────────┐  ┌────────────────┐ │
│  │ Price Collector │  │ News Collector  │  │   Sentiment    │ │
│  │   (5 sources)   │  │   (8 sources)   │  │   Collector    │ │
│  └─────────────────┘  └─────────────────┘  └────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
                               ↕
┌─────────────────────────────────────────────────────────────────┐
│                    AI Analysis Layer                            │
│               (HuggingFace Models)                              │
│                                                                 │
│  • FinBERT - Financial sentiment analysis                      │
│  • BART-MNLI - News categorization                             │
│  • Aggregated sentiment calculation                            │
└─────────────────────────────────────────────────────────────────┘
                               ↕
┌─────────────────────────────────────────────────────────────────┐
│                    Database Layer (SQLite)                      │
│                                                                 │
│  Tables:                                                        │
│  • prices - Historical price data                              │
│  • ohlcv - Candlestick data                                    │
│  • news - News articles with AI analysis                       │
│  • market_sentiment - Sentiment indicators                     │
│  • ai_analysis - AI model outputs                              │
│  • api_cache - Response caching                                │
└─────────────────────────────────────────────────────────────────┘
                               ↕
┌─────────────────────────────────────────────────────────────────┐
│                    Free Data Sources                            │
│                                                                 │
│  Price Sources (NO API KEY):                                   │
│  • CoinCap.io          • CoinGecko (free tier)                 │
│  • Binance Public API  • Kraken Public API                     │
│  • CryptoCompare       • Alternative.me (F&G)                  │
│                                                                 │
│  News Sources (RSS Feeds):                                     │
│  • CoinTelegraph       • CoinDesk                              │
│  • Bitcoin Magazine    • Decrypt                               │
│  • The Block           • CryptoPotato                          │
│  • NewsBTC             • Bitcoinist                            │
└─────────────────────────────────────────────────────────────────┘

📂 Project Structure | ساختار پروژه

crypto_data_bank/
├── __init__.py                 # Package initialization
├── database.py                 # SQLite database layer
├── orchestrator.py             # Data collection orchestrator
├── api_gateway.py              # Main FastAPI gateway
├── requirements.txt            # Python dependencies
│
├── collectors/                 # Data collectors
│   ├── __init__.py
│   ├── free_price_collector.py    # FREE price collection (5 sources)
│   ├── rss_news_collector.py      # RSS news aggregation (8 feeds)
│   └── sentiment_collector.py     # Market sentiment collection
│
└── ai/                         # AI/ML components
    ├── __init__.py
    └── huggingface_models.py      # HuggingFace model integration

🚀 Quick Start | راه‌اندازی سریع

1. Install Dependencies | نصب وابستگی‌ها

cd crypto_data_bank
pip install -r requirements.txt

2. Start the API Gateway | راه‌اندازی API Gateway

python api_gateway.py

The server will start on http://localhost:8888

3. Access the API | دسترسی به API

Interactive Documentation:

Example API Calls:

# Get latest prices
curl http://localhost:8888/api/prices?symbols=BTC,ETH,SOL

# Get crypto news
curl http://localhost:8888/api/news?limit=10

# Get market sentiment
curl http://localhost:8888/api/sentiment

# Get market overview
curl http://localhost:8888/api/market/overview

# Get trending coins
curl http://localhost:8888/api/trending

📊 API Endpoints | نقاط پایانی API

Core Endpoints

GET /

Root endpoint with API information

GET /api/health

Health check and system status

GET /api/stats

Complete database and collection statistics

Price Endpoints

GET /api/prices

Get cryptocurrency prices

Parameters:

  • symbols (optional): Comma-separated symbols (e.g., BTC,ETH,SOL)
  • limit (default: 100): Number of results
  • force_refresh (default: false): Force fresh data collection

Example:

GET /api/prices?symbols=BTC,ETH&limit=10&force_refresh=true

Response:

{
  "success": true,
  "source": "live_collection",
  "count": 2,
  "data": [
    {
      "symbol": "BTC",
      "price": 50000.00,
      "change24h": 2.5,
      "volume24h": 25000000000,
      "marketCap": 980000000000,
      "sources_count": 5,
      "sources": ["coincap", "coingecko", "binance", "kraken", "cryptocompare"]
    }
  ],
  "timestamp": "2024-11-14T10:30:00"
}

GET /api/prices/{symbol}

Get single crypto with price history

Parameters:

  • history_hours (default: 24): Hours of price history

News Endpoints

GET /api/news

Get cryptocurrency news

Parameters:

  • limit (default: 50): Number of news items
  • category (optional): Filter by category
  • coin (optional): Filter by coin symbol
  • force_refresh (default: false): Force fresh collection

Example:

GET /api/news?coin=BTC&limit=20

GET /api/trending

Get trending coins based on news mentions

Sentiment Endpoints

GET /api/sentiment

Get market sentiment analysis

Response:

{
  "success": true,
  "data": {
    "fear_greed": {
      "fear_greed_value": 65,
      "fear_greed_classification": "Greed"
    },
    "btc_dominance": {
      "btc_dominance": 48.5
    },
    "overall_sentiment": {
      "overall_sentiment": "Greed",
      "sentiment_score": 62.5,
      "confidence": 0.85
    }
  }
}

GET /api/market/overview

Complete market overview with prices, sentiment, and news

AI Analysis Endpoints

GET /api/ai/analysis

Get AI analyses from database

Parameters:

  • symbol (optional): Filter by symbol
  • limit (default: 50): Number of results

POST /api/ai/analyze/news

Analyze news sentiment with AI

Parameters:

  • text: News text to analyze

Response:

{
  "success": true,
  "analysis": {
    "sentiment": "bullish",
    "confidence": 0.92,
    "model": "finbert"
  }
}

Collection Control Endpoints

POST /api/collection/start

Start background data collection

POST /api/collection/stop

Stop background data collection

GET /api/collection/status

Get collection status


🤖 HuggingFace AI Models | مدل‌های هوش مصنوعی

FinBERT - Sentiment Analysis

  • Model: ProsusAI/finbert
  • Purpose: Financial sentiment analysis of news
  • Output: bullish / bearish / neutral
  • Use Case: Analyze crypto news sentiment

BART-MNLI - Zero-Shot Classification

  • Model: facebook/bart-large-mnli
  • Purpose: News categorization
  • Categories: price_movement, regulation, technology, adoption, security, defi, nft, etc.
  • Use Case: Automatically categorize news articles

Simple Analyzer (Fallback)

  • Method: Keyword-based sentiment
  • Use Case: When transformers not available
  • Performance: Fast but less accurate

💾 Database Schema | ساختار دیتابیس

prices Table

Stores real-time cryptocurrency prices

Columns:

  • id: Primary key
  • symbol: Crypto symbol (BTC, ETH, etc.)
  • price: Current price in USD
  • change_1h, change_24h, change_7d: Price changes
  • volume_24h: 24-hour trading volume
  • market_cap: Market capitalization
  • rank: Market cap rank
  • source: Data source
  • timestamp: Collection time

news Table

Stores crypto news articles

Columns:

  • id: Primary key
  • title: News title
  • description: News description
  • url: Article URL (unique)
  • source: News source
  • published_at: Publication date
  • sentiment: AI sentiment score
  • coins: Related cryptocurrencies (JSON)
  • category: News category

market_sentiment Table

Stores market sentiment indicators

Columns:

  • fear_greed_value: Fear & Greed Index value (0-100)
  • fear_greed_classification: Classification (Fear/Greed/etc.)
  • overall_sentiment: Calculated overall sentiment
  • sentiment_score: Aggregated sentiment score
  • confidence: Confidence level

ai_analysis Table

Stores AI model analysis results

Columns:

  • symbol: Cryptocurrency symbol
  • analysis_type: Type of analysis
  • model_used: AI model name
  • input_data: Input data (JSON)
  • output_data: Analysis output (JSON)
  • confidence: Confidence score

api_cache Table

Caches API responses for performance

Columns:

  • endpoint: API endpoint
  • params: Request parameters
  • response: Cached response (JSON)
  • ttl: Time to live (seconds)
  • expires_at: Expiration timestamp

🔄 Data Collection Flow | جریان جمع‌آوری داده

Background Collection (Auto-started)

  1. Price Collection (Every 60 seconds)

    • Fetch from 5 free sources simultaneously
    • Aggregate using median price
    • Save to database
    • Cache for fast API responses
  2. News Collection (Every 5 minutes)

    • Fetch from 8 RSS feeds
    • Deduplicate articles
    • Analyze sentiment with AI
    • Extract mentioned coins
    • Save to database
  3. Sentiment Collection (Every 3 minutes)

    • Fetch Fear & Greed Index
    • Calculate BTC dominance
    • Get global market stats
    • Aggregate overall sentiment
    • Save to database

API Request Flow

User Request
     ↓
API Gateway
     ↓
Check Database Cache
     ↓
Cache Hit? → Return Cached Data (Fast!)
     ↓
Cache Miss or force_refresh=true
     ↓
Collect Fresh Data
     ↓
Save to Database
     ↓
Return Fresh Data

📈 Performance | کارایی

Response Times

  • Cached Responses: < 50ms
  • Fresh Price Collection: 2-5 seconds
  • Fresh News Collection: 5-15 seconds
  • AI Analysis: 1-3 seconds per news item

Caching Strategy

  • Default TTL: 60 seconds for prices, 300 seconds for news
  • Database-backed: Persistent across restarts
  • Intelligent Fallback: Serves cached data if live collection fails

Resource Usage

  • Memory: ~200-500 MB (with AI models loaded)
  • CPU: Low (mostly I/O bound)
  • Disk: Grows ~1-5 MB per day (depending on collection frequency)
  • Network: Minimal (all sources are free APIs)

🌐 Data Sources | منابع داده

Price Sources (5 sources, NO API KEY)

Source URL Free Tier Rate Limit Notes
CoinCap coincap.io ✅ Unlimited None Best for market cap data
CoinGecko coingecko.com ✅ Yes 10-30/min Most comprehensive
Binance Public binance.com ✅ Yes 1200/min Real-time prices
Kraken Public kraken.com ✅ Yes 1/sec Reliable exchange data
CryptoCompare cryptocompare.com ✅ Yes 100K/month Good fallback

News Sources (8 sources, RSS feeds)

Source URL Update Frequency Quality
CoinTelegraph cointelegraph.com Every 30 min ⭐⭐⭐⭐⭐
CoinDesk coindesk.com Every hour ⭐⭐⭐⭐⭐
Bitcoin Magazine bitcoinmagazine.com Daily ⭐⭐⭐⭐
Decrypt decrypt.co Every hour ⭐⭐⭐⭐
The Block theblock.co Every hour ⭐⭐⭐⭐⭐
CryptoPotato cryptopotato.com Every 30 min ⭐⭐⭐
NewsBTC newsbtc.com Every hour ⭐⭐⭐
Bitcoinist bitcoinist.com Every hour ⭐⭐⭐

Sentiment Sources (3 sources, FREE)

Source Metric Update Quality
Alternative.me Fear & Greed Index Daily ⭐⭐⭐⭐⭐
CoinCap BTC Dominance Real-time ⭐⭐⭐⭐
CoinGecko Global Market Stats Every 10 min ⭐⭐⭐⭐⭐

🚀 Deployment to HuggingFace Spaces | استقرار در HuggingFace

Prerequisites

  1. HuggingFace account
  2. Git installed
  3. HuggingFace CLI (optional)

Steps

  1. Create New Space

  2. Clone Repository

    git clone https://huggingface.co/spaces/YOUR_USERNAME/crypto-data-bank
    cd crypto-data-bank
    
  3. Copy Files

    cp -r crypto_data_bank/* .
    
  4. Create Dockerfile (See deployment section below)

  5. Push to HuggingFace

    git add .
    git commit -m "Initial deployment"
    git push
    
  6. Configure Space

    • Set port to 8888 in Space settings
    • Enable persistence for database storage
    • Wait for build to complete
  7. Access Your Space


🐳 Docker Deployment | استقرار داکر

Dockerfile:

FROM python:3.10-slim

WORKDIR /app

# Install dependencies
COPY crypto_data_bank/requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application
COPY crypto_data_bank/ /app/

# Create data directory
RUN mkdir -p /app/data

# Expose port
EXPOSE 8888

# Run application
CMD ["python", "api_gateway.py"]

Build and Run:

# Build image
docker build -t crypto-data-bank .

# Run container
docker run -p 8888:8888 -v $(pwd)/data:/app/data crypto-data-bank

🧪 Testing | تست

Test Individual Collectors

# Test price collector
python crypto_data_bank/collectors/free_price_collector.py

# Test news collector
python crypto_data_bank/collectors/rss_news_collector.py

# Test sentiment collector
python crypto_data_bank/collectors/sentiment_collector.py

# Test AI models
python crypto_data_bank/ai/huggingface_models.py

# Test orchestrator
python crypto_data_bank/orchestrator.py

Test API Gateway

# Start server
python crypto_data_bank/api_gateway.py

# In another terminal, test endpoints
curl http://localhost:8888/api/health
curl http://localhost:8888/api/prices?symbols=BTC
curl http://localhost:8888/api/news?limit=5

📝 Configuration | پیکربندی

Collection Intervals

Edit in orchestrator.py:

self.intervals = {
    'prices': 60,     # Every 1 minute
    'news': 300,      # Every 5 minutes
    'sentiment': 180, # Every 3 minutes
}

Database Location

Edit in database.py:

def __init__(self, db_path: str = "data/crypto_bank.db"):

API Port

Edit in api_gateway.py:

uvicorn.run(
    "api_gateway:app",
    host="0.0.0.0",
    port=8888,  # Change port here
)

🔒 Security Considerations | ملاحظات امنیتی

No API Keys Stored - All data sources are free and public ✅ Read-Only Operations - Only fetches data, never modifies external sources ✅ Rate Limiting - Respects source rate limits ✅ Input Validation - Pydantic models validate all inputs ✅ SQL Injection Protection - Uses parameterized queries ✅ CORS Enabled - Configure as needed for your use case


🎓 Use Cases | موارد استفاده

1. Trading Bots

Use the API to get real-time prices and sentiment for automated trading

2. Portfolio Trackers

Build a portfolio tracker with historical price data

3. News Aggregators

Create a crypto news dashboard with AI sentiment analysis

4. Market Analysis

Analyze market trends using sentiment and price data

5. Research & Education

Study cryptocurrency market behavior and sentiment correlation


🤝 Contributing | مشارکت

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests
  5. Submit a pull request

📄 License | مجوز

Same as main project


🙏 Acknowledgments | تشکر

Data Sources:

  • CoinCap, CoinGecko, Binance, Kraken, CryptoCompare
  • Alternative.me (Fear & Greed Index)
  • CoinTelegraph, CoinDesk, and other news sources

Technologies:

  • FastAPI - Web framework
  • HuggingFace Transformers - AI models
  • SQLite - Database
  • httpx - HTTP client
  • feedparser - RSS parsing
  • BeautifulSoup - HTML parsing

AI Models:

  • ProsusAI/finbert - Financial sentiment
  • facebook/bart-large-mnli - Classification

📞 Support | پشتیبانی

Documentation: See /docs endpoint when running Issues: Report at GitHub repository Contact: Check main project README


🎉 Status | وضعیت

Version: 1.0.0 Status: ✅ Production Ready Last Updated: 2024-11-14 Deployment: Ready for HuggingFace Spaces


Built with ❤️ for the crypto community

با ❤️ برای جامعه کریپتو ساخته شده