Really-amin's picture
Upload 317 files
eebf5c4 verified

Cryptocurrency Data Aggregator - Complete Rewrite

A production-ready cryptocurrency data aggregation application with AI-powered analysis, real-time data collection, and an interactive Gradio dashboard.

Features

Core Capabilities

  • Real-time Price Tracking: Monitor top 100 cryptocurrencies with live updates
  • AI-Powered Sentiment Analysis: Using HuggingFace models for news sentiment
  • Market Analysis: Technical indicators (MA, RSI), trend detection, predictions
  • News Aggregation: RSS feeds from CoinDesk, Cointelegraph, Bitcoin.com, and Reddit
  • Interactive Dashboard: 6-tab Gradio interface with auto-refresh
  • SQLite Database: Persistent storage with full CRUD operations
  • No API Keys Required: Uses only free data sources

Data Sources (All Free, No Authentication)

  • CoinGecko API: Market data, prices, rankings
  • CoinCap API: Backup price data source
  • Binance Public API: Real-time trading data
  • Alternative.me: Fear & Greed Index
  • RSS Feeds: CoinDesk, Cointelegraph, Bitcoin Magazine, Decrypt, Bitcoinist
  • Reddit: r/cryptocurrency, r/bitcoin, r/ethtrader, r/cryptomarkets

AI Models (HuggingFace - Local Inference)

  • cardiffnlp/twitter-roberta-base-sentiment-latest: Social media sentiment
  • ProsusAI/finbert: Financial news sentiment
  • facebook/bart-large-cnn: News summarization

Project Structure

crypto-dt-source/
β”œβ”€β”€ config.py          # Configuration constants
β”œβ”€β”€ database.py        # SQLite database with CRUD operations
β”œβ”€β”€ collectors.py      # Data collection from all sources
β”œβ”€β”€ ai_models.py       # HuggingFace model integration
β”œβ”€β”€ utils.py           # Helper functions and utilities
β”œβ”€β”€ app.py             # Main Gradio application
β”œβ”€β”€ requirements.txt   # Python dependencies
β”œβ”€β”€ README.md          # This file
β”œβ”€β”€ data/
β”‚   β”œβ”€β”€ database/      # SQLite database files
β”‚   └── backups/       # Database backups
└── logs/
    └── crypto_aggregator.log  # Application logs

Installation

Prerequisites

  • Python 3.8 or higher
  • 4GB+ RAM (for AI models)
  • Internet connection

Step 1: Clone Repository

git clone <repository-url>
cd crypto-dt-source

Step 2: Install Dependencies

pip install -r requirements.txt

This will install:

  • Gradio (web interface)
  • Pandas, NumPy (data processing)
  • Transformers, PyTorch (AI models)
  • Plotly (charts)
  • BeautifulSoup4, Feedparser (web scraping)
  • And more...

Step 3: Run Application

python app.py

The application will:

  1. Initialize the SQLite database
  2. Load AI models (first run may take 2-3 minutes)
  3. Start background data collection
  4. Launch Gradio interface

Access the dashboard at: http://localhost:7860

Gradio Dashboard

Tab 1: Live Dashboard πŸ“Š

  • Top 100 cryptocurrencies with real-time prices
  • Columns: Rank, Name, Symbol, Price, 24h Change, Volume, Market Cap
  • Auto-refresh every 30 seconds
  • Search and filter functionality
  • Color-coded price changes (green/red)

Tab 2: Historical Charts πŸ“ˆ

  • Select any cryptocurrency
  • Choose timeframe: 1d, 7d, 30d, 90d, 1y, All
  • Interactive Plotly charts with:
    • Price line chart
    • Volume bars
    • MA(7) and MA(30) overlays
    • RSI indicator
  • Export charts as PNG

Tab 3: News & Sentiment πŸ“°

  • Latest cryptocurrency news from 9+ sources
  • Filter by sentiment: All, Positive, Neutral, Negative
  • Filter by coin: BTC, ETH, etc.
  • Each article shows:
    • Title (clickable link)
    • Source and date
    • AI-generated sentiment score
    • Summary
    • Related coins
  • Market sentiment gauge (0-100 scale)

Tab 4: AI Analysis πŸ€–

  • Select cryptocurrency
  • Generate AI-powered analysis:
    • Current trend (Bullish/Bearish/Neutral)
    • Support/Resistance levels
    • Technical indicators (RSI, MA7, MA30)
    • 24-72h prediction
    • Confidence score
  • Analysis saved to database for history

Tab 5: Database Explorer πŸ—„οΈ

  • Pre-built SQL queries:
    • Top 10 gainers in last 24h
    • All positive sentiment news
    • Price history for any coin
    • Database statistics
  • Custom SQL query support (read-only for security)
  • Export results to CSV

Tab 6: Data Sources Status πŸ”

  • Real-time status monitoring:
    • CoinGecko API βœ“
    • CoinCap API βœ“
    • Binance API βœ“
    • RSS feeds (5 sources) βœ“
    • Reddit endpoints (4 subreddits) βœ“
    • Database connection βœ“
  • Shows: Status (🟒/πŸ”΄), Last Update, Error Count
  • Manual refresh and data collection controls
  • Error log viewer

Database Schema

prices Table

  • id: Primary key
  • symbol: Coin symbol (e.g., "bitcoin")
  • name: Full name (e.g., "Bitcoin")
  • price_usd: Current price in USD
  • volume_24h: 24-hour trading volume
  • market_cap: Market capitalization
  • percent_change_1h, percent_change_24h, percent_change_7d: Price changes
  • rank: Market cap rank
  • timestamp: Record timestamp

news Table

  • id: Primary key
  • title: News article title
  • summary: AI-generated summary
  • url: Article URL (unique)
  • source: Source name (e.g., "CoinDesk")
  • sentiment_score: Float (-1 to 1)
  • sentiment_label: Label (positive/negative/neutral)
  • related_coins: JSON array of coin symbols
  • published_date: Original publication date
  • timestamp: Record timestamp

market_analysis Table

  • id: Primary key
  • symbol: Coin symbol
  • timeframe: Analysis period
  • trend: Trend direction (Bullish/Bearish/Neutral)
  • support_level, resistance_level: Price levels
  • prediction: Text prediction
  • confidence: Confidence score (0-1)
  • timestamp: Analysis timestamp

user_queries Table

  • id: Primary key
  • query: SQL query or search term
  • result_count: Number of results
  • timestamp: Query timestamp

Configuration

Edit config.py to customize:

# Data collection intervals
COLLECTION_INTERVALS = {
    "price_data": 300,     # 5 minutes
    "news_data": 1800,     # 30 minutes
    "sentiment_data": 1800 # 30 minutes
}

# Number of coins to track
TOP_COINS_LIMIT = 100

# Gradio settings
GRADIO_SERVER_PORT = 7860
AUTO_REFRESH_INTERVAL = 30  # seconds

# Cache settings
CACHE_TTL = 300  # 5 minutes
CACHE_MAX_SIZE = 1000

# Logging
LOG_LEVEL = "INFO"
LOG_FILE = "logs/crypto_aggregator.log"

API Usage Examples

Collect Data Manually

from collectors import collect_price_data, collect_news_data

# Collect latest prices
success, count = collect_price_data()
print(f"Collected {count} prices")

# Collect news
count = collect_news_data()
print(f"Collected {count} articles")

Query Database

from database import get_database

db = get_database()

# Get latest prices
prices = db.get_latest_prices(limit=10)

# Get news by coin
news = db.get_news_by_coin("bitcoin", limit=5)

# Get top gainers
gainers = db.get_top_gainers(limit=10)

AI Analysis

from ai_models import analyze_sentiment, analyze_market_trend
from database import get_database

# Analyze sentiment
result = analyze_sentiment("Bitcoin hits new all-time high!")
print(result)  # {'label': 'positive', 'score': 0.95, 'confidence': 0.92}

# Analyze market trend
db = get_database()
history = db.get_price_history("bitcoin", hours=168)
analysis = analyze_market_trend(history)
print(analysis)  # {'trend': 'Bullish', 'support_level': 50000, ...}

Error Handling & Resilience

Fallback Mechanisms

  • If CoinGecko fails β†’ CoinCap is used
  • If both APIs fail β†’ cached database data is used
  • If AI models fail to load β†’ keyword-based sentiment analysis
  • All network requests have timeout and retry logic

Data Validation

  • Price bounds checking (MIN_PRICE to MAX_PRICE)
  • Volume and market cap validation
  • Duplicate prevention (unique URLs for news)
  • SQL injection prevention (read-only queries only)

Logging

All operations are logged to logs/crypto_aggregator.log:

  • Info: Successful operations, data collection
  • Warning: API failures, retries
  • Error: Database errors, critical failures

Performance Optimization

  • Async/Await: All network requests use aiohttp
  • Connection Pooling: Reused HTTP connections
  • Caching: In-memory cache with 5-minute TTL
  • Batch Inserts: Minimum 100 records per database insert
  • Indexed Queries: Database indexes on symbol, timestamp, sentiment
  • Lazy Loading: AI models load only when first used

Troubleshooting

Issue: Models won't load

Solution: Ensure you have 4GB+ RAM. Models download on first run (2-3 min).

Issue: No data appearing

Solution: Wait 5 minutes for initial data collection, or click "Refresh" buttons.

Issue: Port 7860 already in use

Solution: Change GRADIO_SERVER_PORT in config.py or kill existing process.

Issue: Database locked

Solution: Only one process can write at a time. Close other instances.

Issue: RSS feeds failing

Solution: Some feeds may be temporarily down. Check Tab 6 for status.

Development

Running Tests

# Test data collection
python collectors.py

# Test AI models
python ai_models.py

# Test utilities
python utils.py

# Test database
python database.py

Adding New Data Sources

Edit collectors.py:

def collect_new_source():
    try:
        response = safe_api_call("https://api.example.com/data")
        # Parse and save data
        return True
    except Exception as e:
        logger.error(f"Error: {e}")
        return False

Add to scheduler in collectors.py:

# In schedule_data_collection()
threading.Timer(interval, collect_new_source).start()

Validation Checklist

  • All 8 files complete
  • No TODO or FIXME comments
  • No placeholder functions
  • All imports in requirements.txt
  • Database schema matches specification
  • All 6 Gradio tabs implemented
  • All 3 AI models integrated
  • All 5+ data sources configured
  • Error handling in every network call
  • Logging for all major operations
  • No API keys in code
  • Comments in English
  • PEP 8 compliant

License

MIT License - Free to use, modify, and distribute.

Support

For issues or questions:

  • Check logs: logs/crypto_aggregator.log
  • Review error messages in Tab 6
  • Ensure all dependencies installed: pip install -r requirements.txt

Credits

  • Data Sources: CoinGecko, CoinCap, Binance, Alternative.me, CoinDesk, Cointelegraph, Reddit
  • AI Models: HuggingFace (Cardiff NLP, ProsusAI, Facebook)
  • Framework: Gradio

Made with ❀️ for the Crypto Community