# Cryptocurrency Data Aggregator - Complete Rewrite

A production-ready cryptocurrency data aggregation application with AI-powered analysis, real-time data collection, and an interactive Gradio dashboard.
## Features

### Core Capabilities

- **Real-time Price Tracking**: Monitor the top 100 cryptocurrencies with live updates
- **AI-Powered Sentiment Analysis**: HuggingFace models for news sentiment
- **Market Analysis**: Technical indicators (MA, RSI), trend detection, predictions
- **News Aggregation**: RSS feeds from CoinDesk, Cointelegraph, Bitcoin.com, and Reddit
- **Interactive Dashboard**: 6-tab Gradio interface with auto-refresh
- **SQLite Database**: Persistent storage with full CRUD operations
- **No API Keys Required**: Uses only free data sources
### Data Sources (All Free, No Authentication)

- **CoinGecko API**: Market data, prices, rankings
- **CoinCap API**: Backup price data source
- **Binance Public API**: Real-time trading data
- **Alternative.me**: Fear & Greed Index
- **RSS Feeds**: CoinDesk, Cointelegraph, Bitcoin Magazine, Decrypt, Bitcoinist
- **Reddit**: r/cryptocurrency, r/bitcoin, r/ethtrader, r/cryptomarkets
### AI Models (HuggingFace - Local Inference)

- `cardiffnlp/twitter-roberta-base-sentiment-latest`: Social media sentiment
- `ProsusAI/finbert`: Financial news sentiment
- `facebook/bart-large-cnn`: News summarization
## Project Structure

```
crypto-dt-source/
├── config.py               # Configuration constants
├── database.py             # SQLite database with CRUD operations
├── collectors.py           # Data collection from all sources
├── ai_models.py            # HuggingFace model integration
├── utils.py                # Helper functions and utilities
├── app.py                  # Main Gradio application
├── requirements.txt        # Python dependencies
├── README.md               # This file
├── data/
│   ├── database/           # SQLite database files
│   └── backups/            # Database backups
└── logs/
    └── crypto_aggregator.log  # Application logs
```
## Installation

### Prerequisites

- Python 3.8 or higher
- 4GB+ RAM (for AI models)
- Internet connection
### Step 1: Clone Repository

```bash
git clone <repository-url>
cd crypto-dt-source
```

### Step 2: Install Dependencies

```bash
pip install -r requirements.txt
```
This will install:

- Gradio (web interface)
- Pandas, NumPy (data processing)
- Transformers, PyTorch (AI models)
- Plotly (charts)
- BeautifulSoup4, Feedparser (HTML and RSS feed parsing)
- And more...
### Step 3: Run Application

```bash
python app.py
```

The application will:

- Initialize the SQLite database
- Load AI models (first run may take 2-3 minutes)
- Start background data collection
- Launch the Gradio interface

Access the dashboard at `http://localhost:7860`.
## Gradio Dashboard

### Tab 1: Live Dashboard

- Top 100 cryptocurrencies with real-time prices
- Columns: Rank, Name, Symbol, Price, 24h Change, Volume, Market Cap
- Auto-refresh every 30 seconds
- Search and filter functionality
- Color-coded price changes (green/red)
### Tab 2: Historical Charts

- Select any cryptocurrency
- Choose a timeframe: 1d, 7d, 30d, 90d, 1y, All
- Interactive Plotly charts with:
  - Price line chart
  - Volume bars
  - MA(7) and MA(30) overlays
  - RSI indicator
- Export charts as PNG
### Tab 3: News & Sentiment

- Latest cryptocurrency news from 9+ sources
- Filter by sentiment: All, Positive, Neutral, Negative
- Filter by coin: BTC, ETH, etc.
- Each article shows:
  - Title (clickable link)
  - Source and date
  - AI-generated sentiment score
  - Summary
  - Related coins
- Market sentiment gauge (0-100 scale)
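The 0-100 gauge is a linear rescaling of the -1 to 1 sentiment scores stored in the database. A minimal sketch of that mapping (the function name is illustrative, not the app's actual API):

```python
def sentiment_to_gauge(score):
    """Map a sentiment score in [-1, 1] to a 0-100 gauge value."""
    clamped = max(-1.0, min(1.0, score))  # guard against out-of-range scores
    return round((clamped + 1) * 50)
```

For example, a neutral score of 0 lands at 50, while a strongly positive 0.5 maps to 75.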
### Tab 4: AI Analysis

- Select a cryptocurrency
- Generate AI-powered analysis:
  - Current trend (Bullish/Bearish/Neutral)
  - Support/resistance levels
  - Technical indicators (RSI, MA7, MA30)
  - 24-72h prediction
  - Confidence score
- Analysis saved to database for history
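The RSI indicator shown here follows the standard formula `RSI = 100 - 100 / (1 + RS)`, where RS is the ratio of average gains to average losses over a lookback window. A minimal sketch using simple (non-smoothed) averages; the app's own implementation may differ:

```python
def rsi(prices, period=14):
    """Relative Strength Index over the last `period` price changes (simple averages)."""
    if len(prices) < period + 1:
        raise ValueError("need at least period + 1 prices")
    # Differences between consecutive prices over the lookback window
    window = prices[-(period + 1):]
    changes = [b - a for a, b in zip(window, window[1:])]
    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period
    if avg_loss == 0:
        return 100.0  # no losses in the window: maximally overbought
    rs = avg_gain / avg_loss
    return 100 - 100 / (1 + rs)
```

Values above 70 are conventionally read as overbought and below 30 as oversold.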
### Tab 5: Database Explorer

- Pre-built SQL queries:
  - Top 10 gainers in the last 24h
  - All positive-sentiment news
  - Price history for any coin
  - Database statistics
- Custom SQL query support (read-only for security)
- Export results to CSV
### Tab 6: Data Sources Status

- Real-time status monitoring for:
  - CoinGecko API
  - CoinCap API
  - Binance API
  - RSS feeds (5 sources)
  - Reddit endpoints (4 subreddits)
  - Database connection
- Shows: Status (green/red), Last Update, Error Count
- Manual refresh and data collection controls
- Error log viewer
## Database Schema

### `prices` Table

- `id`: Primary key
- `symbol`: Coin symbol (e.g., "bitcoin")
- `name`: Full name (e.g., "Bitcoin")
- `price_usd`: Current price in USD
- `volume_24h`: 24-hour trading volume
- `market_cap`: Market capitalization
- `percent_change_1h`, `percent_change_24h`, `percent_change_7d`: Price changes
- `rank`: Market cap rank
- `timestamp`: Record timestamp

### `news` Table

- `id`: Primary key
- `title`: News article title
- `summary`: AI-generated summary
- `url`: Article URL (unique)
- `source`: Source name (e.g., "CoinDesk")
- `sentiment_score`: Float (-1 to 1)
- `sentiment_label`: Label (positive/negative/neutral)
- `related_coins`: JSON array of coin symbols
- `published_date`: Original publication date
- `timestamp`: Record timestamp

### `market_analysis` Table

- `id`: Primary key
- `symbol`: Coin symbol
- `timeframe`: Analysis period
- `trend`: Trend direction (Bullish/Bearish/Neutral)
- `support_level`, `resistance_level`: Price levels
- `prediction`: Text prediction
- `confidence`: Confidence score (0-1)
- `timestamp`: Analysis timestamp

### `user_queries` Table

- `id`: Primary key
- `query`: SQL query or search term
- `result_count`: Number of results
- `timestamp`: Query timestamp
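As one illustration, the `prices` table could be created with DDL along these lines (column types are inferred from the field descriptions above; the authoritative schema lives in `database.py`):

```python
import sqlite3

# Hypothetical DDL matching the documented `prices` fields; "rank" is quoted
# because it is also an SQL keyword.
PRICES_DDL = """
CREATE TABLE IF NOT EXISTS prices (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    symbol TEXT NOT NULL,
    name TEXT,
    price_usd REAL,
    volume_24h REAL,
    market_cap REAL,
    percent_change_1h REAL,
    percent_change_24h REAL,
    percent_change_7d REAL,
    "rank" INTEGER,
    timestamp TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX IF NOT EXISTS idx_prices_symbol_ts ON prices (symbol, timestamp);
"""

conn = sqlite3.connect(":memory:")  # in-memory DB for illustration only
conn.executescript(PRICES_DDL)
conn.execute(
    'INSERT INTO prices (symbol, name, price_usd, "rank") VALUES (?, ?, ?, ?)',
    ("bitcoin", "Bitcoin", 50000.0, 1),
)
row = conn.execute("SELECT symbol, price_usd FROM prices").fetchone()
```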
## Configuration

Edit `config.py` to customize:

```python
# Data collection intervals
COLLECTION_INTERVALS = {
    "price_data": 300,       # 5 minutes
    "news_data": 1800,       # 30 minutes
    "sentiment_data": 1800,  # 30 minutes
}

# Number of coins to track
TOP_COINS_LIMIT = 100

# Gradio settings
GRADIO_SERVER_PORT = 7860
AUTO_REFRESH_INTERVAL = 30  # seconds

# Cache settings
CACHE_TTL = 300  # 5 minutes
CACHE_MAX_SIZE = 1000

# Logging
LOG_LEVEL = "INFO"
LOG_FILE = "logs/crypto_aggregator.log"
```
## API Usage Examples

### Collect Data Manually

```python
from collectors import collect_price_data, collect_news_data

# Collect latest prices
success, count = collect_price_data()
print(f"Collected {count} prices")

# Collect news
count = collect_news_data()
print(f"Collected {count} articles")
```
### Query Database

```python
from database import get_database

db = get_database()

# Get latest prices
prices = db.get_latest_prices(limit=10)

# Get news by coin
news = db.get_news_by_coin("bitcoin", limit=5)

# Get top gainers
gainers = db.get_top_gainers(limit=10)
```
### AI Analysis

```python
from ai_models import analyze_sentiment, analyze_market_trend
from database import get_database

# Analyze sentiment
result = analyze_sentiment("Bitcoin hits new all-time high!")
print(result)  # {'label': 'positive', 'score': 0.95, 'confidence': 0.92}

# Analyze market trend
db = get_database()
history = db.get_price_history("bitcoin", hours=168)  # last 7 days
analysis = analyze_market_trend(history)
print(analysis)  # {'trend': 'Bullish', 'support_level': 50000, ...}
```
## Error Handling & Resilience

### Fallback Mechanisms

- If CoinGecko fails → CoinCap is used
- If both APIs fail → cached database data is used
- If AI models fail to load → keyword-based sentiment analysis is used
- All network requests have timeout and retry logic
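The source-fallback chain above can be sketched as trying each fetcher in order until one succeeds (function names here are illustrative placeholders, not the actual `collectors.py` API):

```python
import logging

logger = logging.getLogger("crypto_aggregator")

def fetch_with_fallback(fetchers):
    """Try each (name, fetch_fn) pair in order; return the first non-empty result."""
    for name, fetch in fetchers:
        try:
            data = fetch()
            if data:
                return name, data
        except Exception as exc:
            # A failed source is logged as a warning, then the next one is tried
            logger.warning("source %s failed: %s", name, exc)
    raise RuntimeError("all data sources failed")

# Illustrative usage: the primary source fails, the backup succeeds
def coingecko():
    raise ConnectionError("timeout")

def coincap():
    return {"bitcoin": 50000.0}

source, prices = fetch_with_fallback([("coingecko", coingecko), ("coincap", coincap)])
```

The same pattern extends naturally to a final "read from cached database data" step as the last entry in the list.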
### Data Validation

- Price bounds checking (`MIN_PRICE` to `MAX_PRICE`)
- Volume and market cap validation
- Duplicate prevention (unique URLs for news)
- SQL injection prevention (read-only queries only)
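Two of those checks, sketched minimally (the bound values and function names are assumptions for illustration; the real ones live in `config.py` and `utils.py`):

```python
MIN_PRICE = 1e-12  # assumed bounds for illustration
MAX_PRICE = 1e9

def is_valid_price(price):
    """Accept only numbers inside the configured price bounds (NaN/inf fail the comparison)."""
    return isinstance(price, (int, float)) and MIN_PRICE <= price <= MAX_PRICE

FORBIDDEN = ("insert", "update", "delete", "drop", "alter", "create", "pragma", "attach")

def is_read_only_query(sql):
    """Allow only single SELECT statements, e.g. for the Database Explorer tab."""
    stripped = sql.strip().rstrip(";").lower()
    if ";" in stripped:  # reject multi-statement input
        return False
    if not stripped.startswith("select"):
        return False
    return not any(word in stripped.split() for word in FORBIDDEN)
```

A keyword blocklist like this is a coarse guard; running the explorer's connection in SQLite read-only mode (`file:...?mode=ro`) is the sturdier complement.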
### Logging

All operations are logged to `logs/crypto_aggregator.log`:

- **Info**: Successful operations, data collection
- **Warning**: API failures, retries
- **Error**: Database errors, critical failures
## Performance Optimization

- **Async/Await**: All network requests use aiohttp
- **Connection Pooling**: Reused HTTP connections
- **Caching**: In-memory cache with 5-minute TTL
- **Batch Inserts**: Minimum 100 records per database insert
- **Indexed Queries**: Database indexes on symbol, timestamp, sentiment
- **Lazy Loading**: AI models load only when first used
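The in-memory TTL cache could look like this minimal stand-in (the actual implementation in `utils.py` may differ):

```python
import time

class TTLCache:
    """Tiny time-to-live cache; entries expire after `ttl` seconds."""

    def __init__(self, ttl=300, max_size=1000):
        self.ttl = ttl
        self.max_size = max_size
        self._store = {}  # key -> (expires_at, value)

    def set(self, key, value):
        if key not in self._store and len(self._store) >= self.max_size:
            # Evict the entry closest to expiry to make room
            oldest = min(self._store, key=lambda k: self._store[k][0])
            del self._store[oldest]
        self._store[key] = (time.monotonic() + self.ttl, value)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # lazily drop expired entries on read
            return default
        return value
```

Using `time.monotonic()` rather than `time.time()` keeps expiry correct even if the system clock is adjusted.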
## Troubleshooting

**Issue: Models won't load**
Solution: Ensure you have 4GB+ RAM. Models download on first run (2-3 minutes).

**Issue: No data appearing**
Solution: Wait 5 minutes for the initial data collection, or click the "Refresh" buttons.

**Issue: Port 7860 already in use**
Solution: Change `GRADIO_SERVER_PORT` in `config.py` or kill the existing process.

**Issue: Database locked**
Solution: Only one process can write to SQLite at a time. Close other instances.

**Issue: RSS feeds failing**
Solution: Some feeds may be temporarily down. Check Tab 6 for status.
## Development

### Running Tests

```bash
# Test data collection
python collectors.py

# Test AI models
python ai_models.py

# Test utilities
python utils.py

# Test database
python database.py
```
### Adding New Data Sources

Edit `collectors.py`:

```python
def collect_new_source():
    try:
        response = safe_api_call("https://api.example.com/data")
        # Parse and save data from `response` here
        return True
    except Exception as e:
        logger.error(f"Error: {e}")
        return False
```
Then add it to the scheduler in `collectors.py`:

```python
# In schedule_data_collection()
threading.Timer(interval, collect_new_source).start()
```
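Note that `threading.Timer` fires its callback only once, so a periodic collector has to reschedule itself. One hedged sketch of a self-rescheduling wrapper (`schedule_repeating` is illustrative, not part of `collectors.py`):

```python
import threading

def schedule_repeating(interval, func, stop_event):
    """Run `func` every `interval` seconds until `stop_event` is set."""
    def _run():
        if stop_event.is_set():
            return  # stop rescheduling once cancelled
        func()
        timer = threading.Timer(interval, _run)
        timer.daemon = True  # don't block interpreter shutdown
        timer.start()
    first = threading.Timer(interval, _run)
    first.daemon = True
    first.start()
```

Setting the stop event from the main thread is then enough to wind the schedule down cleanly.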
## Validation Checklist

- All 8 files complete
- No TODO or FIXME comments
- No placeholder functions
- All imports in requirements.txt
- Database schema matches specification
- All 6 Gradio tabs implemented
- All 3 AI models integrated
- All 5+ data sources configured
- Error handling in every network call
- Logging for all major operations
- No API keys in code
- Comments in English
- PEP 8 compliant
## License

MIT License - Free to use, modify, and distribute.
## Support

For issues or questions:

- Check the logs: `logs/crypto_aggregator.log`
- Review error messages in Tab 6 (Data Sources Status)
- Ensure all dependencies are installed: `pip install -r requirements.txt`
## Credits

- **Data Sources**: CoinGecko, CoinCap, Binance, Alternative.me, CoinDesk, Cointelegraph, Reddit
- **AI Models**: HuggingFace (Cardiff NLP, ProsusAI, Facebook)
- **Framework**: Gradio

Made with ❤️ for the Crypto Community