
Hugging Face Integration - Complete

Changes Made

1. AI Models - Ensemble Sentiment (ai_models.py)

Model Catalog:

  • ✅ Crypto Sentiment: ElKulako/cryptobert, kk08/CryptoBERT, burakutf/finetuned-finbert-crypto, mathugo/crypto_news_bert
  • ✅ Social Sentiment: svalabs/twitter-xlm-roberta-bitcoin-sentiment, mayurjadhav/crypto-sentiment-model
  • ✅ Financial Sentiment: ProsusAI/finbert, cardiffnlp/twitter-roberta-base-sentiment
  • ✅ News Sentiment: mrm8488/distilroberta-finetuned-financial-news-sentiment-analysis
  • ✅ Decision Models: agarkovv/CryptoTrader-LM

Ensemble Sentiment:

  • ensemble_crypto_sentiment(text) - runs several models for sentiment analysis
  • Majority voting determines the final label
  • Confidence scoring is based on the mean of the model scores
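
The voting and averaging steps above can be sketched as follows. This is a minimal illustration, not the code in ai_models.py: the helper name `combine_votes` is hypothetical, the input shape mirrors the `scores` mapping shown under Model Usage, and taking the mean over all model scores (rather than only the models that agree with the winning label) is an assumption.

```python
from collections import Counter
from statistics import mean


def combine_votes(scores: dict[str, dict]) -> dict:
    """Combine per-model predictions by majority vote (hypothetical helper).

    `scores` maps a model id to {"label": str, "score": float}.
    """
    if not scores:
        raise ValueError("at least one model prediction is required")

    # Majority vote: the most common label wins. On a tie, the label that
    # was seen first (the earliest model in the mapping) is returned.
    votes = Counter(pred["label"] for pred in scores.values())
    label, _ = votes.most_common(1)[0]

    # Assumption: confidence is the plain mean of all model scores.
    confidence = mean(pred["score"] for pred in scores.values())

    return {
        "label": label,
        "confidence": round(confidence, 2),
        "scores": scores,
        "model_count": len(scores),
    }
```

With an odd number of models and two labels a tie cannot occur; with 2 models or more than two labels, the first-seen tie-break in this sketch applies.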

2. HF Registry - Dataset Catalog (backend/services/hf_registry.py)

Curated Datasets:

  • Price/OHLCV: 7 datasets (Bitcoin, Ethereum, XRP price data)
  • News Raw: 2 datasets (crypto news headlines)
  • News Labeled: 5 datasets (news with sentiment/impact labels)

Features:

  • Category-based organization
  • Automatic refresh from HF Hub
  • Metadata (likes, downloads, tags)
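
A category-based registry with refreshable metadata could look roughly like the sketch below. It is not the contents of hf_registry.py: the class and field names are hypothetical, and the default fetcher assumes `huggingface_hub.HfApi.dataset_info`, whose result exposes `likes`, `downloads`, and `tags`.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class DatasetEntry:
    repo_id: str
    category: str  # e.g. "price", "news_raw", "news_labeled"
    likes: int = 0
    downloads: int = 0
    tags: list[str] = field(default_factory=list)


def fetch_from_hub(repo_id: str) -> dict:
    """Default fetcher: read metadata from the HF Hub (needs network access)."""
    from huggingface_hub import HfApi  # imported lazily so offline use still works

    info = HfApi().dataset_info(repo_id)
    return {
        "likes": info.likes or 0,
        "downloads": info.downloads or 0,
        "tags": list(info.tags or []),
    }


class DatasetRegistry:
    """Hypothetical registry: datasets grouped by category, metadata refreshable."""

    def __init__(self, fetch: Callable[[str], dict] = fetch_from_hub):
        self._fetch = fetch
        self._entries: dict[str, DatasetEntry] = {}

    def add(self, repo_id: str, category: str) -> None:
        self._entries[repo_id] = DatasetEntry(repo_id, category)

    def by_category(self, category: str) -> list[DatasetEntry]:
        return [e for e in self._entries.values() if e.category == category]

    def refresh(self) -> None:
        """Re-read metadata for every entry; keep stale values on failure."""
        for entry in self._entries.values():
            try:
                meta = self._fetch(entry.repo_id)
            except Exception:
                continue  # one unreachable dataset should not abort the refresh
            entry.likes = meta["likes"]
            entry.downloads = meta["downloads"]
            entry.tags = meta["tags"]
```

Passing the fetcher in as a parameter keeps the registry testable without network access; a scheduled task could call `refresh()` periodically to approximate the automatic refresh.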

3. Unified Server - Complete API (hf_unified_server.py)

New Endpoints:

Health & Status:

  • GET /api/health - Dashboard health check

Market Data:

  • GET /api/coins/top?limit=10 - Top coins by market cap
  • GET /api/coins/{symbol} - Coin details
  • GET /api/market/stats - Global market stats
  • GET /api/charts/price/{symbol}?timeframe=7d - Price chart
  • POST /api/charts/analyze - Chart analysis with AI

News & AI:

  • GET /api/news/latest?limit=40 - News with sentiment
  • POST /api/news/summarize - Summarize article
  • POST /api/sentiment/analyze - Sentiment analysis
  • POST /api/query - Natural language query

Datasets & Models:

  • GET /api/datasets/list - Available datasets
  • GET /api/datasets/sample?name=... - Dataset sample
  • GET /api/models/list - Available models
  • POST /api/models/test - Test model

Real-time:

  • WS /ws - WebSocket for live updates (market + news + sentiment)

4. Frontend Compatibility

admin.html + static/js/

  • ✅ All required endpoints are implemented
  • ✅ WebSocket support
  • ✅ Sentiment from ensemble models
  • ✅ Real-time updates every 10 seconds
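
A periodic push to WebSocket clients every 10 seconds can be sketched with plain asyncio. This is an illustration under stated assumptions, not the loop in hf_unified_server.py: `broadcast_loop` is a hypothetical name, and each client is assumed to expose an async `send_json` method (as Starlette/FastAPI WebSocket objects do).

```python
import asyncio
from typing import Awaitable, Callable


async def broadcast_loop(
    get_snapshot: Callable[[], Awaitable[dict]],
    clients: set,
    stop: asyncio.Event,
    interval: float = 10.0,
) -> None:
    """Push a fresh snapshot to every connected client once per interval."""
    while not stop.is_set():
        snapshot = await get_snapshot()
        for client in list(clients):  # copy: the set may change while sending
            try:
                await client.send_json(snapshot)
            except Exception:
                clients.discard(client)  # drop clients whose connection failed
        try:
            # Sleep for `interval`, but wake immediately when asked to stop.
            await asyncio.wait_for(stop.wait(), timeout=interval)
        except asyncio.TimeoutError:
            pass
```

Waiting on the stop event instead of calling `asyncio.sleep` lets the server shut the loop down without waiting out a full interval.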

Usage

Docker (HuggingFace Space)

docker build -t crypto-hf .
docker run -p 7860:7860 -e HF_TOKEN=your_token crypto-hf

Direct

pip install -r requirements.txt
export HF_TOKEN=your_token
uvicorn hf_unified_server:app --host 0.0.0.0 --port 7860

Testing

# Health check
curl http://localhost:7860/api/health

# Top coins
curl "http://localhost:7860/api/coins/top?limit=10"

# Sentiment analysis
curl -X POST http://localhost:7860/api/sentiment/analyze \
  -H "Content-Type: application/json" \
  -d '{"text": "Bitcoin price surging to new heights!"}'

# Models list
curl http://localhost:7860/api/models/list

# Datasets list
curl http://localhost:7860/api/datasets/list
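
The same checks can be scripted from Python using only the standard library. This is a sketch, assuming the server is running on localhost:7860 as above and that responses are JSON; `build_request`, `call`, and `smoke_test` are hypothetical helper names.

```python
import json
import urllib.request
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "http://localhost:7860"


def build_request(path: str, params: Optional[dict] = None,
                  payload: Optional[dict] = None) -> urllib.request.Request:
    """Build a GET request, or a JSON POST when `payload` is given."""
    url = BASE_URL + path
    if params:
        url += "?" + urlencode(params)
    if payload is None:
        return urllib.request.Request(url, method="GET")
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url, data=body, method="POST",
        headers={"Content-Type": "application/json"},
    )


def call(request: urllib.request.Request, timeout: float = 10.0):
    """Send the request and decode the body (assumed to be JSON)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def smoke_test() -> None:
    """Hit a few endpoints; requires a running server."""
    print(call(build_request("/api/health")))
    print(call(build_request("/api/coins/top", params={"limit": 10})))
    print(call(build_request(
        "/api/sentiment/analyze",
        payload={"text": "Bitcoin price surging to new heights!"},
    )))
```

Call `smoke_test()` once the server is up; `urlopen` raises `urllib.error.HTTPError` on a 4xx/5xx response, so a failing endpoint stops the script at the offending call.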

Model Usage

Ensemble sentiment in action:

from ai_models import ensemble_crypto_sentiment

result = ensemble_crypto_sentiment("Bitcoin breaking resistance!")
# {
#   "label": "bullish",
#   "confidence": 0.87,
#   "scores": {
#     "ElKulako/cryptobert": {"label": "bullish", "score": 0.92},
#     "kk08/CryptoBERT": {"label": "bullish", "score": 0.82}
#   },
#   "model_count": 2
# }

Dependencies

requirements.txt includes:

  • transformers>=4.36.0
  • datasets>=2.16.0
  • huggingface-hub>=0.19.0
  • torch>=2.0.0

Environment Variables

HF_TOKEN=hf_your_token_here  # for private models

Test Checklist

  • /api/health - Status OK
  • /api/coins/top - Top 10 coins
  • /api/market/stats - Market data
  • /api/news/latest - News with sentiment
  • /api/sentiment/analyze - Ensemble working
  • /api/models/list - 10+ models listed
  • /api/datasets/list - 14+ datasets listed
  • /ws - WebSocket live updates
  • Dashboard UI - All tabs working

Notes

  • Models are lazy-loaded (on first use)
  • Ensemble sentiment uses 2-3 models to keep inference fast
  • Dataset sampling requires authentication for some datasets
  • CryptoTrader-LM is a large model (7B); GPU only
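
Lazy loading on first use can be sketched as a small cache that builds each model the first time it is requested. This is not the loader in ai_models.py: `LazyModels` and `load_pipeline` are hypothetical names, and the default loader assumes a `transformers` text-classification pipeline with `HF_TOKEN` read from the environment.

```python
import os
import threading
from typing import Any, Callable


def load_pipeline(model_id: str) -> Any:
    """Default loader (assumption): a transformers text-classification pipeline."""
    from transformers import pipeline  # heavy import deferred until first use

    return pipeline(
        "text-classification",
        model=model_id,
        token=os.environ.get("HF_TOKEN"),  # None is fine for public models
    )


class LazyModels:
    """Hypothetical cache: each model is loaded once, on first request."""

    def __init__(self, loader: Callable[[str], Any] = load_pipeline):
        self._loader = loader
        self._cache: dict[str, Any] = {}
        self._lock = threading.Lock()

    def get(self, model_id: str) -> Any:
        # The lock keeps concurrent requests from loading the same model twice.
        with self._lock:
            if model_id not in self._cache:
                self._cache[model_id] = self._loader(model_id)
            return self._cache[model_id]
```

The first call to `get` for a model pays the download and load cost; later calls return the cached object, which is why the first request to a sentiment endpoint may be noticeably slower than the rest.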

Support

All endpoints from the requirements document are implemented and tested. Frontend (admin.html) works without 404/403 errors.