# Hugging Face Integration - Complete

## Changes Made

### 1. AI Models - Ensemble Sentiment (`ai_models.py`)

**Model Catalog:**

- ✅ Crypto Sentiment: ElKulako/cryptobert, kk08/CryptoBERT, burakutf/finetuned-finbert-crypto, mathugo/crypto_news_bert
- ✅ Social Sentiment: svalabs/twitter-xlm-roberta-bitcoin-sentiment, mayurjadhav/crypto-sentiment-model
- ✅ Financial Sentiment: ProsusAI/finbert, cardiffnlp/twitter-roberta-base-sentiment
- ✅ News Sentiment: mrm8488/distilroberta-finetuned-financial-news-sentiment-analysis
- ✅ Decision Models: agarkovv/CryptoTrader-LM

**Ensemble Sentiment:**

- `ensemble_crypto_sentiment(text)` runs several models on the same text for sentiment analysis
- Majority voting determines the final label
- Confidence is scored from the mean of the model scores

### 2. HF Registry - Dataset Catalog (`backend/services/hf_registry.py`)

**Curated Datasets:**

- **Price/OHLCV**: 7 datasets (Bitcoin, Ethereum, XRP price data)
- **News Raw**: 2 datasets (crypto news headlines)
- **News Labeled**: 5 datasets (news with sentiment/impact labels)

**Features:**

- Category-based organization
- Automatic refresh from HF Hub
- Metadata (likes, downloads, tags)

### 3. Unified Server - Complete API (`hf_unified_server.py`)

**New Endpoints:**

**Health & Status:**

- `GET /api/health` - Dashboard health check

**Market Data:**

- `GET /api/coins/top?limit=10` - Top coins by market cap
- `GET /api/coins/{symbol}` - Coin details
- `GET /api/market/stats` - Global market stats
- `GET /api/charts/price/{symbol}?timeframe=7d` - Price chart
- `POST /api/charts/analyze` - Chart analysis with AI

**News & AI:**

- `GET /api/news/latest?limit=40` - News with sentiment
- `POST /api/news/summarize` - Summarize article
- `POST /api/sentiment/analyze` - Sentiment analysis
- `POST /api/query` - Natural language query

**Datasets & Models:**

- `GET /api/datasets/list` - Available datasets
- `GET /api/datasets/sample?name=...` - Dataset sample
- `GET /api/models/list` - Available models
- `POST /api/models/test` - Test model

**Real-time:**

- `WS /ws` - WebSocket for live updates (market + news + sentiment)

### 4. Frontend Compatibility

**admin.html + static/js/**

- ✅ All required endpoints are implemented
- ✅ WebSocket support
- ✅ Sentiment comes from the ensemble models
- ✅ Real-time updates every 10 seconds

## Usage

### Docker (HuggingFace Space)

```bash
docker build -t crypto-hf .
docker run -p 7860:7860 -e HF_TOKEN=your_token crypto-hf
```

### Direct

```bash
pip install -r requirements.txt
export HF_TOKEN=your_token
uvicorn hf_unified_server:app --host 0.0.0.0 --port 7860
```

### Testing

```bash
# Health check
curl http://localhost:7860/api/health

# Top coins (URL quoted so the shell does not interpret "?")
curl "http://localhost:7860/api/coins/top?limit=10"

# Sentiment analysis
curl -X POST http://localhost:7860/api/sentiment/analyze \
  -H "Content-Type: application/json" \
  -d '{"text": "Bitcoin price surging to new heights!"}'

# Models list
curl http://localhost:7860/api/models/list

# Datasets list
curl http://localhost:7860/api/datasets/list
```

## Model Usage

Ensemble sentiment in action:

```python
from ai_models import ensemble_crypto_sentiment

result = ensemble_crypto_sentiment("Bitcoin breaking resistance!")
# Example result:
# {
#   "label": "bullish",
#   "confidence": 0.87,
#   "scores": {
#     "ElKulako/cryptobert": {"label": "bullish", "score": 0.92},
#     "kk08/CryptoBERT": {"label": "bullish", "score": 0.82}
#   },
#   "model_count": 2
# }
```

## Dependencies

requirements.txt includes:

- transformers>=4.36.0
- datasets>=2.16.0
- huggingface-hub>=0.19.0
- torch>=2.0.0

## Environment Variables

```env
# Needed for private models
HF_TOKEN=hf_your_token_here
```

## Test Checklist

- [x] `/api/health` - Status OK
- [x] `/api/coins/top` - Top 10 coins
- [x] `/api/market/stats` - Market data
- [x] `/api/news/latest` - News with sentiment
- [x] `/api/sentiment/analyze` - Ensemble working
- [x] `/api/models/list` - 10+ models listed
- [x] `/api/datasets/list` - 14+ datasets listed
- [x] `/ws` - WebSocket live updates
- [x] Dashboard UI - All tabs working

## Notes

- Models are lazy-loaded: each one is loaded the first time it is used
- Ensemble sentiment uses only 2-3 models per request, to keep it fast
- Dataset sampling requires authentication for some datasets
- The CryptoTrader-LM model is large (7B) and should only be run with a GPU

## Support

All endpoints from the requirements document are implemented and tested. Frontend (admin.html) works without 404/403 errors.
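
The ensemble is described as picking its label by majority vote and deriving confidence from the mean of the model scores, but the aggregation step itself is not shown. The following is a minimal sketch of that idea, not the code in `ai_models.py`. The function name `majority_vote`, the tie-break rule, and the choice to average only the scores of models that voted for the winning label are assumptions made for illustration.

```python
from collections import Counter
from statistics import mean


def majority_vote(predictions):
    """Combine per-model predictions into a single ensemble result.

    `predictions` maps a model name to {"label": str, "score": float},
    mirroring the "scores" field of the example output in this README.
    """
    if not predictions:
        return {"label": "unknown", "confidence": 0.0, "scores": {}, "model_count": 0}

    votes = Counter(p["label"] for p in predictions.values())
    top_count = max(votes.values())
    candidates = [label for label, n in votes.items() if n == top_count]

    def support(label):
        # Mean score of the models that voted for `label` (assumption).
        return mean(p["score"] for p in predictions.values() if p["label"] == label)

    # Tie-break (assumption): among equally voted labels, prefer the one
    # whose supporting models are most confident on average.
    winner = max(candidates, key=support)

    return {
        "label": winner,
        "confidence": round(support(winner), 4),
        "scores": predictions,
        "model_count": len(predictions),
    }


example = majority_vote({
    "ElKulako/cryptobert": {"label": "bullish", "score": 0.92},
    "kk08/CryptoBERT": {"label": "bullish", "score": 0.82},
})
print(example["label"], example["confidence"])
```

With the two predictions from the Model Usage example, both models agree, so the label is `bullish` and the confidence is the mean of 0.92 and 0.82. Averaging over all models instead of only the winners would be an equally plausible reading of "mean of the scores" and gives the same number in this case.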
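
The notes say models are lazy-loaded on first use. One common way to get that behaviour is to cache the loader function, sketched below. This is an illustration only: `get_model`, `_load_model`, and `LOAD_LOG` are hypothetical names, and the stand-in loader returns a dummy callable so the sketch runs without downloading anything.

```python
from functools import lru_cache

LOAD_LOG = []  # records which models were actually loaded, for illustration


def _load_model(name):
    # Stand-in for an expensive load. With transformers installed, a real
    # loader could be something like:
    #   from transformers import pipeline
    #   return pipeline("text-classification", model=name)
    LOAD_LOG.append(name)
    return lambda text: {"model": name, "text": text}


@lru_cache(maxsize=None)
def get_model(name):
    """Load `name` on the first request and reuse the cached instance afterwards."""
    return _load_model(name)


first = get_model("ElKulako/cryptobert")
second = get_model("ElKulako/cryptobert")  # served from the cache, no second load
print(first is second, LOAD_LOG)
```

Nothing is loaded at import time, so server start-up stays fast and a model that is never requested (for example the 7B CryptoTrader-LM) never occupies memory.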
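
For callers that prefer Python over curl, the sentiment request from the testing commands can be reproduced with the standard library alone. This is a sketch under assumptions: the helper names `build_sentiment_request` and `analyze_sentiment` are hypothetical, the base URL is taken from the run commands, and the response is assumed to be a JSON object.

```python
import json
import urllib.request

BASE_URL = "http://localhost:7860"  # host and port from the run commands


def build_sentiment_request(text, base_url=BASE_URL):
    """Build the POST request for /api/sentiment/analyze without sending it."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/sentiment/analyze",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze_sentiment(text, base_url=BASE_URL, timeout=30):
    """Send the request and decode the JSON body (needs a running server)."""
    request = build_sentiment_request(text, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


req = build_sentiment_request("Bitcoin price surging to new heights!")
print(req.get_method(), req.full_url)
```

Splitting request construction from sending keeps the first half checkable offline. `analyze_sentiment` is only meaningful once the server from the usage section is running.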