ModernBERT Fine-tuned for Financial Text Sentiment Analysis
This project fine-tunes the ModernBERT model on the FinGPT sentiment dataset for financial text sentiment analysis.
Dataset & Model
- Model: answerdotai/ModernBERT-base
- Dataset: FinGPT/fingpt-sentiment-train
- Task: Multi-class sentiment classification (9 categories)
- Domain: Financial text analysis
ModernBERT
ModernBERT is a modernized bidirectional encoder-only Transformer model (BERT-style) pre-trained on 2 trillion tokens of English and code data with a native context length of up to 8,192 tokens. It leverages architectural improvements such as Rotary Positional Embeddings (RoPE) for long-context support, Local-Global Alternating Attention for efficiency on long inputs, and Unpadding with Flash Attention for efficient inference.
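As a rough sketch of how this base encoder can be adapted to the task, the snippet below loads answerdotai/ModernBERT-base with a fresh 9-way classification head via Hugging Face transformers; the project's actual loading code is not reproduced here, so treat the details as assumptions.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

base_model = "answerdotai/ModernBERT-base"

# Tokenizer and encoder weights come from the pre-trained checkpoint; the
# 9-label classification head is newly initialized and learned during fine-tuning.
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForSequenceClassification.from_pretrained(base_model, num_labels=9)
```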
FinGPT Sentiment Analysis Dataset
The dataset contains 76,772 rows (17,919,695 tokens) of labeled financial text.
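Loading the data with the datasets library might look like the sketch below; the split name and the fields mentioned in the comments are assumptions based on the dataset card rather than this project's code.

```python
from datasets import load_dataset

# FinGPT sentiment data; the split name "train" is assumed.
dataset = load_dataset("FinGPT/fingpt-sentiment-train", split="train")

print(len(dataset))   # expected: 76,772 rows
print(dataset[0])     # one example: the input sentence and its sentiment label
```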
Sentiment Categories
The model classifies text into 9 fine-grained sentiment levels:
| Label ID | Sentiment Category | Description |
|---|---|---|
| 0 | Strong Negative | Very pessimistic |
| 1 | Moderately Negative | Somewhat pessimistic |
| 2 | Mildly Negative | Slightly pessimistic |
| 3 | Negative | General negative sentiment |
| 4 | Neutral | No clear positive or negative bias |
| 5 | Mildly Positive | Slightly optimistic |
| 6 | Moderately Positive | Somewhat optimistic |
| 7 | Positive | General positive sentiment |
| 8 | Strong Positive | Very optimistic |
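In code, this table corresponds to an id2label / label2id mapping stored in the model config. The sketch below mirrors the table; the exact label strings used by the fine-tuned checkpoint are an assumption.

```python
# Hypothetical mapping mirroring the table above.
id2label = {
    0: "strong negative",
    1: "moderately negative",
    2: "mildly negative",
    3: "negative",
    4: "neutral",
    5: "mildly positive",
    6: "moderately positive",
    7: "positive",
    8: "strong positive",
}
label2id = {label: i for i, label in id2label.items()}

# These can be passed to from_pretrained(..., id2label=id2label, label2id=label2id)
# so that predictions are returned with readable class names.
```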
Model Configuration
Parameters
- Max Sequence Length: 512 tokens
- Batch Size: 16
- Learning Rate: 2e-5 with warmup
- Epochs: 3 with early stopping
- Optimizer: AdamW with weight decay (0.01)
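Expressed as Hugging Face TrainingArguments, the hyperparameters above might look like the following sketch; the output directory, warmup ratio, evaluation interval, and best-model metric are assumptions, since the actual training script is not shown.

```python
from transformers import TrainingArguments

# Hypothetical configuration mirroring the hyperparameters listed above.
# Note: the 512-token max sequence length is applied at tokenization time, not here.
training_args = TrainingArguments(
    output_dir="modernbert-fingpt",       # assumed output path
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    learning_rate=2e-5,
    warmup_ratio=0.1,                     # "with warmup"; exact schedule assumed
    num_train_epochs=3,
    weight_decay=0.01,                    # AdamW is the Trainer's default optimizer
    fp16=True,                            # mixed-precision training (see Features)
    eval_strategy="steps",
    eval_steps=500,
    save_strategy="steps",
    save_steps=500,
    load_best_model_at_end=True,
    metric_for_best_model="f1",           # assumed selection metric
)
```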
Features
- Early Stopping: Prevents overfitting (patience=3)
- Best Model Loading: Automatically loads best checkpoint
- Mixed Precision: FP16 training for speed optimization
- Stratified Splitting: 80/20 train/validation split
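One way to wire up the stratified split and early stopping is sketched below, reusing `model` and `training_args` from the earlier snippets and the `compute_metrics` function defined under Evaluation Metrics; the tokenized dataset and its column names are assumptions.

```python
from sklearn.model_selection import train_test_split
from transformers import EarlyStoppingCallback, Trainer

# `tokenized` is the dataset after tokenization (padded/truncated to 512 tokens)
# with an integer "label" column; preprocessing is not shown here.
train_idx, val_idx = train_test_split(
    list(range(len(tokenized))),
    test_size=0.2,                  # 80/20 split
    stratify=tokenized["label"],    # keep class proportions equal in both sets
    random_state=42,
)
train_dataset = tokenized.select(train_idx)
val_dataset = tokenized.select(val_idx)

trainer = Trainer(
    model=model,                      # from the ModernBERT sketch above
    args=training_args,               # from the Parameters sketch above
    train_dataset=train_dataset,
    eval_dataset=val_dataset,
    compute_metrics=compute_metrics,  # defined under Evaluation Metrics below
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
trainer.train()   # the best checkpoint is reloaded automatically at the end
```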
Evaluation Metrics
- Accuracy: Overall classification accuracy
- F1-Score: Weighted F1-score across all classes
- Precision: Weighted precision
- Recall: Weighted recall
- Confusion Matrix: Visual analysis of classification performance
- Classification Report: Detailed per-class metrics
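A compute_metrics function along these lines would produce the weighted scores listed above; it is a scikit-learn sketch, not necessarily the exact implementation used in this project.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    classification_report,
    confusion_matrix,
    precision_recall_fscore_support,
)

def compute_metrics(eval_pred):
    """Weighted accuracy/F1/precision/recall, as reported by the Trainer."""
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="weighted", zero_division=0
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1": f1,
        "precision": precision,
        "recall": recall,
    }

def full_report(y_true, y_pred):
    """Confusion matrix and detailed per-class metrics for post-hoc analysis."""
    print(confusion_matrix(y_true, y_pred))
    print(classification_report(y_true, y_pred, digits=4))
```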
Performance
Training Time (on T4 GPU)
- Total Training: ~30-45 minutes
- Per Epoch: ~10-15 minutes
- Evaluation: ~2-3 minutes
Final Validation Results
- Validation Loss: 0.3741
- Accuracy: 0.9043
- F1 (weighted): 0.9026
- Precision (weighted): 0.9022
- Recall (weighted): 0.9043
These figures correspond to the final evaluation step (11,500) in the table below.
| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 | Precision | Recall |
|---|---|---|---|---|---|---|---|
| 0.9551 | 0.1302 | 500 | 0.8504 | 0.6769 | 0.6623 | 0.6589 | 0.6769 |
| 0.6639 | 0.2605 | 1000 | 0.7921 | 0.7162 | 0.6952 | 0.7444 | 0.7162 |
| 0.5221 | 0.3907 | 1500 | 0.5066 | 0.8134 | 0.8083 | 0.8147 | 0.8134 |
| 0.4415 | 0.5210 | 2000 | 0.4247 | 0.8381 | 0.8363 | 0.8410 | 0.8381 |
| 0.4276 | 0.6512 | 2500 | 0.3884 | 0.8594 | 0.8486 | 0.8484 | 0.8594 |
| 0.3767 | 0.7815 | 3000 | 0.3472 | 0.8756 | 0.8661 | 0.8689 | 0.8756 |
| 0.3281 | 0.9117 | 3500 | 0.3463 | 0.8754 | 0.8631 | 0.8611 | 0.8754 |
| 0.2419 | 1.0419 | 4000 | 0.3556 | 0.8883 | 0.8737 | 0.8728 | 0.8883 |
| 0.2859 | 1.1722 | 4500 | 0.3162 | 0.8922 | 0.8859 | 0.8829 | 0.8922 |
| 0.226 | 1.3024 | 5000 | 0.3269 | 0.8914 | 0.8857 | 0.8851 | 0.8914 |
| 0.2378 | 1.4327 | 5500 | 0.3281 | 0.8903 | 0.8834 | 0.8881 | 0.8903 |
| 0.2654 | 1.5629 | 6000 | 0.3038 | 0.8938 | 0.8862 | 0.8896 | 0.8938 |
| 0.2319 | 1.6931 | 6500 | 0.3032 | 0.8993 | 0.8919 | 0.8905 | 0.8993 |
| 0.2116 | 1.8234 | 7000 | 0.3013 | 0.9023 | 0.8919 | 0.8937 | 0.9023 |
| 0.1922 | 1.9536 | 7500 | 0.2959 | 0.9017 | 0.8968 | 0.8941 | 0.9017 |
| 0.1536 | 2.0839 | 8000 | 0.3983 | 0.9009 | 0.8986 | 0.9000 | 0.9009 |
| 0.1438 | 2.2141 | 8500 | 0.3982 | 0.8990 | 0.8968 | 0.8954 | 0.8990 |
| 0.1329 | 2.3444 | 9000 | 0.3809 | 0.9021 | 0.8990 | 0.8968 | 0.9021 |
| 0.1175 | 2.4746 | 9500 | 0.3944 | 0.9019 | 0.8991 | 0.8977 | 0.9019 |
| 0.1634 | 2.6048 | 10000 | 0.3899 | 0.9043 | 0.8999 | 0.8989 | 0.9043 |
| 0.1049 | 2.7351 | 10500 | 0.4006 | 0.9037 | 0.9016 | 0.9009 | 0.9037 |
| 0.1247 | 2.8653 | 11000 | 0.3828 | 0.9053 | 0.9019 | 0.9006 | 0.9053 |
| 0.1511 | 2.9956 | 11500 | 0.3741 | 0.9043 | 0.9026 | 0.9022 | 0.9043 |
Deployment Options
- API Deployment: Create a REST API using FastAPI (see the sketch after this list)
- Batch Processing: Set up automated sentiment analysis pipeline
- Real-time Analysis: Integrate with financial data streams
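As a concrete illustration of the first option, here is a minimal FastAPI sketch that serves the fine-tuned checkpoint through a transformers pipeline; the endpoint path, request schema, and returned label strings are assumptions.

```python
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import pipeline

app = FastAPI(title="Financial Sentiment API")

# Load the fine-tuned checkpoint once at startup.
classifier = pipeline("text-classification", model="tsphua/modernbert-fingpt")

class SentimentRequest(BaseModel):
    text: str

@app.post("/sentiment")
def predict_sentiment(req: SentimentRequest):
    # Truncate long financial documents to the 512-token training length.
    result = classifier(req.text, truncation=True, max_length=512)[0]
    # The label string depends on the id2label mapping stored in the checkpoint.
    return {"label": result["label"], "score": float(result["score"])}

# Run locally with:  uvicorn app:app --reload
```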