ModernBERT Fine-tuned for Financial Text Sentiment Analysis

This project fine-tunes the ModernBERT model on the FinGPT sentiment dataset for financial text sentiment analysis.

Dataset & Model

ModernBERT

ModernBERT is a modernized bidirectional encoder-only Transformer model (BERT-style) pre-trained on 2 trillion tokens of English and code data with a native context length of up to 8,192 tokens. It leverages architectural improvements such as Rotary Positional Embeddings (RoPE) for long-context support, Local-Global Alternating Attention for efficiency on long inputs, and Unpadding together with Flash Attention for efficient inference.
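
For orientation, the base encoder loads like any BERT-style model through transformers. A minimal sketch; `answerdotai/ModernBERT-base` is the published base checkpoint, and using it (rather than the large variant) as the starting point for this fine-tune is an assumption:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Assumed starting checkpoint: the published ModernBERT base model.
tokenizer = AutoTokenizer.from_pretrained("answerdotai/ModernBERT-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "answerdotai/ModernBERT-base",
    num_labels=9,  # the 9 sentiment categories listed below
)
```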

FinGPT Sentiment Analysis Dataset

The dataset contains 76,772 rows (17,919,695 tokens) of financial text paired with sentiment labels.
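
A hedged sketch of pulling the data with the datasets library; the exact FinGPT split used here is an assumption, with `FinGPT/fingpt-sentiment-train` being one of the published sentiment sets on the Hub:

```python
from datasets import load_dataset

# Assumed source: the FinGPT sentiment set published on the Hugging Face Hub.
dataset = load_dataset("FinGPT/fingpt-sentiment-train", split="train")
print(dataset.num_rows)  # row count
print(dataset[0])        # one example: financial text plus its sentiment label
```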

Sentiment Categories

The model classifies text into 9 fine-grained sentiment levels:

| Label ID | Sentiment Category | Description |
|----------|--------------------|-------------|
| 0 | Strong Negative | Very pessimistic |
| 1 | Moderately Negative | Somewhat pessimistic |
| 2 | Mildly Negative | Slightly pessimistic |
| 3 | Negative | General negative sentiment |
| 4 | Neutral | No clear positive or negative bias |
| 5 | Mildly Positive | Slightly optimistic |
| 6 | Moderately Positive | Somewhat optimistic |
| 7 | Positive | General positive sentiment |
| 8 | Strong Positive | Very optimistic |
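
To keep predictions human-readable, these categories can be registered on the model config. A minimal sketch mirroring the table above:

```python
# Label IDs to category names, exactly as in the table above.
id2label = {
    0: "Strong Negative",
    1: "Moderately Negative",
    2: "Mildly Negative",
    3: "Negative",
    4: "Neutral",
    5: "Mildly Positive",
    6: "Moderately Positive",
    7: "Positive",
    8: "Strong Positive",
}
label2id = {name: i for i, name in id2label.items()}
# Passing id2label=id2label, label2id=label2id to from_pretrained() makes
# pipeline() return category names instead of "LABEL_0".."LABEL_8".
```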

Model Configuration

Parameters

  • Max Sequence Length: 512 tokens
  • Batch Size: 16
  • Learning Rate: 2e-5 with warmup
  • Epochs: 3 with early stopping
  • Optimizer: AdamW with weight decay (0.01)
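
These map directly onto Hugging Face `TrainingArguments`. A sketch under two assumptions: evaluation ran every 500 steps (consistent with the log further down), and the warmup ratio is an illustrative guess since the card only says "with warmup":

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="modernbert-fingpt",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    learning_rate=2e-5,
    warmup_ratio=0.1,              # "with warmup": exact schedule is assumed
    weight_decay=0.01,             # AdamW weight decay from the list above
    fp16=True,                     # mixed-precision training
    eval_strategy="steps",         # "evaluation_strategy" on older transformers
    eval_steps=500,                # matches the 500-step cadence in the log
    save_strategy="steps",
    save_steps=500,
    load_best_model_at_end=True,   # best-checkpoint loading
)
```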

Features

  • Early Stopping: Prevents overfitting (patience=3)
  • Best Model Loading: Automatically loads best checkpoint
  • Mixed Precision: FP16 training for speed optimization
  • Stratified Splitting: 80/20 train/validation split
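
Early stopping and the stratified split can be reproduced as below, continuing the sketches above. It assumes the dataset has already been tokenized and carries a `datasets.ClassLabel` column named `label`:

```python
from transformers import Trainer, EarlyStoppingCallback

# Stratified 80/20 split: each sentiment class keeps its proportion in
# both partitions (requires "label" to be a ClassLabel feature).
splits = dataset.train_test_split(
    test_size=0.2,
    stratify_by_column="label",
    seed=42,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=splits["train"],
    eval_dataset=splits["test"],
    compute_metrics=compute_metrics,  # defined under Evaluation Metrics below
    # Stop if validation performance fails to improve for 3 consecutive evals.
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
trainer.train()
```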

Evaluation Metrics

  • Accuracy: Overall classification accuracy
  • F1-Score: Weighted F1-score across all classes
  • Precision: Weighted precision
  • Recall: Weighted recall
  • Confusion Matrix: Visual analysis of classification performance
  • Classification Report: Detailed per-class metrics
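
A typical `compute_metrics` implementation that produces the weighted scores reported below; a sketch using scikit-learn:

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    """Weighted accuracy, F1, precision, and recall over all 9 classes."""
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="weighted", zero_division=0
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1": f1,
        "precision": precision,
        "recall": recall,
    }
```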

Performance

Training Time (on T4 GPU)

  • Total Training: ~30-45 minutes
  • Per Epoch: ~10-15 minutes
  • Evaluation: ~2-3 minutes

Training Results

Final metrics from the last logged evaluation step:

  • Validation Loss: 0.3741
  • Accuracy: 0.9043
  • F1 (weighted): 0.9026
  • Precision (weighted): 0.9022
  • Recall (weighted): 0.9043

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 | Precision | Recall |
|---------------|-------|------|-----------------|----------|------|-----------|--------|
| 0.9551 | 0.1302 | 500 | 0.8504 | 0.6769 | 0.6623 | 0.6589 | 0.6769 |
| 0.6639 | 0.2605 | 1000 | 0.7921 | 0.7162 | 0.6952 | 0.7444 | 0.7162 |
| 0.5221 | 0.3907 | 1500 | 0.5066 | 0.8134 | 0.8083 | 0.8147 | 0.8134 |
| 0.4415 | 0.5210 | 2000 | 0.4247 | 0.8381 | 0.8363 | 0.8410 | 0.8381 |
| 0.4276 | 0.6512 | 2500 | 0.3884 | 0.8594 | 0.8486 | 0.8484 | 0.8594 |
| 0.3767 | 0.7815 | 3000 | 0.3472 | 0.8756 | 0.8661 | 0.8689 | 0.8756 |
| 0.3281 | 0.9117 | 3500 | 0.3463 | 0.8754 | 0.8631 | 0.8611 | 0.8754 |
| 0.2419 | 1.0419 | 4000 | 0.3556 | 0.8883 | 0.8737 | 0.8728 | 0.8883 |
| 0.2859 | 1.1722 | 4500 | 0.3162 | 0.8922 | 0.8859 | 0.8829 | 0.8922 |
| 0.226 | 1.3024 | 5000 | 0.3269 | 0.8914 | 0.8857 | 0.8851 | 0.8914 |
| 0.2378 | 1.4327 | 5500 | 0.3281 | 0.8903 | 0.8834 | 0.8881 | 0.8903 |
| 0.2654 | 1.5629 | 6000 | 0.3038 | 0.8938 | 0.8862 | 0.8896 | 0.8938 |
| 0.2319 | 1.6931 | 6500 | 0.3032 | 0.8993 | 0.8919 | 0.8905 | 0.8993 |
| 0.2116 | 1.8234 | 7000 | 0.3013 | 0.9023 | 0.8919 | 0.8937 | 0.9023 |
| 0.1922 | 1.9536 | 7500 | 0.2959 | 0.9017 | 0.8968 | 0.8941 | 0.9017 |
| 0.1536 | 2.0839 | 8000 | 0.3983 | 0.9009 | 0.8986 | 0.9000 | 0.9009 |
| 0.1438 | 2.2141 | 8500 | 0.3982 | 0.8990 | 0.8968 | 0.8954 | 0.8990 |
| 0.1329 | 2.3444 | 9000 | 0.3809 | 0.9021 | 0.8990 | 0.8968 | 0.9021 |
| 0.1175 | 2.4746 | 9500 | 0.3944 | 0.9019 | 0.8991 | 0.8977 | 0.9019 |
| 0.1634 | 2.6048 | 10000 | 0.3899 | 0.9043 | 0.8999 | 0.8989 | 0.9043 |
| 0.1049 | 2.7351 | 10500 | 0.4006 | 0.9037 | 0.9016 | 0.9009 | 0.9037 |
| 0.1247 | 2.8653 | 11000 | 0.3828 | 0.9053 | 0.9019 | 0.9006 | 0.9053 |
| 0.1511 | 2.9956 | 11500 | 0.3741 | 0.9043 | 0.9026 | 0.9022 | 0.9043 |

Deployment Options

  • API Deployment: Create a REST API using FastAPI (see the sketch after this list)
  • Batch Processing: Set up automated sentiment analysis pipeline
  • Real-time Analysis: Integrate with financial data streams
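
For the API option, a minimal FastAPI sketch around the published checkpoint; the endpoint path and request schema are illustrative assumptions:

```python
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import pipeline

app = FastAPI()

# Load the fine-tuned checkpoint once at startup.
classifier = pipeline("text-classification", model="tsphua/modernbert-fingpt")

class SentimentRequest(BaseModel):
    text: str

@app.post("/sentiment")
def sentiment(req: SentimentRequest):
    # Truncate to the 512-token limit used during training.
    result = classifier(req.text, truncation=True, max_length=512)[0]
    return result  # e.g. {"label": "Mildly Positive", "score": 0.93}
```

Saved as app.py, this can be served with `uvicorn app:app`.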
