FinGPT Compliance Agents
A specialized language model for financial compliance and regulatory tasks, fine-tuned for SEC filing analysis, regulatory compliance, sentiment analysis, and XBRL data processing.
Model Details
Model Description
FinGPT Compliance Agents is a LoRA fine-tuned version of Llama-3.2-1B-Instruct, specifically designed for financial compliance and regulatory tasks. The model excels at:
SEC Filings Analysis: Extract insights from SEC filings and their associated XBRL data
Financial Q&A: Answer questions about company filings and financial statements
Sentiment Analysis: Classify the sentiment of financial text
XBRL Processing: Extract tags, values, and construct formulas from XBRL data
Regulatory Compliance: Handle real-time financial data retrieval and analysis
Developed by: SecureFinAI Contest 2025 - Task 2 Team
Model type: Causal Language Model with LoRA adaptation
Language(s) (NLP): English (primary), Russian (audio processing)
License: Apache 2.0
Finetuned from model: meta-llama/Llama-3.2-1B-Instruct
Model Sources
- Repository: GitHub Repository
- Base Model: meta-llama/Llama-3.2-1B-Instruct
- Training Data: FinanceBench, XBRL Analysis, Financial Sentiment datasets
Uses
Direct Use
This model is designed for direct use in financial compliance applications:
- Financial Q&A Systems: Answer questions about company filings and financial data
- Sentiment Analysis: Classify financial news, earnings calls, and market sentiment
- XBRL Data Processing: Extract and analyze structured financial data
- Regulatory Compliance: Process SEC filings and regulatory documents
- Audio Processing: Transcribe and analyze financial audio content
Downstream Use
The model can be further fine-tuned for specific financial domains:
- Banking Compliance: Anti-money laundering, fraud detection
- Insurance: Risk assessment, claims processing
- Investment Analysis: Portfolio management, risk evaluation
- Regulatory Reporting: Automated compliance reporting
Out-of-Scope Use
This model should not be used for:
- Financial advice or investment recommendations
- Legal advice or regulatory interpretation
- High-stakes financial decisions without human oversight
- Non-financial compliance tasks
Bias, Risks, and Limitations
Known Limitations
- Model Size: Limited to 1B parameters, may not capture complex financial relationships
- Training Data: Primarily English financial data, limited multilingual support
- Temporal Scope: Training data may not include recent financial events
- Domain Specificity: Optimized for compliance tasks, not general financial advice
Recommendations
Users should:
- Validate model outputs with domain experts
- Use appropriate guardrails for financial applications (a minimal guardrail sketch follows this list)
- Regularly retrain with updated financial data
- Implement human oversight for critical decisions
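As one minimal illustration of the guardrail recommendation above, a post-processing check can reject generations that fall outside an expected label set and route them to human review. The label set and parsing rule below are assumptions for the sketch, not part of the released model:
# Minimal output-validation guardrail; label set and parsing rule are assumptions
ALLOWED_SENTIMENTS = {"positive", "negative", "neutral"}

def validate_sentiment_output(raw_output: str) -> str:
    """Return a normalized sentiment label, or a flag for human review."""
    words = raw_output.strip().rstrip(".").split()
    label = words[-1].lower() if words else ""
    if label in ALLOWED_SENTIMENTS:
        return label
    # Anything unexpected goes to a human reviewer instead of downstream systems
    return "needs_human_review"

print(validate_sentiment_output("Sentiment: Positive"))   # -> positive
print(validate_sentiment_output("The outlook is mixed"))  # -> needs_human_review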
How to Get Started with the Model
Basic Usage
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch
# Load the base model and apply the LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B-Instruct",
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "QXPS/fingpt-compliance-agents")
tokenizer = AutoTokenizer.from_pretrained("QXPS/fingpt-compliance-agents")

# Generate a response for a given prompt
def generate_response(prompt, max_new_tokens=512):
    # Move inputs to the same device as the model
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            temperature=0.7,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id,
        )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
# Example usage
prompt = "Analyze the sentiment of this financial news: 'Company X reported strong quarterly earnings with 15% revenue growth.'"
response = generate_response(prompt)
print(response)
Financial Q&A
# Financial Q&A example
qa_prompt = """
Question: What was the company's revenue growth in Q3 2023?
Context: The company reported Q3 2023 revenue of $2.5B, up 15% from Q3 2022 revenue of $2.17B.
Answer:
"""
response = generate_response(qa_prompt)
Sentiment Analysis
# Sentiment analysis example
sentiment_prompt = """
Classify the sentiment of this financial text as positive, negative, or neutral:
"The company's stock price plummeted 20% after missing earnings expectations."
Sentiment:
"""
response = generate_response(sentiment_prompt)
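XBRL Tag Extraction
The exact prompt template used for the XBRL tasks is not documented in this card, so the wording below is an illustrative assumption rather than the training format:
# XBRL tag extraction example (prompt wording is an assumption)
xbrl_prompt = """
Extract the US-GAAP tag that best matches the following financial concept:
"Total revenue recognized during the reporting period"
Tag:
"""
response = generate_response(xbrl_prompt)
print(response)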
Training Details
Training Data
The model was trained on a diverse collection of financial datasets:
- FinanceBench: 150 financial Q&A examples from SEC filings
- XBRL Analysis: 574 examples of XBRL tag extraction, value extraction, and formula construction
- Financial Sentiment: 826 examples from FPB (Financial Phrase Bank) dataset
- Total Training Examples: 7,153 (5,722 train, 1,431 test)
Training Procedure
Preprocessing
- Text Processing: Standardized to conversation format with system/user/assistant roles (see the sketch after this list)
- Tokenization: Using Llama-3.2 tokenizer with 2048 max length
- Data Splitting: 80/20 train/test split with stratified sampling
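A minimal sketch of the conversation-format preprocessing described above, assuming the standard Hugging Face chat template for Llama-3.2 (the actual system prompt and truncation strategy used in training are not documented in this card):
# Illustrative preprocessing sketch; the system prompt is an assumption
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B-Instruct")
messages = [
    {"role": "system", "content": "You are a financial compliance assistant."},
    {"role": "user", "content": "Classify the sentiment of: 'Revenue grew 15% year over year.'"},
    {"role": "assistant", "content": "positive"},
]
# Render the conversation with the Llama-3.2 chat template, capped at 2048 tokens
token_ids = tokenizer.apply_chat_template(messages, truncation=True, max_length=2048)
print(len(token_ids))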
Training Hyperparameters
- Training regime: LoRA fine-tuning with 4-bit quantization (a configuration sketch follows this list)
- Base Model: meta-llama/Llama-3.2-1B-Instruct
- LoRA Parameters: r=8, alpha=16, dropout=0.1
- Batch Size: 1 with gradient accumulation of 4 steps
- Learning Rate: 1e-4 with linear warmup
- Epochs: 1 (845 training steps)
- Optimizer: AdamW
- Scheduler: Linear with warmup
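A configuration sketch consistent with the hyperparameters above, assuming standard PEFT/bitsandbytes usage; the quantization type, target modules, and trainer setup are assumptions not stated in this card:
# Illustrative LoRA + 4-bit setup; quant type and target modules are assumptions
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",            # assumption: NF4 quantization
    bnb_4bit_compute_dtype=torch.float16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()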
Speeds, Sizes, Times
- Training Time: ~2 hours on single GPU
- Model Size: ~1.1GB (base model + LoRA weights)
- Inference Speed: ~50 tokens/second on GPU
- Memory Usage: ~4GB VRAM for inference
Evaluation
Testing Data, Factors & Metrics
Testing Data
- FinanceBench: 31 financial Q&A examples
- XBRL Analysis: 574 XBRL processing examples
- Financial Sentiment: 826 sentiment classification examples
- Audio Processing: 5 financial audio samples
Metrics
- Accuracy: Overall correctness across all tasks (a computation sketch follows this list)
- F1-Score: Harmonic mean of precision and recall
- Precision: True positives / (True positives + False positives)
- Recall: True positives / (True positives + False negatives)
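These metrics can be computed from predicted and gold labels, for example with scikit-learn; the weighted averaging below is an assumption, since the averaging scheme behind the reported scores is not stated:
# Illustrative metric computation; weighted averaging is an assumption
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

y_true = ["positive", "negative", "neutral", "positive"]  # toy gold labels
y_pred = ["positive", "neutral", "neutral", "negative"]   # toy predictions

accuracy = accuracy_score(y_true, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="weighted", zero_division=0
)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")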
Results
Financial Q&A Performance
- Accuracy: 67.7% (21/31 correct)
- Sample Size: 31 questions
Sentiment Analysis Performance
- Accuracy: 43.5% (359/826 correct)
- F1-Score: 46.7%
- Precision: 54.6%
- Recall: 43.5%
- Sample Size: 826 examples
XBRL Processing Performance
- Tag Extraction: 89.6% accuracy
- Value Extraction: 63.6% accuracy
- Formula Construction: 99.4% accuracy
- Formula Calculation: 82.2% accuracy
- Overall XBRL: 88.3% accuracy
- Sample Size: 574 examples
Overall Performance
- Accuracy: 55.6%
- F1-Score: 46.7%
- Precision: 54.6%
- Recall: 43.5%
Summary
The model shows strong performance in XBRL processing tasks (88.3% accuracy) and moderate performance in financial Q&A (67.7% accuracy). Sentiment analysis performance is lower (43.5%) but shows room for improvement with additional training data.
Model Examination
Key Strengths
- XBRL Processing: Excellent performance on structured financial data
- Formula Construction: Near-perfect accuracy (99.4%)
- Financial Q&A: Solid performance on factual questions
- Efficiency: Fast inference with 1B parameter model
Areas for Improvement
- Sentiment Analysis: Needs more diverse training data
- Complex Reasoning: Limited by model size for complex financial analysis
- Multilingual Support: Primarily English-focused
Environmental Impact
- Hardware Type: NVIDIA GPU (training), CPU/GPU (inference)
- Hours used: ~2 hours training
- Cloud Provider: Local development
- Compute Region: N/A
- Carbon Emitted: Estimated <1kg CO2
Technical Specifications
Model Architecture and Objective
- Architecture: Transformer-based causal language model
- Parameters: 1.1B (1B base + 0.1B LoRA; a counting sketch follows this list)
- Context Length: 2048 tokens
- Vocabulary Size: 128,256 tokens
- Objective: Next token prediction with instruction following
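To sanity-check the parameter counts above, the parameters of the PEFT model loaded in the Basic Usage snippet can be counted directly (actual counts depend on the adapter configuration):
# Count total parameters and LoRA adapter parameters of the loaded PEFT model
total_params = sum(p.numel() for p in model.parameters())
lora_params = sum(p.numel() for name, p in model.named_parameters() if "lora" in name)
print(f"total: {total_params:,}  LoRA adapter: {lora_params:,}")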
Compute Infrastructure
Hardware
- Training: Single GPU (NVIDIA RTX 4090 or similar)
- Inference: CPU or GPU
Software
- Framework: PyTorch 2.0+
- LoRA: PEFT 0.17.1
- Transformers: 4.44.0+
- Quantization: bitsandbytes 0.41.0+
Citation
BibTeX:
@misc{fingpt-compliance-agents2025,
  title={FinGPT Compliance Agents: A Specialized Language Model for Financial Compliance},
  author={SecureFinAI Contest 2025 Team},
  year={2025},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/QXPS/fingpt-compliance-agents}}
}
APA: SecureFinAI Contest 2025 Team. (2025). FinGPT Compliance Agents: A Specialized Language Model for Financial Compliance. Hugging Face. https://huggingface.co/QXPS/fingpt-compliance-agents
Glossary
- XBRL: eXtensible Business Reporting Language - XML-based standard for financial reporting
- LoRA: Low-Rank Adaptation - Parameter-efficient fine-tuning method
- SEC Filings: Securities and Exchange Commission regulatory filings
- FinanceBench: Financial question-answering benchmark dataset
- FPB: Financial Phrase Bank - sentiment analysis dataset
Model Card Authors
- Primary Authors: SecureFinAI Contest 2025 - Task 2 Team
- Contributors: FinGPT development community
- Reviewers: Financial compliance domain experts
Model Card Contact
For questions about this model:
- GitHub Issues: Repository Issues
- Hugging Face: Model Discussion
Framework versions
- PEFT 0.17.1
- Transformers 4.44.0
- PyTorch 2.0.0
- bitsandbytes 0.41.0