FinGPT Compliance Agents

A specialized language model for financial compliance and regulatory tasks, fine-tuned on SEC filings analysis, regulatory compliance, sentiment analysis, and XBRL data processing.

Model Details

Model Description

FinGPT Compliance Agents is a LoRA fine-tuned version of Llama-3.2-1B-Instruct, specifically designed for financial compliance and regulatory tasks. The model excels at:

  • SEC Filings Analysis: Extract insights from SEC filings and process XBRL data

  • Financial Q&A: Answer questions about company filings and financial statements

  • Sentiment Analysis: Classify the sentiment of financial text

  • XBRL Processing: Extract tags, values, and construct formulas from XBRL data

  • Regulatory Compliance: Handle real-time financial data retrieval and analysis

  • Developed by: SecureFinAI Contest 2025 - Task 2 Team

  • Model type: Causal Language Model with LoRA adaptation

  • Language(s) (NLP): English (primary), Russian (audio processing)

  • License: Apache 2.0

  • Finetuned from model: meta-llama/Llama-3.2-1B-Instruct

Uses

Direct Use

This model is designed for direct use in financial compliance applications:

  • Financial Q&A Systems: Answer questions about company filings and financial data
  • Sentiment Analysis: Classify financial news, earnings calls, and market sentiment
  • XBRL Data Processing: Extract and analyze structured financial data
  • Regulatory Compliance: Process SEC filings and regulatory documents
  • Audio Processing: Transcribe and analyze financial audio content

Downstream Use

The model can be further fine-tuned for specific financial domains:

  • Banking Compliance: Anti-money laundering, fraud detection
  • Insurance: Risk assessment, claims processing
  • Investment Analysis: Portfolio management, risk evaluation
  • Regulatory Reporting: Automated compliance reporting

Out-of-Scope Use

This model should not be used for:

  • Financial advice or investment recommendations
  • Legal advice or regulatory interpretation
  • High-stakes financial decisions without human oversight
  • Non-financial compliance tasks

Bias, Risks, and Limitations

Known Limitations

  • Model Size: Limited to 1B parameters, may not capture complex financial relationships
  • Training Data: Primarily English financial data, limited multilingual support
  • Temporal Scope: Training data may not include recent financial events
  • Domain Specificity: Optimized for compliance tasks, not general financial advice

Recommendations

Users should:

  • Validate model outputs with domain experts
  • Use appropriate guardrails for financial applications
  • Regularly retrain with updated financial data
  • Implement human oversight for critical decisions

How to Get Started with the Model

Basic Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load the model
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B-Instruct",
    torch_dtype=torch.float16,
    device_map="auto"
)
model = PeftModel.from_pretrained(base_model, "QXPS/fingpt-compliance-agents")
tokenizer = AutoTokenizer.from_pretrained("QXPS/fingpt-compliance-agents")

# Generate a response from the fine-tuned model
def generate_response(prompt, max_new_tokens=512):
    # Move inputs to the same device as the model to avoid device mismatches
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            temperature=0.7,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id
        )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example usage
prompt = "Analyze the sentiment of this financial news: 'Company X reported strong quarterly earnings with 15% revenue growth.'"
response = generate_response(prompt)
print(response)

Financial Q&A

# Financial Q&A example
qa_prompt = """
Question: What was the company's revenue growth in Q3 2023?
Context: The company reported Q3 2023 revenue of $2.5B, up 15% from Q3 2022 revenue of $2.17B.
Answer:
"""
response = generate_response(qa_prompt)

Sentiment Analysis

# Sentiment analysis example
sentiment_prompt = """
Classify the sentiment of this financial text as positive, negative, or neutral:
"The company's stock price plummeted 20% after missing earnings expectations."
Sentiment:
"""
response = generate_response(sentiment_prompt)
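XBRL processing is also a supported task; a prompt in the same style as the examples above might look like this (the wording is illustrative, not the exact format used in training; pass it to generate_response from Basic Usage):

```python
# Illustrative XBRL tag-extraction prompt; the phrasing is an assumption,
# not the exact prompt format used during fine-tuning.
xbrl_prompt = """
Extract the XBRL tag for the following financial concept:
"Total revenue for fiscal year 2023."
Tag:
"""
```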

Training Details

Training Data

The model was trained on a diverse collection of financial datasets, including:

  • FinanceBench: 150 financial Q&A examples from SEC filings
  • XBRL Analysis: 574 examples of XBRL tag extraction, value extraction, and formula construction
  • Financial Sentiment: 826 examples from FPB (Financial Phrase Bank) dataset
  • Total Training Examples: 7,153 (5,722 train, 1,431 test)

Training Procedure

Preprocessing

  • Text Processing: Standardized to conversation format with system/user/assistant roles
  • Tokenization: Using Llama-3.2 tokenizer with 2048 max length
  • Data Splitting: 80/20 train/test split with stratified sampling
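The conversation format described above might look like the following sketch; the field names and system prompt are assumptions, not the exact training schema:

```python
# Hypothetical conversion of a Q&A example into system/user/assistant roles.
def to_conversation(question, context, answer):
    return [
        {"role": "system", "content": "You are a financial compliance assistant."},
        {"role": "user", "content": f"Question: {question}\nContext: {context}"},
        {"role": "assistant", "content": answer},
    ]

example = to_conversation(
    "What was Q3 2023 revenue growth?",
    "Q3 2023 revenue was $2.5B, up 15% year over year.",
    "Revenue grew 15% in Q3 2023.",
)
```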

Training Hyperparameters

  • Training regime: LoRA fine-tuning with 4-bit quantization
  • Base Model: meta-llama/Llama-3.2-1B-Instruct
  • LoRA Parameters: r=8, alpha=16, dropout=0.1
  • Batch Size: 1 with gradient accumulation of 4 steps
  • Learning Rate: 1e-4 with linear warmup
  • Epochs: 1 (845 training steps)
  • Optimizer: AdamW
  • Scheduler: Linear with warmup
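The LoRA hyperparameters above can be expressed as a PEFT configuration. Note that target_modules is an assumption here (a common choice for Llama-style attention layers); the card does not state which modules were adapted:

```python
from peft import LoraConfig

# Sketch of a LoRA configuration matching the listed hyperparameters.
# target_modules is assumed, not taken from the card.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```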

Speeds, Sizes, Times

  • Training Time: ~2 hours on a single GPU
  • Model Size: ~1.1GB (base model + LoRA weights)
  • Inference Speed: ~50 tokens/second on GPU
  • Memory Usage: ~4GB VRAM for inference

Evaluation

Testing Data, Factors & Metrics

Testing Data

  • FinanceBench: 31 financial Q&A examples
  • XBRL Analysis: 574 XBRL processing examples
  • Financial Sentiment: 826 sentiment classification examples
  • Audio Processing: 5 financial audio samples

Metrics

  • Accuracy: Overall correctness across all tasks
  • F1-Score: Harmonic mean of precision and recall
  • Precision: True positives / (True positives + False positives)
  • Recall: True positives / (True positives + False negatives)
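The metrics above can be computed from scratch for a single positive class, as in this minimal sketch (the toy labels are illustrative, not the evaluation data; the card does not state how per-class scores were averaged):

```python
# From-scratch accuracy, precision, recall, and F1 for one positive class.
def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def precision_recall_f1(y_true, y_pred, positive):
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

y_true = ["positive", "negative", "neutral", "positive"]
y_pred = ["positive", "negative", "positive", "positive"]
acc = accuracy(y_true, y_pred)
prec, rec, f1 = precision_recall_f1(y_true, y_pred, "positive")
```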

Results

Financial Q&A Performance

  • Accuracy: 67.7% (21/31 correct)
  • Sample Size: 31 questions

Sentiment Analysis Performance

  • Accuracy: 43.5% (359/826 correct)
  • F1-Score: 46.7%
  • Precision: 54.6%
  • Recall: 43.5%
  • Sample Size: 826 examples

XBRL Processing Performance

  • Tag Extraction: 89.6% accuracy
  • Value Extraction: 63.6% accuracy
  • Formula Construction: 99.4% accuracy
  • Formula Calculation: 82.2% accuracy
  • Overall XBRL: 88.3% accuracy
  • Sample Size: 574 examples

Overall Performance

  • Accuracy: 55.6%
  • F1-Score: 46.7%
  • Precision: 54.6%
  • Recall: 43.5%

Summary

The model shows strong performance in XBRL processing tasks (88.3% accuracy) and moderate performance in financial Q&A (67.7% accuracy). Sentiment analysis performance is lower (43.5%) but shows room for improvement with additional training data.

Model Examination

Key Strengths

  1. XBRL Processing: Excellent performance on structured financial data
  2. Formula Construction: Near-perfect accuracy (99.4%)
  3. Financial Q&A: Solid performance on factual questions
  4. Efficiency: Fast inference with 1B parameter model

Areas for Improvement

  1. Sentiment Analysis: Needs more diverse training data
  2. Complex Reasoning: Limited by model size for complex financial analysis
  3. Multilingual Support: Primarily English-focused

Environmental Impact

  • Hardware Type: NVIDIA GPU (training), CPU/GPU (inference)
  • Hours used: ~2 hours training
  • Cloud Provider: Local development
  • Compute Region: N/A
  • Carbon Emitted: Estimated <1kg CO2

Technical Specifications

Model Architecture and Objective

  • Architecture: Transformer-based causal language model
  • Parameters: ~1.2B (Llama-3.2-1B base plus a small LoRA adapter)
  • Context Length: 2048 tokens
  • Vocabulary Size: 128,256 tokens
  • Objective: Next token prediction with instruction following
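Because the context window is 2048 tokens, the prompt plus the generated tokens must fit within it. A minimal guard might look like this (the constant and helper are illustrative, not part of the model's API):

```python
# Hypothetical guard against exceeding the 2048-token context window.
MAX_CONTEXT = 2048

def fits_context(prompt_tokens, max_new_tokens=512):
    # Leave room for the generated tokens within the context window.
    return prompt_tokens + max_new_tokens <= MAX_CONTEXT
```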

Compute Infrastructure

Hardware

  • Training: Single GPU (NVIDIA RTX 4090 or similar)
  • Inference: CPU or GPU

Software

  • Framework: PyTorch 2.0+
  • LoRA: PEFT 0.17.1
  • Transformers: 4.44.0+
  • Quantization: bitsandbytes 0.41.0+

Citation

BibTeX:

@misc{fingpt-compliance-agents2025,
  title={FinGPT Compliance Agents: A Specialized Language Model for Financial Compliance},
  author={SecureFinAI Contest 2025 Team},
  year={2025},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/QXPS/fingpt-compliance-agents}}
}

APA: SecureFinAI Contest 2025 Team. (2025). FinGPT Compliance Agents: A Specialized Language Model for Financial Compliance. Hugging Face. https://huggingface.co/QXPS/fingpt-compliance-agents

Glossary

  • XBRL: eXtensible Business Reporting Language - XML-based standard for financial reporting
  • LoRA: Low-Rank Adaptation - Parameter-efficient fine-tuning method
  • SEC Filings: Securities and Exchange Commission regulatory filings
  • FinanceBench: Financial question-answering benchmark dataset
  • FPB: Financial Phrase Bank - sentiment analysis dataset

Model Card Authors

  • Primary Authors: SecureFinAI Contest 2025 - Task 2 Team
  • Contributors: FinGPT development community
  • Reviewers: Financial compliance domain experts

Model Card Contact

For questions about this model:

Framework versions

  • PEFT 0.17.1
  • Transformers 4.44.0
  • PyTorch 2.0.0
  • bitsandbytes 0.41.0