base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
library_name: peft
datasets:
- LawInformedAI/claudette_tos
metrics:
- accuracy
- precision
- recall
- f1
pipeline_tag: text-classification
TinyLlama-ToS-Finetuned
A LoRA-finetuned version of TinyLlama-1.1B-Chat-v1.0 for detecting unfair / anomalous Terms of Service clauses. The model classifies clauses as Fair or Unfair based on anomalous patterns in legal text.
Model Details
Model Description
- Developed by: Noshitha Padma Pratyusha Juttu (UMass Amherst, MS CS 2024-25)
- Model type: Causal LM + LoRA adapters for classification
- Base model: TinyLlama-1.1B-Chat v1.0
- Total parameters (base + LoRA): ~1.101B
- LoRA trainable parameters: ~1.13M (≈0.1% of base model)
- Language(s): English
- License: Apache-2.0 (same as base model)
This model was finetuned with LoRA adapters. During training, only ~1.13M parameters were updated, while the 1.1B base model parameters remained frozen. The final uploaded model contains both the base weights and the adapter weights.
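As a rough sanity check on the parameter counts above, the sketch below recomputes the trainable fraction. A LoRA adapter of rank r on a d_in x d_out projection adds r * (d_in + d_out) weights. The rank (8) and the target modules (q_proj and v_proj) are assumptions chosen for illustration; this card does not state them. The dimensions are those of the published TinyLlama-1.1B configuration (hidden size 2048, 22 layers, 256-dimensional key/value projections).

```python
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Weights added by one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)

# TinyLlama-1.1B dimensions.
HIDDEN = 2048   # hidden size (q_proj maps 2048 -> 2048)
KV_DIM = 256    # 4 key/value heads x 64 dims (v_proj maps 2048 -> 256)
LAYERS = 22

# ASSUMPTION: rank 8 on q_proj and v_proj only; not stated in this card.
RANK = 8

per_layer = lora_params(HIDDEN, HIDDEN, RANK) + lora_params(HIDDEN, KV_DIM, RANK)
trainable = per_layer * LAYERS
fraction = trainable / 1.1e9  # base model size, rounded to 1.1B

print(f"trainable LoRA parameters: {trainable:,}")
print(f"share of base model: {fraction:.2%}")
```

Under these assumptions the count comes to about 1.13M, in line with the figure quoted above, but other rank and target-module combinations could give a similar total.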
Citation
If you use this model in your research or work, please cite the following paper:
Juttu, Noshitha Padma Pratyusha. Text to Trust: Evaluating Fine-Tuning and LoRA Trade-Offs in Language Models for Unfair Terms of Service Detection. arXiv preprint arXiv:2510.22531, 2025.
https://arxiv.org/abs/2510.22531
Model Sources
- Repository: GitHub (UnfairTOSAgreementsDetection)
Uses
Direct Use
- Clause-level classification of Terms of Service agreements.
- Detects whether a clause is likely "Unfair" or "Fair".
Downstream Use
- Legal NLP research and experiments.
- Integrating into compliance assistants for contract review.
Out-of-Scope Use
- Not a substitute for professional legal advice.
- Not guaranteed to generalize beyond English contracts.
Bias, Risks, and Limitations
- Trained only on the Claudette ToS dataset, so it may not represent all legal documents.
- May produce false positives/negatives, especially on borderline clauses.
- Outputs can be sensitive to prompt phrasing.
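One way to gauge the false positive and false negative rates on your own clauses is to compute the metrics listed for this model (accuracy, precision, recall, F1) against hand-labeled examples. The sketch below is a minimal, dependency-free version; the `binary_metrics` helper and the toy labels are illustrative assumptions, not results from this model.

```python
def binary_metrics(y_true, y_pred, positive="Unfair"):
    """Accuracy, precision, recall and F1, treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Toy labels for illustration only (not model outputs).
gold = ["Unfair", "Fair", "Fair", "Unfair", "Fair"]
pred = ["Unfair", "Unfair", "Fair", "Fair", "Fair"]
metrics = binary_metrics(gold, pred)
print(metrics)
```

Treating "Unfair" as the positive class means precision reflects how often a flagged clause is actually unfair, and recall reflects how many unfair clauses are caught.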
Recommendations
Use this model as an assistive tool, not for automated legal decision-making.
How to Get Started with the Model
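from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
# Base checkpoint and the LoRA adapter trained on top of it
base = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
adapter = "Noshitha98/TinyLlama-ToS-Finetuned"
# Load the tokenizer and base model, then attach the adapter weights
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)
model = PeftModel.from_pretrained(model, adapter)
# The clause goes after [CLAUSE]; the model's answer follows the question marker
prompt = "<s>[CLAUSE]: You agree that we may suspend your account at any time. \n[Is this anomalous?]:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Generate a few tokens only, since the expected answer is short
outputs = model.generate(**inputs, max_new_tokens=5)
# The decoded string contains the prompt followed by the generated answer
print(tokenizer.decode(outputs[0], skip_special_tokens=True))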
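Because the decoded output includes the prompt as well as the answer, a small wrapper can keep the prompt format consistent and turn the completion into a label. The sketch below reuses the prompt template from the snippet above. The `build_prompt` and `parse_label` helpers are hypothetical, and the mapping of answer words to labels (yes/unfair to "Unfair", no/fair to "Fair") is an assumption about what the model emits, so check it against real outputs before relying on it.

```python
import re

QUESTION = "[Is this anomalous?]:"

def build_prompt(clause: str) -> str:
    """Wrap a clause in the prompt template used in this card's example."""
    return f"<s>[CLAUSE]: {clause} \n{QUESTION}"

def parse_label(decoded: str) -> str:
    """Map the text generated after the question marker to a label.

    ASSUMPTION: the model answers with words such as yes/no or unfair/fair.
    Anything else is reported as "Unknown".
    """
    # Keep only the text after the last occurrence of the question marker.
    _, _, completion = decoded.rpartition(QUESTION)
    for word in re.findall(r"[a-z]+", completion.lower()):
        if word in {"yes", "unfair"}:
            return "Unfair"
        if word in {"no", "fair"}:
            return "Fair"
    return "Unknown"

prompt = build_prompt("You agree that we may suspend your account at any time.")
print(repr(prompt))
print(parse_label("[CLAUSE]: Example clause. \n[Is this anomalous?]: Yes"))
```

Returning "Unknown" for unrecognized completions keeps unexpected generations visible instead of silently assigning them a class.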