	---
	base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
	library_name: peft
	datasets:
	- LawInformedAI/claudette_tos
	metrics:
	- accuracy
	- precision
	- recall
	- f1
	pipeline_tag: text-classification
	---

	# TinyLlama-ToS-Finetuned

	A LoRA-finetuned version of TinyLlama-1.1B-Chat-v1.0 for detecting unfair (anomalous) Terms of Service clauses. The model classifies individual clauses as Fair or Unfair based on anomalous patterns in the legal text.

	---

	## Model Details

	### Model Description
	- Developed by: Noshitha Padma Pratyusha Juttu (UMass Amherst, MS CS 2024–25)
	- Model type: Causal LM + LoRA adapters for classification
	- Base model: [TinyLlama-1.1B-Chat v1.0](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0)
	- Total parameters (base + LoRA): ~1.101B
	- LoRA trainable parameters: ~1.13M (≈0.1% of base model)
	- Language(s): English
	- License: Apache-2.0 (same as base model)

	This model was finetuned with LoRA adapters. During training, only ~1.13M parameters were updated, while the 1.1B base model parameters remained frozen. The final uploaded model contains both the base weights and the adapter weights.
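
The trainable-parameter figure can be sanity-checked with a little arithmetic. The sketch below is one configuration that happens to reproduce roughly 1.13M parameters; the LoRA rank (8) and target modules (`q_proj`, `v_proj`) are assumptions for illustration and are not stated in this card. The architecture values are those of the TinyLlama-1.1B base model.

```python
# TinyLlama-1.1B architecture values (from the base model's config).
HIDDEN_SIZE = 2048
NUM_LAYERS = 22
NUM_ATTENTION_HEADS = 32
NUM_KV_HEADS = 4  # grouped-query attention shrinks the k/v projections

# ASSUMPTION: rank and target modules are illustrative, not taken from this card.
LORA_RANK = 8
head_dim = HIDDEN_SIZE // NUM_ATTENTION_HEADS  # 64
kv_dim = NUM_KV_HEADS * head_dim               # 256

# (in_features, out_features) of each assumed target projection.
TARGETS = {
    "q_proj": (HIDDEN_SIZE, HIDDEN_SIZE),
    "v_proj": (HIDDEN_SIZE, kv_dim),
}


def lora_param_count(in_features: int, out_features: int, rank: int) -> int:
    """LoRA adds A (rank x in) and B (out x rank) per adapted weight matrix."""
    return rank * (in_features + out_features)


per_layer = sum(lora_param_count(i, o, LORA_RANK) for i, o in TARGETS.values())
total_lora = per_layer * NUM_LAYERS
share_of_base = 100 * total_lora / 1.1e9

print(f"LoRA parameters: {total_lora:,} ({share_of_base:.2f}% of a 1.1B base)")
```

Under these assumptions the adapters add 51,200 parameters per layer, which lands close to the ~1.13M and ≈0.1% figures reported above. Other rank and module combinations could give a similar total.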

	## 📚 Citation

	If you use this model in your research or work, please cite the following paper:

	> Juttu, Noshitha Padma Pratyusha. Text to Trust: Evaluating Fine-Tuning and LoRA Trade-Offs in Language Models for Unfair Terms of Service Detection. arXiv preprint arXiv:2510.22531, 2025.
	https://arxiv.org/abs/2510.22531


	### Model Sources
	- Repository: [GitHub – UnfairTOSAgreementsDetection](https://github.com/Stimils02/UnfairTOSAgreementsDetection)

	---

	## Uses

	### Direct Use
	- Clause-level classification of Terms of Service agreements.
	- Detects if a clause is likely “Unfair” or “Fair”.

	### Downstream Use
	- Legal NLP research and experiments.
	- Integrating into compliance assistants for contract review.

	### Out-of-Scope Use
	- Not a substitute for professional legal advice.
	- Not guaranteed to generalize beyond English contracts.

	---

	## Bias, Risks, and Limitations
	- Trained only on the Claudette ToS dataset, so it may not represent all legal documents.
	- May produce false positives/negatives, especially on borderline clauses.
	- Outputs can be sensitive to prompt phrasing.
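
Since outputs can shift with prompt phrasing, it may help to build every prompt through one helper so the wording stays fixed. The sketch below reuses the prompt string from the getting-started snippet in this card; the names `PROMPT_TEMPLATE` and `build_prompt` are hypothetical, and the whitespace normalization is an added assumption rather than part of the original training setup.

```python
# Template copied from the getting-started example in this card.
PROMPT_TEMPLATE = "<s>[CLAUSE]: {clause} \n[Is this anomalous?]:"


def build_prompt(clause: str) -> str:
    """Format a single ToS clause with the fixed prompt wording.

    ASSUMPTION: collapsing internal whitespace is a convenience added here;
    it is not documented as part of the model's training format.
    """
    cleaned = " ".join(clause.split())
    if not cleaned:
        raise ValueError("clause must contain text")
    return PROMPT_TEMPLATE.format(clause=cleaned)


print(build_prompt("You agree that we may suspend your account at any time."))
```

Keeping the template in one constant also makes it easier to test alternative phrasings deliberately instead of by accident.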

	### Recommendations
	Use this model as an assistive tool, not for automated legal decision-making.
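
One way to keep a human in the loop is to map the model's completion to a label and send anything unrecognized to manual review. This is a minimal sketch: the card does not document the exact output vocabulary, so the accepted answer words below are assumptions, and `label_from_completion` is a hypothetical helper.

```python
# ASSUMPTION: the answer words below are guesses at what the model emits;
# adjust them after inspecting real completions.
UNFAIR_WORDS = {"yes", "unfair", "anomalous"}
FAIR_WORDS = {"no", "fair"}


def label_from_completion(completion: str) -> str:
    """Map the newly generated text (without the prompt) to a coarse label."""
    words = completion.strip().lower().split()
    first = words[0].strip(".,:;!\"'") if words else ""
    if first in UNFAIR_WORDS:
        return "Unfair"
    if first in FAIR_WORDS:
        return "Fair"
    # Anything unexpected goes to a person rather than being guessed.
    return "Needs review"


for sample in [" Yes, this clause", "No.", "It depends"]:
    print(repr(sample), "->", label_from_completion(sample))
```

The helper expects only the generated continuation, so strip the prompt from the decoded output before passing it in.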

	---

	## How to Get Started with the Model

	```python
	from transformers import AutoTokenizer, AutoModelForCausalLM
	from peft import PeftModel

	base = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
	adapter = "Noshitha98/TinyLlama-ToS-Finetuned"

	# Load the base tokenizer and model, then attach the LoRA adapter on top.
	tokenizer = AutoTokenizer.from_pretrained(base)
	model = AutoModelForCausalLM.from_pretrained(base)
	model = PeftModel.from_pretrained(model, adapter)

	# Wrap the clause in the prompt format shown here and ask whether it is anomalous.
	prompt = "<s>[CLAUSE]: You agree that we may suspend your account at any time. \n[Is this anomalous?]:"
	inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

	# Generate a short answer; the decoded text includes the prompt followed by the completion.
	outputs = model.generate(**inputs, max_new_tokens=5)
	print(tokenizer.decode(outputs[0], skip_special_tokens=True))