🧠 LexGLUE–LEDGAR DistilBERT Legal Clause Classifier

Author: Salman Abbasi
Affiliation: Beijing Institute of Technology, China
Research Area: NLP × Agentic AI × Prompt Engineering
Date: October 2025


📝 Model Overview

This model is a DistilBERT-based legal clause classifier fine-tuned on the LexGLUE–LEDGAR dataset.
It predicts the type of a contract clause (e.g., Confidentiality, Termination, Liability, or Payment) from a legal paragraph or clause.

  • Base Model: distilbert-base-uncased
  • Dataset: coastalcph/lex_glue (ledgar configuration)
  • Task Type: Multi-class text classification
  • Language: English
  • Domain: Legal / Contractual Texts

With about 67M parameters, the model reaches 0.8618 accuracy and 0.7842 macro F1 on LEDGAR after three epochs of fine-tuning, which makes it a lightweight option for legal NLP systems and agentic AI applications.
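The overview above can be turned into a small inference sketch. The repository id `SalmanAbbasi/lexglue-ledgar-distilbert` is the name this card is published under; the helper names (`softmax`, `top_k_labels`, `classify_clause`) are hypothetical, and the sketch assumes the checkpoint's config carries readable `id2label` names. If it does not, the output will show generic names such as `LABEL_12`.

```python
import math

# Repository id as shown on the model page.
MODEL_ID = "SalmanAbbasi/lexglue-ledgar-distilbert"


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_labels(logits, id2label, k=3):
    """Return the k most probable (label, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


def classify_clause(text, model_id=MODEL_ID, k=3):
    """Download the checkpoint and classify one clause.

    Requires `torch` and `transformers`; they are imported lazily so the
    pure-Python helpers above stay usable without them.
    """
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # Truncate to the 512-token limit listed in the card's settings.
    inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return top_k_labels(logits, model.config.id2label, k)
```

Calling `classify_clause("Each party shall keep the terms of this Agreement confidential.")` would return up to three `(label, probability)` pairs; the exact labels and scores depend on the published weights.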


🚀 Model Performance

Metric               Score
Accuracy             0.8618
Macro F1             0.7842
Validation Loss      0.5756

Hyperparameter       Value
Epochs               3
Batch Size           8
Learning Rate        2e-5
Max Sequence Length  512
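The gap between accuracy and macro F1 is what one would expect on a dataset with many classes of uneven size: accuracy counts every clause equally, while macro F1 averages per-class F1 scores so that rare clause types weigh as much as common ones. The following is a minimal pure-Python sketch of the two metrics, not the author's evaluation code; the function names and the toy labels are illustrative only.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example: class 0 is frequent and always right, class 1 is rare and missed.
y_true = [0, 0, 0, 0, 1, 2]
y_pred = [0, 0, 0, 0, 2, 2]
print(f"accuracy = {accuracy(y_true, y_pred):.4f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.4f}")
```

In the toy example a single missed rare class leaves accuracy at about 0.83 but pulls macro F1 down to about 0.56, which mirrors the direction of the gap in the reported scores.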

📚 Dataset: LexGLUE LEDGAR

The LEDGAR dataset contains contract clauses, each annotated with one of 100 provision types (the LexGLUE version keeps the 100 most frequent labels).
It is part of the LexGLUE benchmark, which evaluates models on legal text understanding.

Each sample includes:

{
  "text": "This clause sets out the obligations of each party regarding confidentiality.",
  "label": "Confidentiality"
}
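The sample above shows the label as a readable string. In the Hub copy of the dataset the `label` column is expected to be an integer class index backed by a `ClassLabel` feature, so the name has to be looked up. The sketch below assumes the `datasets` package is installed and that `coastalcph/lex_glue` with the `ledgar` configuration loads as described on this card; `decode_label` and `load_ledgar` are hypothetical helper names.

```python
def decode_label(example, label_names):
    """Return a copy of the example with the integer label replaced by its name."""
    return {"text": example["text"], "label": label_names[example["label"]]}


def load_ledgar(split="train"):
    """Load one LEDGAR split from the Hub and return (dataset, label_names).

    Needs network access and the `datasets` package, imported lazily here
    so that `decode_label` can be used on its own.
    """
    from datasets import load_dataset

    dataset = load_dataset("coastalcph/lex_glue", "ledgar", split=split)
    # Assumption: `label` is a ClassLabel feature that exposes `.names`.
    label_names = dataset.features["label"].names
    return dataset, label_names
```

A typical use would be `dataset, names = load_ledgar("validation")` followed by `decode_label(dataset[0], names)` to obtain a record in the same shape as the sample above.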