---
license: mit
language:
- en
base_model:
- jhu-clsp/ettin-encoder-32m
pipeline_tag: token-classification
tags:
- token classification
- hallucination detection
- retrieval-augmented generation
- transformers
- ettin
- lightweight
datasets:
- enelpol/rag-mini-bioasq
library_name: transformers
---
# TinyLettuce (Ettin-32M): Efficient Hallucination Detection
<p align="center">
<img src="https://github.com/KRLabsOrg/LettuceDetect/blob/dev/assets/tinytinylettuce.png?raw=true" alt="TinyLettuce" width="400"/>
</p>
- **Model Name:** tinylettuce-ettin-32m-en-v1
- **Organization:** KRLabsOrg
- **GitHub:** https://github.com/KRLabsOrg/LettuceDetect
- **Ettin encoders:** https://arxiv.org/pdf/2507.11412
## Overview
TinyLettuce is a token‑classification model that flags spans of an answer that are not supported by the provided context. The 32M Ettin variant balances accuracy with CPU‑friendly inference and is designed for low‑cost, domain‑specific fine‑tuning on synthetic data.

Trained on our synthetic dataset (mixed with RAGTruth), this 32M variant achieves 88.76% F1 on the held‑out synthetic test set, outperforming much larger LLM judges such as GPT-OSS-120B and demonstrating the effectiveness of our domain‑specific hallucination data generation pipeline.
## Model Details
- Architecture: Ettin encoder (32M) + token‑classification head
- Task: token classification (0 = supported, 1 = hallucinated)
- Input: [CLS] context [SEP] question [SEP] answer [SEP], up to 4096 tokens
- Language: English; License: MIT
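To make the labeling scheme concrete, here is a minimal sketch of how token‑level 0/1 predictions can be merged into character‑level spans (a step the library presumably performs internally; the helper name, token offsets, and labels below are invented for illustration):

```python
def labels_to_spans(text, offsets, labels):
    """Merge contiguous tokens labeled 1 (hallucinated) into character spans."""
    spans = []
    start = end = None
    for (tok_start, tok_end), label in zip(offsets, labels):
        if label == 1:
            if start is None:
                start = tok_start
            end = tok_end
        elif start is not None:
            spans.append({"start": start, "end": end, "text": text[start:end]})
            start = None
    if start is not None:
        spans.append({"start": start, "end": end, "text": text[start:end]})
    return spans

answer = "The dose is 3200mg daily."
offsets = [(0, 3), (4, 8), (9, 11), (12, 18), (19, 24)]  # invented offsets
labels = [0, 0, 0, 1, 0]  # only "3200mg" flagged
print(labels_to_spans(answer, offsets, labels))
# → [{'start': 12, 'end': 18, 'text': '3200mg'}]
```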
## Training Data
- Synthetic (train): ~1,500 hallucinated samples (≈3,000 with non‑hallucinated) from enelpol/rag-mini-bioasq; intensity 0.3.
- Synthetic (test): 300 hallucinated samples (≈600 total) held out.
## Training Procedure
- Tokenizer: AutoTokenizer; DataCollatorForTokenClassification; label pad −100
- Max length: 4096; batch size: 8; epochs: 3
- Optimizer: AdamW (lr 1e‑5, weight_decay 0.01)
- Hardware: Single A100 80GB
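The label pad value of −100 matters because PyTorch's cross‑entropy loss ignores that index by default, so padded positions contribute no gradient. A minimal sketch of the padding behavior (mirroring what `DataCollatorForTokenClassification` does for labels, not LettuceDetect‑specific code):

```python
def pad_labels(batch_labels, pad_id=-100):
    """Pad each label sequence to the batch's longest length with pad_id,
    so the loss function skips padding positions (ignore_index=-100)."""
    max_len = max(len(labels) for labels in batch_labels)
    return [labels + [pad_id] * (max_len - len(labels)) for labels in batch_labels]

print(pad_labels([[0, 1, 0], [0, 1]]))
# → [[0, 1, 0], [0, 1, -100]]
```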
## Results
Synthetic (domain‑specific):
| Model | Parameters | Precision (%) | Recall (%) | F1 (%) | Hardware |
|-------|------------|---------------|------------|--------|----------|
| TinyLettuce-17M | 17M | 84.56 | 98.21 | 90.87 | CPU |
| **TinyLettuce-32M** | 32M | 80.36 | 99.10 | 88.76 | CPU |
| TinyLettuce-68M | 68M | 89.54 | 95.96 | 92.64 | CPU |
| GPT-5-mini | ~200B | 71.95 | 100.00 | 83.69 | API/GPU |
| GPT-OSS-120B | 120B | 72.21 | 98.64 | 83.38 | GPU |
| Qwen3-235B | 235B | 66.74 | 99.32 | 79.84 | GPU |
## Usage
First install lettucedetect:
```bash
pip install lettucedetect
```
Then use it:
```python
from lettucedetect.models.inference import HallucinationDetector

detector = HallucinationDetector(
    method="transformer",
    model_path="KRLabsOrg/tinylettuce-ettin-32m-en-v1",
)

spans = detector.predict(
    context=[
        "Ibuprofen is an NSAID that reduces inflammation and pain. The typical adult dose is 400-600mg every 6-8 hours, not exceeding 2400mg daily."
    ],
    question="What is the maximum daily dose of ibuprofen?",
    answer="The maximum daily dose of ibuprofen for adults is 3200mg.",
    output_format="spans",
)

print(spans)
# Example output: [{"start": 50, "end": 56, "text": "3200mg"}]
```
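The returned span dictionaries can be mapped back onto the answer for display or logging; a small hypothetical helper, assuming the `{"start", "end", "text"}` span format shown above (the bracket markup is just one choice):

```python
def highlight(answer, spans):
    """Wrap each predicted hallucinated span in [[...]], editing right to left
    so earlier character offsets stay valid."""
    for span in sorted(spans, key=lambda s: s["start"], reverse=True):
        answer = (
            answer[: span["start"]]
            + "[[" + answer[span["start"] : span["end"]] + "]]"
            + answer[span["end"] :]
        )
    return answer

answer = "The maximum daily dose of ibuprofen for adults is 3200mg."
spans = [{"start": 50, "end": 56, "text": "3200mg"}]  # hypothetical prediction
print(highlight(answer, spans))
# → The maximum daily dose of ibuprofen for adults is [[3200mg]].
```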
## Citing
If you use the model or the tool, please cite the following paper:
```bibtex
@misc{Kovacs:2025,
title={LettuceDetect: A Hallucination Detection Framework for RAG Applications},
author={Ádám Kovács and Gábor Recski},
year={2025},
eprint={2502.17125},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2502.17125},
}
```