---
license: apache-2.0
language:
- it
metrics:
- accuracy
base_model:
- DeepMount00/ModernBERT-base-ita
pipeline_tag: text-classification
---

# Text Quality Classifier (Binary)

This model aims to classify the general quality and educational value of a given text. It outputs one of two labels: `LABEL_0`, meaning **bad quality**, and `LABEL_1`, meaning **good quality**.
It can be used to efficiently filter large quantities of raw text by quality, which is useful when building Italian pretraining datasets.
The model tends to classify as "good quality" Wikipedia-like texts: educational, well structured and clearly explained.

## How to get access
This is a private model. To request access, write to <a href="mailto:redix.ai@redix.com">redix.ai@redix.com</a> and explain how you plan to use it.


## Eval

During evaluation, the model obtained the following metrics:

* **Eval Loss:** 0.3422
* **Accuracy:** 0.8607
* **F1-Score:** 0.8597

## How to use

```python
from transformers import pipeline

MODEL = "ReDiX/text-quality-classifier-ita"
pipe = pipeline("text-classification", model=MODEL, tokenizer=MODEL)

example_text = "Questo è un testo di esempio in italiano per la classificazione."
# The pipeline returns a list with one dict per input, with keys "label" and "score"
result = pipe(example_text)
print(f"TEXT: '{example_text}'")
print(f"RESULT: {result}")
```

## Confusion matrix

![Confusion matrix](confusion_matrix.png)