---
license: apache-2.0
language:
- it
metrics:
- accuracy
base_model:
- DeepMount00/ModernBERT-base-ita
pipeline_tag: text-classification
---

# Text Quality Classifier (Binary)

This model aims to classify the general quality and educational content of a given text.
The available labels are `LABEL_0`, meaning **bad quality**, and `LABEL_1`, meaning **good quality**.

It can be used to efficiently filter large quantities of raw text by quality, which is useful for building Italian pretraining datasets.
The model tends to classify as "good quality" Wikipedia-like texts: educational, well structured, and clearly explained.

## How to get access

This is a private model. To request access, write to redix.ai@redix.com and explain how you plan to use it.

## Eval

During evaluation, the model obtained the following metrics:

* **Eval Loss:** 0.3422
* **Accuracy:** 0.8607
* **F1-Score:** 0.8597

![Confusion matrix](confusion_matrix.png)

## How to use

```python
from transformers import pipeline

# The model is private: access must be granted and you must be
# authenticated with the Hugging Face Hub before loading it.
MODEL = "ReDiX/text-quality-classifier-ita"

# Load model and tokenizer from the same repository.
pipe = pipeline("text-classification", model=MODEL, tokenizer=MODEL)

example_text = "Questo è un testo di esempio in italiano per la classificazione."

# The result is a list with one dict per input, holding "label" and "score".
result = pipe(example_text)
print(f"TEXT: '{example_text}'")
print(f"RESULT: {result}")
```
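
Since the stated use case is filtering raw text by quality, the following is a minimal sketch of how that filtering step might look. The helper name `filter_good_quality` and the `min_score` cut-off are assumptions made for illustration and are not part of the model or of `transformers`. The only thing the sketch relies on is that the classifier returns one dict with `label` and `score` per input text, as the text-classification pipeline does.

```python
from typing import Callable, Dict, List, Sequence


def filter_good_quality(
    texts: Sequence[str],
    classify: Callable[[List[str]], List[Dict]],
    min_score: float = 0.5,  # assumed cut-off, tune it on your own data
) -> List[str]:
    """Hypothetical helper: keep texts predicted as LABEL_1 (good quality).

    `classify` is any callable that maps a list of texts to a list of
    {"label": ..., "score": ...} dicts, one per text and in the same order.
    """
    predictions = classify(list(texts))
    return [
        text
        for text, pred in zip(texts, predictions)
        if pred["label"] == "LABEL_1" and pred["score"] >= min_score
    ]
```

With the `pipe` object from the usage snippet, a call could look like `filter_good_quality(documents, pipe, min_score=0.9)`, where `documents` is your own list of strings. A higher `min_score` keeps fewer texts but with more confident predictions. The right value depends on the data, so it is worth checking a sample of kept and discarded texts by hand.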