language:
  - en
license: apache-2.0
base_model: microsoft/deberta-v3-small
tags:
  - text-classification
  - literary-analysis
  - content-moderation
  - explicitness-detection
  - deberta-v3
  - pytorch
  - focal-loss
pipeline_tag: text-classification
model-index:
  - name: deberta-v3-small-explicit-classifier-v2
    results:
      - task:
          type: text-classification
          name: Literary Explicitness Classification
        dataset:
          name: Custom Literary Dataset (Deduplicated)
          type: custom
        metrics:
          - type: accuracy
            value: 0.818
            name: Accuracy
          - type: f1
            value: 0.754
            name: Macro F1
          - type: f1
            value: 0.816
            name: Weighted F1
widget:
  - text: >-
      Content warning: This story contains mature themes including explicit
      sexual content and violence.
    example_title: Content Disclaimer
  - text: >-
      His hand lingered on hers as he helped her from the carriage, their
      fingers intertwining despite propriety.
    example_title: Suggestive Romance
  - text: >-
      She gasped as he traced kisses down her neck, his hands exploring the
      curves of her body with growing urgency.
    example_title: Explicit Sexual
  - text: >-
      The morning mist drifted across the Yorkshire moors as Elizabeth walked
      the familiar path to the village.
    example_title: Non-Explicit Literary

Literary Content Classifier - DeBERTa v3 Small (v2.0)

A fine-tuned DeBERTa-v3-small model that classifies literary text into 7 categories of explicitness. Version 2.0 improves on the original model through focal loss training, longer training (4.79 vs 1.1 epochs), and a deduplicated dataset.

🚀 Key Improvements in v2.0

  • +4.5 percentage points accuracy (81.8% vs 77.3%)
  • +6.4% relative macro F1 improvement (0.754 vs 0.709)
  • +21% improvement on violent content (F1: 0.581 vs 0.478)
  • +19% improvement on suggestive content (F1: 0.476 vs 0.400)
  • Focal loss training for better minority class performance
  • Clean dataset with cross-split contamination resolved
  • Extended training (4.79 epochs vs 1.1 epochs)
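
As a quick arithmetic check of the figures in the list above, the following sketch computes both the absolute difference and the relative change for each quoted pair. The dictionary keys and the `gains` helper are illustrative names, not part of the model repository.

```python
# (v2.0, original) pairs copied from the list above.
pairs = {
    "accuracy": (0.818, 0.773),
    "macro_f1": (0.754, 0.709),
    "violent_f1": (0.581, 0.478),
    "suggestive_f1": (0.476, 0.400),
}


def gains(new, old):
    """Return (absolute difference, relative change) of new over old."""
    absolute = new - old
    return absolute, absolute / old


for name, (new, old) in pairs.items():
    absolute, relative = gains(new, old)
    print(f"{name}: {absolute:+.3f} absolute, {relative:+.1%} relative")
```

Read this way, the accuracy gain is 4.5 points in absolute terms (about 5.8% relative), while the violent and suggestive figures are relative gains in F1.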

Model Description

This model assigns each input text to one of 7 explicitness categories, supporting digital humanities, content curation, and literary research applications.

Categories

| ID | Category | Description | F1 Score |
|----|----------|-------------|----------|
| 0 | EXPLICIT-DISCLAIMER | Content warnings and age restriction notices | 0.977 |
| 1 | EXPLICIT-OFFENSIVE | Profanity, crude language, offensive content | 0.813 |
| 2 | EXPLICIT-SEXUAL | Graphic sexual content and detailed intimate scenes | 0.930 |
| 3 | EXPLICIT-VIOLENT | Violent or disturbing content | 0.581 |
| 4 | NON-EXPLICIT | Clean, family-friendly content | 0.851 |
| 5 | SEXUAL-REFERENCE | Mentions of sexual topics without graphic description | 0.652 |
| 6 | SUGGESTIVE | Mild innuendo or romantic themes without explicit detail | 0.476 |

Performance Metrics

Overall Performance

  • Accuracy: 81.8%
  • Macro F1: 0.754
  • Weighted F1: 0.816

Detailed Results (Test Set)

                     precision    recall  f1-score   support
EXPLICIT-DISCLAIMER     0.95      1.00      0.98        19
EXPLICIT-OFFENSIVE      0.82      0.88      0.81       414
EXPLICIT-SEXUAL         0.93      0.91      0.93       514
EXPLICIT-VIOLENT        0.44      0.62      0.58        24
NON-EXPLICIT            0.77      0.87      0.85       683
SEXUAL-REFERENCE        0.63      0.73      0.65       212
SUGGESTIVE              0.37      0.46      0.48       134

            accuracy                        0.82      2000
           macro avg    0.65      0.78      0.75      2000
        weighted avg    0.75      0.82      0.82      2000
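
The difference between the macro and weighted F1 figures can be reproduced with a short calculation. The sketch below takes the per-class F1 scores from the Categories table and the support counts from the report above; the function names are illustrative. Macro F1 is the unweighted mean over classes, so the two small, low-scoring classes pull it down, while weighted F1 weights each class by its support.

```python
# (F1 from the Categories table, support from the report above)
per_class = {
    "EXPLICIT-DISCLAIMER": (0.977, 19),
    "EXPLICIT-OFFENSIVE": (0.813, 414),
    "EXPLICIT-SEXUAL": (0.930, 514),
    "EXPLICIT-VIOLENT": (0.581, 24),
    "NON-EXPLICIT": (0.851, 683),
    "SEXUAL-REFERENCE": (0.652, 212),
    "SUGGESTIVE": (0.476, 134),
}


def macro_f1(stats):
    """Unweighted mean of per-class F1: every class counts equally."""
    return sum(f1 for f1, _ in stats.values()) / len(stats)


def weighted_f1(stats):
    """Support-weighted mean of per-class F1: large classes dominate."""
    total = sum(support for _, support in stats.values())
    return sum(f1 * support for f1, support in stats.values()) / total


print(f"macro F1:    {macro_f1(per_class):.3f}")
print(f"weighted F1: {weighted_f1(per_class):.3f}")
```

Because the inputs are already rounded to three decimals, the weighted result may differ from the reported 0.816 in the last digit.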

Training Details

Model Architecture

  • Base Model: microsoft/deberta-v3-small
  • Parameters: 141.9M (6 layers, 768 hidden, 12 attention heads)
  • Vocabulary: 128,100 tokens
  • Max Sequence Length: 512 tokens

Training Configuration

  • Training Method: Focal Loss (γ=2.0) for class imbalance
  • Epochs: 4.79 (early stopped)
  • Learning Rate: 5e-5 with cosine schedule
  • Batch Size: 16 (effective 32 with gradient accumulation)
  • Warmup Steps: 1,000
  • Weight Decay: 0.01
  • Early Stopping: Patience 5 on macro F1
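
To illustrate why focal loss helps with minority classes, here is a minimal, framework-free sketch of the per-example loss, FL = -(1 - p_t)^γ · log(p_t), with γ=2.0 as listed above. It is not the training code used for this model: the actual training ran in PyTorch, and any per-class weighting (often called alpha) is omitted because the configuration above does not mention one. The function names and example logits are illustrative.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0):
    """Focal loss for one example: cross-entropy scaled by (1 - p_t) ** gamma."""
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def cross_entropy(logits, target):
    """Plain cross-entropy, which is focal loss with gamma = 0."""
    return focal_loss(logits, target, gamma=0.0)


# Seven logits, one per category; the true class is index 0 in both cases.
easy = [4.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # confident and correct
hard = [0.0, 4.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # confident and wrong

for name, logits in (("easy", easy), ("hard", hard)):
    ce = cross_entropy(logits, 0)
    fl = focal_loss(logits, 0)
    print(f"{name}: cross-entropy={ce:.4f}, focal={fl:.4f}")
```

The easy example's loss shrinks by roughly two orders of magnitude, while the hard example's loss is almost unchanged, so gradient updates concentrate on examples the model still gets wrong.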

Dataset

  • Total Samples: 119,023 (after deduplication)
  • Training: 83,316 samples
  • Validation: 17,853 samples
  • Test: 17,854 samples
  • Data Quality: Cross-split contamination eliminated (2,127 duplicates removed)
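
The card does not describe how duplicates were detected, so the following is only a sketch of one plausible approach: fingerprint each text after lowercasing and collapsing whitespace, then drop any text whose fingerprint has already been seen. The normalization rule, the train-then-validation-then-test priority, and all function names are assumptions made for illustration.

```python
import hashlib


def fingerprint(text):
    """SHA-256 of the lowercased, whitespace-collapsed text (assumed duplicate criterion)."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def remove_cross_split_duplicates(train, validation, test):
    """Drop texts whose fingerprint already appeared earlier.

    Splits are processed in the order train, validation, test (an assumed
    priority), so a text shared between train and test stays in train only.
    Repeats within a single split are dropped as well.
    Returns ([train, validation, test], number_removed).
    """
    seen = set()
    cleaned = []
    removed = 0
    for split in (train, validation, test):
        kept = []
        for text in split:
            key = fingerprint(text)
            if key in seen:
                removed += 1
            else:
                seen.add(key)
                kept.append(text)
        cleaned.append(kept)
    return cleaned, removed


splits, removed = remove_cross_split_duplicates(
    ["The moors were quiet.", "A second passage."],
    ["the  moors were quiet.", "A validation passage."],
    ["A second passage.", "A test passage."],
)
print(removed, [len(s) for s in splits])
```

Exact-match fingerprints miss near-duplicates such as lightly edited passages; catching those would need a fuzzy method, which is outside the scope of this sketch.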

Training Environment

  • Framework: PyTorch + Transformers
  • Hardware: Apple Silicon (MPS)
  • Training Time: ~13.7 hours

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

# Load model and tokenizer (replace the placeholder with the actual repository ID)
model_id = "your-username/deberta-v3-small-explicit-classifier-v2"
model = AutoModelForSequenceClassification.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Create classification pipeline; top_k=None returns scores for all classes
# (it replaces the deprecated return_all_scores=True)
classifier = pipeline(
    "text-classification",
    model=model,
    tokenizer=tokenizer,
    top_k=None,
    truncation=True,
)

# Single classification
text = "His hand lingered on hers as he helped her from the carriage."
result = classifier(text)
# Some transformers versions wrap the scores for a single string in an outer list
scores = result[0] if isinstance(result[0], list) else result

# Top prediction: the class with the highest score
top = max(scores, key=lambda s: s["score"])
print(f"Top prediction: {top['label']} ({top['score']:.3f})")

# All class probabilities
for class_result in scores:
    print(f"{class_result['label']}: {class_result['score']:.3f}")

Recommended Thresholds (F1-Optimized)

For applications requiring specific precision/recall trade-offs:

| Class | Optimal Threshold | Precision | Recall | F1 |
|-------|-------------------|-----------|--------|-----|
| EXPLICIT-DISCLAIMER | 0.995 | 0.950 | 1.000 | 0.974 |
| EXPLICIT-OFFENSIVE | 0.626 | 0.819 | 0.829 | 0.824 |
| EXPLICIT-SEXUAL | 0.456 | 0.927 | 0.911 | 0.919 |
| EXPLICIT-VIOLENT | 0.105 | 0.441 | 0.625 | 0.517 |
| NON-EXPLICIT | 0.103 | 0.768 | 0.874 | 0.818 |
| SEXUAL-REFERENCE | 0.355 | 0.629 | 0.726 | 0.674 |
| SUGGESTIVE | 0.530 | 0.370 | 0.455 | 0.408 |
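
One way to use these thresholds is to treat each class as its own yes/no decision and flag every class whose probability reaches its threshold. The sketch below assumes the thresholds apply one-vs-rest to the per-class probabilities returned by the pipeline. The card does not spell this out, so treat it as an interpretation. The function name and the example scores are made up for illustration.

```python
# Thresholds copied from the table above.
THRESHOLDS = {
    "EXPLICIT-DISCLAIMER": 0.995,
    "EXPLICIT-OFFENSIVE": 0.626,
    "EXPLICIT-SEXUAL": 0.456,
    "EXPLICIT-VIOLENT": 0.105,
    "NON-EXPLICIT": 0.103,
    "SEXUAL-REFERENCE": 0.355,
    "SUGGESTIVE": 0.530,
}


def classes_above_threshold(scores, thresholds=THRESHOLDS):
    """Return labels whose score meets its class threshold, highest score first.

    `scores` maps label -> probability, e.g. built from the pipeline output
    with {item["label"]: item["score"] for item in pipeline_scores}.
    """
    flagged = [
        (label, score)
        for label, score in scores.items()
        if score >= thresholds[label]
    ]
    flagged.sort(key=lambda pair: pair[1], reverse=True)
    return [label for label, _ in flagged]


# Hypothetical scores for one passage (not real model output).
example_scores = {
    "EXPLICIT-DISCLAIMER": 0.01,
    "EXPLICIT-OFFENSIVE": 0.05,
    "EXPLICIT-SEXUAL": 0.02,
    "EXPLICIT-VIOLENT": 0.12,
    "NON-EXPLICIT": 0.60,
    "SEXUAL-REFERENCE": 0.10,
    "SUGGESTIVE": 0.10,
}
print(classes_above_threshold(example_scores))
```

With thresholds applied independently, a passage can be flagged for several classes at once, or for none; an application has to decide how to handle both cases, for example by falling back to the top-scoring class or routing the passage to human review.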

Model Files

  • model.safetensors: Model weights in SafeTensors format
  • config.json: Model configuration with proper label mappings
  • tokenizer.json, spm.model: SentencePiece tokenizer files
  • label_mapping.json: Label ID to name mapping reference

Limitations & Considerations

  1. Challenging Distinctions: SUGGESTIVE vs SEXUAL-REFERENCE categories remain difficult to distinguish due to conceptual overlap
  2. Minority Classes: EXPLICIT-VIOLENT and SUGGESTIVE classes have lower F1 scores due to limited training data
  3. Context Dependency: Short text snippets may lack sufficient context for accurate classification
  4. Domain Specificity: Optimized for literary and review content; performance may vary on other text types
  5. Language: English text only

Evaluation Artifacts

The release includes the following evaluation materials:

  • Confusion matrix visualization
  • Per-class precision-recall curves
  • ROC curves for all categories
  • Calibration analysis
  • Recommended decision thresholds

Ethical Use

This model is designed for:

  • Academic research and digital humanities
  • Content curation and library science applications
  • Literary analysis and publishing workflows
  • Educational content assessment

Important: This model should be used responsibly with human oversight for content moderation decisions.

Citation

@misc{literary-explicit-classifier-v2-2025,
  title={Literary Content Analysis: Improved Multi-Class Classification with Focal Loss},
  author={Explicit Content Research Team},
  year={2025},
  note={DeBERTa-v3-small fine-tuned for literary explicitness detection}
}

License

This model is released under the Apache 2.0 license.