metadata

language:
  - en
license: apache-2.0
base_model: microsoft/deberta-v3-small
tags:
  - text-classification
  - literary-analysis
  - content-moderation
  - explicitness-detection
  - deberta-v3
  - pytorch
  - focal-loss
pipeline_tag: text-classification
model-index:
  - name: deberta-v3-small-explicit-classifier-v2
    results:
      - task:
          type: text-classification
          name: Literary Explicitness Classification
        dataset:
          name: Custom Literary Dataset (Deduplicated)
          type: custom
        metrics:
          - type: accuracy
            value: 0.818
            name: Accuracy
          - type: f1
            value: 0.754
            name: Macro F1
          - type: f1
            value: 0.816
            name: Weighted F1
widget:
  - text: >-
      Content warning: This story contains mature themes including explicit
      sexual content and violence.
    example_title: Content Disclaimer
  - text: >-
      His hand lingered on hers as he helped her from the carriage, their
      fingers intertwining despite propriety.
    example_title: Suggestive Romance
  - text: >-
      She gasped as he traced kisses down her neck, his hands exploring the
      curves of her body with growing urgency.
    example_title: Explicit Sexual
  - text: >-
      The morning mist drifted across the Yorkshire moors as Elizabeth walked
      the familiar path to the village.
    example_title: Non-Explicit Literary

Literary Content Classifier - DeBERTa v3 Small (v2.0)

A fine-tuned DeBERTa-v3-small model that classifies literary text into 7 categories of explicitness. Compared with the original release, v2.0 adds focal loss training, a longer training run, and a deduplicated dataset.

🚀 Key Improvements in v2.0

+4.5 percentage points accuracy (81.8% vs 77.3%)
+6.4% relative macro F1 improvement (0.754 vs 0.709)
+21% relative improvement on violent content (F1: 0.581 vs 0.478)
+19% relative improvement on suggestive content (F1: 0.476 vs 0.400)
Focal loss training for better minority class performance
Clean dataset with cross-split contamination resolved
Extended training (4.79 epochs vs 1.1 epochs)
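
The gains listed above mix two conventions: the accuracy figure is an absolute difference in percentage points, while the F1 figures are relative changes. The short sketch below recomputes both forms from the v1 and v2 values quoted in the list. The `gains` helper is a hypothetical name introduced for illustration, and because the inputs are already rounded, the relative figures may differ from the quoted ones in the last digit.

```python
# (v1, v2) pairs copied from the improvement list above.
metrics = {
    "accuracy": (0.773, 0.818),
    "macro_f1": (0.709, 0.754),
    "violent_f1": (0.478, 0.581),
    "suggestive_f1": (0.400, 0.476),
}


def gains(old: float, new: float) -> tuple[float, float]:
    """Return (absolute gain in points, relative gain in percent)."""
    absolute = (new - old) * 100
    relative = (new - old) / old * 100
    return absolute, relative


for name, (old, new) in metrics.items():
    absolute, relative = gains(old, new)
    print(f"{name}: +{absolute:.1f} points, +{relative:.1f}% relative")
```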

Model Description

The model assigns each input text to one of 7 categories, supporting digital humanities, content curation, and literary research applications.

ID	Category	Description	F1 Score
0	EXPLICIT-DISCLAIMER	Content warnings and age restriction notices	0.977
1	EXPLICIT-OFFENSIVE	Profanity, crude language, offensive content	0.813
2	EXPLICIT-SEXUAL	Graphic sexual content and detailed intimate scenes	0.930
3	EXPLICIT-VIOLENT	Violent or disturbing content	0.581
4	NON-EXPLICIT	Clean, family-friendly content	0.851
5	SEXUAL-REFERENCE	Mentions of sexual topics without graphic description	0.652
6	SUGGESTIVE	Mild innuendo or romantic themes without explicit detail	0.476
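
As a minimal sketch, the table above can be written as a pair of lookup dictionaries for decoding raw model outputs. It assumes that the IDs in the table are the indices of the model's output logits; the authoritative mapping is whatever config.json in the repository declares. The `decode` helper is a hypothetical name used only for illustration.

```python
# ID-to-label mapping transcribed from the category table.
# Assumption: these IDs match the index order of the model's logits.
ID2LABEL = {
    0: "EXPLICIT-DISCLAIMER",
    1: "EXPLICIT-OFFENSIVE",
    2: "EXPLICIT-SEXUAL",
    3: "EXPLICIT-VIOLENT",
    4: "NON-EXPLICIT",
    5: "SEXUAL-REFERENCE",
    6: "SUGGESTIVE",
}
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}


def decode(logits: list[float]) -> str:
    """Return the label whose logit is largest (plain argmax)."""
    best = max(range(len(logits)), key=logits.__getitem__)
    return ID2LABEL[best]


# Made-up logits, for illustration only.
print(decode([0.1, 0.2, 3.0, 0.0, 0.5, 0.1, 0.2]))
```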

Performance Metrics

Overall Performance

Accuracy: 81.8%
Macro F1: 0.754
Weighted F1: 0.816

Detailed Results (Test Set)

                     precision    recall  f1-score   support
EXPLICIT-DISCLAIMER     0.95      1.00      0.98        19
EXPLICIT-OFFENSIVE      0.82      0.88      0.81       414
EXPLICIT-SEXUAL         0.93      0.91      0.93       514
EXPLICIT-VIOLENT        0.44      0.62      0.58        24
NON-EXPLICIT            0.77      0.87      0.85       683
SEXUAL-REFERENCE        0.63      0.73      0.65       212
SUGGESTIVE              0.37      0.46      0.48       134

            accuracy                        0.82      2000
           macro avg    0.65      0.78      0.75      2000
        weighted avg    0.75      0.82      0.82      2000
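
The two summary F1 figures differ only in how the per-class scores are averaged: macro F1 weights every class equally, while weighted F1 weights each class by its support. The sketch below recomputes both from the three-decimal per-class F1 scores in the Model Description table and the support counts in the report above. Since the inputs are rounded, the results should agree with the reported 0.754 and 0.816 only to within rounding.

```python
# (per-class F1, support) pairs: F1 from the category table,
# support from the classification report.
per_class = {
    "EXPLICIT-DISCLAIMER": (0.977, 19),
    "EXPLICIT-OFFENSIVE": (0.813, 414),
    "EXPLICIT-SEXUAL": (0.930, 514),
    "EXPLICIT-VIOLENT": (0.581, 24),
    "NON-EXPLICIT": (0.851, 683),
    "SEXUAL-REFERENCE": (0.652, 212),
    "SUGGESTIVE": (0.476, 134),
}

total = sum(support for _, support in per_class.values())

# Macro F1: unweighted mean, so small classes count as much as large ones.
macro_f1 = sum(f1 for f1, _ in per_class.values()) / len(per_class)

# Weighted F1: mean weighted by the number of test examples per class.
weighted_f1 = sum(f1 * support for f1, support in per_class.values()) / total

print(f"support total = {total}")
print(f"macro F1      = {macro_f1:.3f}")
print(f"weighted F1   = {weighted_f1:.3f}")
```

Macro F1 is the lower of the two because the weakest classes (EXPLICIT-VIOLENT and SUGGESTIVE) are also among the smallest, so they pull down the unweighted mean more than the support-weighted one.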

Training Details

Model Architecture

Base Model: microsoft/deberta-v3-small
Parameters: 141.9M (6 layers, 768 hidden, 12 attention heads)
Vocabulary: 128,100 tokens
Max Sequence Length: 512 tokens

Training Configuration

Training Method: Focal Loss (γ=2.0) for class imbalance
Epochs: 4.79 (early stopped)
Learning Rate: 5e-5 with cosine schedule
Batch Size: 16 (effective 32 with gradient accumulation)
Warmup Steps: 1,000
Weight Decay: 0.01
Early Stopping: Patience 5 on macro F1
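
To illustrate why focal loss helps with class imbalance, the sketch below computes the standard per-example focal loss, FL = -(1 - p_t)^γ · log(p_t), in plain Python with γ=2.0 as listed above. This is not the training code used for this model, which is not shown in this card and would normally operate on batched tensors. The function names and example logits are hypothetical.

```python
import math


def softmax(logits: list[float]) -> list[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    norm = sum(exps)
    return [e / norm for e in exps]


def focal_loss(logits: list[float], target: int, gamma: float = 2.0) -> float:
    """Focal loss for one example: -(1 - p_t)^gamma * log(p_t)."""
    p_t = max(softmax(logits)[target], 1e-12)  # clamp to avoid log(0)
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def cross_entropy(logits: list[float], target: int) -> float:
    """Ordinary cross-entropy, which is focal loss with gamma = 0."""
    return focal_loss(logits, target, gamma=0.0)


# Made-up logits: one confidently correct example, one misclassified example.
easy = [4.0, 0.0, 0.0]  # true class 0 has high probability
hard = [0.0, 2.0, 0.0]  # true class 0 has low probability

for name, logits in (("easy", easy), ("hard", hard)):
    ce = cross_entropy(logits, 0)
    fl = focal_loss(logits, 0)
    print(f"{name}: CE={ce:.4f}  focal={fl:.4f}  ratio={fl / ce:.4f}")
```

The ratio shows the effect: the well-classified example keeps only a small fraction of its cross-entropy loss, while the misclassified one keeps most of it, so hard examples (often from minority classes) dominate the gradient.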

Dataset

Total Samples: 119,023 (after deduplication)
Training: 83,316 samples
Validation: 17,853 samples
Test: 17,854 samples
Data Quality: Cross-split contamination eliminated (2,127 duplicates removed)
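
A quick check of the split sizes, as a sketch using only the counts above and the effective batch size of 32 from the Training Configuration section. The steps-per-epoch figure assumes the final partial batch is kept rather than dropped, which this card does not specify.

```python
import math

# Split sizes as listed above.
splits = {"train": 83_316, "validation": 17_853, "test": 17_854}
total = sum(splits.values())

# Share of the deduplicated dataset in each split.
fractions = {name: count / total for name, count in splits.items()}

# Optimizer steps per epoch at the effective batch size of 32.
# Assumption: the last, smaller batch is not dropped.
effective_batch_size = 32
steps_per_epoch = math.ceil(splits["train"] / effective_batch_size)

print(f"total samples: {total}")
for name, share in fractions.items():
    print(f"{name}: {share:.1%}")
print(f"optimizer steps per epoch: {steps_per_epoch}")
```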

Training Environment

Framework: PyTorch + Transformers
Hardware: Apple Silicon (MPS)
Training Time: ~13.7 hours

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

# Load model and tokenizer (replace the placeholder with the actual repository ID)
model_id = "your-username/deberta-v3-small-explicit-classifier-v2"
model = AutoModelForSequenceClassification.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Create classification pipeline; truncation=True truncates over-long inputs
classifier = pipeline(
    "text-classification",
    model=model,
    tokenizer=tokenizer,
    truncation=True,
)

# Single classification; top_k=None returns a score for every class
# (it replaces the deprecated return_all_scores=True)
text = "His hand lingered on hers as he helped her from the carriage."
result = classifier(text, top_k=None)

# Some transformers versions wrap a single input's scores in an outer list
scores = result[0] if isinstance(result[0], list) else result

# Top prediction
top = max(scores, key=lambda s: s["score"])
print(f"Top prediction: {top['label']} ({top['score']:.3f})")

# All class probabilities
for class_result in scores:
    print(f"{class_result['label']}: {class_result['score']:.3f}")

Recommended Thresholds (F1-Optimized)

For applications requiring specific precision/recall trade-offs:

Class	Optimal Threshold	Precision	Recall	F1
EXPLICIT-DISCLAIMER	0.995	0.950	1.000	0.974
EXPLICIT-OFFENSIVE	0.626	0.819	0.829	0.824
EXPLICIT-SEXUAL	0.456	0.927	0.911	0.919
EXPLICIT-VIOLENT	0.105	0.441	0.625	0.517
NON-EXPLICIT	0.103	0.768	0.874	0.818
SEXUAL-REFERENCE	0.355	0.629	0.726	0.674
SUGGESTIVE	0.530	0.370	0.455	0.408
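
One way to apply the thresholds above is sketched below, under two assumptions that this card does not state: each threshold is applied to its own class independently (one-vs-rest), and when no class clears its threshold the highest-scoring class is returned. The function name and the example scores are hypothetical.

```python
# Per-class thresholds transcribed from the table above.
THRESHOLDS = {
    "EXPLICIT-DISCLAIMER": 0.995,
    "EXPLICIT-OFFENSIVE": 0.626,
    "EXPLICIT-SEXUAL": 0.456,
    "EXPLICIT-VIOLENT": 0.105,
    "NON-EXPLICIT": 0.103,
    "SEXUAL-REFERENCE": 0.355,
    "SUGGESTIVE": 0.530,
}


def labels_above_threshold(scores: dict[str, float]) -> list[str]:
    """Return labels whose score meets its threshold, highest score first.

    Falls back to the single top-scoring label when nothing qualifies
    (an illustrative choice, not something this card prescribes).
    """
    hits = [label for label, score in scores.items()
            if score >= THRESHOLDS[label]]
    if hits:
        return sorted(hits, key=scores.get, reverse=True)
    return [max(scores, key=scores.get)]


# Made-up class probabilities for one passage.
example = {
    "EXPLICIT-DISCLAIMER": 0.01,
    "EXPLICIT-OFFENSIVE": 0.05,
    "EXPLICIT-SEXUAL": 0.04,
    "EXPLICIT-VIOLENT": 0.15,
    "NON-EXPLICIT": 0.60,
    "SEXUAL-REFERENCE": 0.05,
    "SUGGESTIVE": 0.10,
}
print(labels_above_threshold(example))
```

Because the EXPLICIT-VIOLENT and NON-EXPLICIT thresholds are low, independent thresholding can flag more than one class for the same passage; callers that need a single label may prefer plain argmax.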

Model Files

model.safetensors: Model weights in SafeTensors format
config.json: Model configuration with proper label mappings
tokenizer.json, spm.model: SentencePiece tokenizer files
label_mapping.json: Label ID to name mapping reference

Limitations & Considerations

Challenging Distinctions: SUGGESTIVE vs SEXUAL-REFERENCE categories remain difficult to distinguish due to conceptual overlap
Minority Classes: EXPLICIT-VIOLENT and SUGGESTIVE classes have lower F1 scores due to limited training data
Context Dependency: Short text snippets may lack sufficient context for accurate classification
Domain Specificity: Optimized for literary and review content; performance may vary on other text types
Language: English text only

Evaluation Artifacts

The repository includes the following evaluation materials:

Confusion matrix visualization
Per-class precision-recall curves
ROC curves for all categories
Calibration analysis
Recommended decision thresholds

Ethical Use

This model is designed for:

Academic research and digital humanities
Content curation and library science applications
Literary analysis and publishing workflows
Educational content assessment

Important: Content moderation decisions based on this model's output should remain under human oversight.

Citation

@misc{literary-explicit-classifier-v2-2025,
  title={Literary Content Analysis: Improved Multi-Class Classification with Focal Loss},
  author={Explicit Content Research Team},
  year={2025},
  note={DeBERTa-v3-small fine-tuned for literary explicitness detection}
}

License

This model is released under the Apache 2.0 license.

Mitchins/deberta-v3-small-literary-explicitness