Ukrainian/Russian Manipulation Detector - XLM-RoBERTa

Model Description

This model detects propaganda and manipulation techniques in Ukrainian and Russian text. It is a fine-tuned version of FacebookAI/xlm-roberta-large trained on a bilingual subset of the UNLP 2025 Shared Task dataset for multi-label classification of manipulation techniques.

Because XLM-RoBERTa is pretrained on multilingual text, the model can process both Ukrainian and Russian input, including code-mixed text that switches between the two languages.

Task: Manipulation Technique Classification

The model performs multi-label text classification, identifying 5 major manipulation categories. A single text can contain multiple techniques.

Manipulation Categories

  1. Loaded Language: The use of words and phrases with a strong emotional connotation (positive or negative) to influence the audience.
  2. Glittering Generalities: Exploitation of people's positive attitude towards abstract concepts such as “justice,” “freedom,” “democracy,” “patriotism,” “peace,” “happiness,” “love,” “truth,” “order,” etc. These words and phrases are intended to provoke strong emotional reactions and feelings of solidarity without providing specific information or arguments.
  3. Euphoria: Using an event that causes euphoria or happiness, or another positive event, to boost morale. This technique is often used to mobilize the population.
  4. Appeal to Fear: The misuse of fear (often based on stereotypes or prejudices) to support a particular proposal.
  5. FUD (Fear, Uncertainty, Doubt): Presenting information in a way that sows uncertainty and doubt, causing fear. This technique is a subtype of the appeal to fear.
  6. Bandwagon/Appeal to People: An attempt to persuade the audience to join and take action because “others are doing the same thing.”
  7. Thought-Terminating Cliché: Commonly used phrases that mitigate cognitive dissonance and block critical thinking.
  8. Whataboutism: Discrediting the opponent's position by accusing them of hypocrisy without directly refuting their arguments.
  9. Cherry Picking: Selective use of data or facts that support a hypothesis while ignoring counterarguments.
  10. Straw Man: Distorting the opponent's position by replacing it with a weaker or outwardly similar one and refuting it instead.

Training Data

The model was trained on the dataset from the UNLP 2025 Shared Task on manipulation technique classification.

Training Configuration

The model was fine-tuned using the following hyperparameters:

Parameter            Value
-------------------  -------------------------------------
Base Model           FacebookAI/xlm-roberta-large
Learning Rate        2e-5
Train Batch Size     16
Eval Batch Size      32
Epochs               10
Max Sequence Length  512
Optimizer            AdamW
Loss Function        BCEWithLogitsLoss (with class weights)
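
The card does not say how the class weights were derived. The following is a minimal sketch of one common choice, assuming the weights are per-label positive weights equal to the ratio of negative to positive examples. The function name compute_pos_weights and the toy label matrix are hypothetical and are not taken from the training code.

```python
from typing import List


def compute_pos_weights(label_matrix: List[List[int]]) -> List[float]:
    """Return one weight per label: (#negatives / #positives).

    label_matrix holds one row per example and one 0/1 column per label.
    A label with no positive examples gets 1.0 to avoid division by zero.
    """
    if not label_matrix:
        raise ValueError("label_matrix must contain at least one example")
    num_examples = len(label_matrix)
    num_labels = len(label_matrix[0])
    weights = []
    for j in range(num_labels):
        positives = sum(row[j] for row in label_matrix)
        negatives = num_examples - positives
        weights.append(negatives / positives if positives else 1.0)
    return weights


# Toy multi-hot labels for 4 examples and 3 labels (illustrative only).
toy_labels = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
    [1, 0, 1],
]
pos_weights = compute_pos_weights(toy_labels)
print(pos_weights)
```

In PyTorch, a list like this could be converted to a tensor and passed as the pos_weight argument of torch.nn.BCEWithLogitsLoss, which scales the loss on positive targets for each label. Rare labels then contribute more to the loss than frequent ones.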

Usage

Installation

First, install the necessary libraries:

pip install transformers torch sentencepiece

Quick Start

Here is how to use the model to classify a single piece of text:

from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Define model and label names
model_name = "olehmell/ukr-rus-manipulation-detector-xlm-roberta" # Hypothetical model name
labels = [
    'emotional_manipulation', 
    'fear_appeals', 
    'bandwagon_effect', 
    'selective_truth', 
    'cliche'
]

# Load pretrained model and tokenizer
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Prepare text (can be Ukrainian or Russian)
text = "Все эксперты уже давно это подтвердили, только вы не понимаете, что происходит на самом деле."

# Tokenize and predict
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.sigmoid(outputs.logits)

# Get detected techniques
threshold = 0.5
detected_techniques = {}
for i, score in enumerate(predictions[0]):
    if score > threshold:
        detected_techniques[labels[i]] = f"{score:.2f}"

if detected_techniques:
    print("Detected techniques:")
    for technique, score in detected_techniques.items():
        print(f"- {technique} (Score: {score})")
else:
    print("No manipulation techniques detected.")
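
The thresholding step above silently assumes that the label list has exactly as many entries as the model has outputs. The following sketch wraps that step in a small helper that checks the lengths and sorts the result. The helper name select_labels and the example scores are hypothetical; the scores are made-up values, not real model output.

```python
from typing import Dict, Sequence


def select_labels(
    scores: Sequence[float],
    labels: Sequence[str],
    threshold: float = 0.5,
) -> Dict[str, float]:
    """Return the labels whose score exceeds the threshold, strongest first."""
    if len(scores) != len(labels):
        raise ValueError(
            f"got {len(scores)} scores for {len(labels)} labels; "
            "the label list must match the model's output size"
        )
    selected = {
        label: float(score)
        for label, score in zip(labels, scores)
        if score > threshold
    }
    # Sort by descending score so the strongest signal is listed first.
    return dict(sorted(selected.items(), key=lambda item: item[1], reverse=True))


labels = [
    "emotional_manipulation",
    "fear_appeals",
    "bandwagon_effect",
    "selective_truth",
    "cliche",
]
example_scores = [0.12, 0.81, 0.64, 0.30, 0.05]  # made-up values
detected = select_labels(example_scores, labels)
print(detected)
```

With the Quick Start code, the sigmoid output could be passed in as predictions[0].tolist(). A lower threshold trades precision for recall, so it may be worth tuning on held-out data.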

Performance

The model's results on the evaluation set are listed below. Values marked TBD have not been reported yet.

Metric        Value
------------  -----
F1 Macro      0.44
F1 Micro      TBD
Hamming Loss  TBD
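
For readers unfamiliar with these metrics, the following sketch computes them for multi-hot label matrices in plain Python. It assumes the standard definitions and a per-label F1 of 0 when a label has no true or predicted positives. The card does not state how its own numbers were computed, and the function name multilabel_metrics and the toy data are hypothetical.

```python
from typing import Dict, List, Tuple


def _counts(y_true: List[List[int]], y_pred: List[List[int]], j: int) -> Tuple[int, int, int]:
    """True positives, false positives and false negatives for label j."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
    return tp, fp, fn


def _f1(tp: int, fp: int, fn: int) -> float:
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def multilabel_metrics(y_true: List[List[int]], y_pred: List[List[int]]) -> Dict[str, float]:
    num_labels = len(y_true[0])
    per_label = [_counts(y_true, y_pred, j) for j in range(num_labels)]
    # Macro F1: unweighted mean of the per-label F1 scores.
    macro = sum(_f1(*c) for c in per_label) / num_labels
    # Micro F1: pool the counts over all labels, then compute a single F1.
    tp, fp, fn = (sum(c[k] for c in per_label) for k in range(3))
    micro = _f1(tp, fp, fn)
    # Hamming loss: fraction of individual label decisions that are wrong.
    wrong = sum(t != p for rt, rp in zip(y_true, y_pred) for t, p in zip(rt, rp))
    hamming = wrong / (len(y_true) * num_labels)
    return {"f1_macro": macro, "f1_micro": micro, "hamming_loss": hamming}


# Toy data: 4 examples, 3 labels (illustrative only).
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 1, 1]]
metrics = multilabel_metrics(y_true, y_pred)
print(metrics)
```

Macro F1 weights every label equally, so it is sensitive to poor performance on rare techniques. Micro F1 is dominated by frequent labels. In practice, scikit-learn's f1_score and hamming_loss provide the same quantities.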

Limitations

  • Language Specificity: The model is optimized for Ukrainian and Russian. Performance on other languages is not guaranteed.
  • Domain Sensitivity: Trained primarily on political and social media discourse, its performance may vary on other text domains (e.g., scientific, literary).
  • Context Length: The model is limited to texts up to 512 tokens. Longer documents must be chunked or truncated.
  • Class Imbalance: Some manipulation techniques are underrepresented in the training data, which may affect their detection accuracy.
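
For the context-length limitation above, one way to chunk a long document is a sliding window over token ids, followed by pooling the per-chunk scores. The sketch below assumes a window of 510 ids, which leaves room for the two special tokens added to each chunk, and takes the per-label maximum across chunks. Both choices are assumptions, and the function names are hypothetical. The card does not prescribe a chunking strategy.

```python
from typing import List


def chunk_token_ids(
    token_ids: List[int],
    max_len: int = 510,
    stride: int = 128,
) -> List[List[int]]:
    """Split token ids into windows of at most max_len ids.

    Consecutive windows overlap by `stride` ids so that a technique
    spanning a chunk boundary is still seen whole in one window.
    """
    if not 0 <= stride < max_len:
        raise ValueError("stride must satisfy 0 <= stride < max_len")
    step = max_len - stride
    chunks = []
    start = 0
    while True:
        chunks.append(token_ids[start:start + max_len])
        if start + max_len >= len(token_ids):
            break
        start += step
    return chunks


def aggregate_max(chunk_scores: List[List[float]]) -> List[float]:
    """Per-label maximum over chunks: a label fires if any chunk fires."""
    return [max(column) for column in zip(*chunk_scores)]


# Toy example with small numbers so the windows are easy to read.
windows = chunk_token_ids(list(range(10)), max_len=4, stride=1)
print(windows)

# Made-up per-chunk probabilities for two labels.
document_scores = aggregate_max([[0.1, 0.9], [0.7, 0.2]])
print(document_scores)
```

Fast tokenizers in the transformers library can also produce overlapping windows directly through the return_overflowing_tokens and stride arguments. Max pooling favors recall, while averaging the chunk scores would be a more conservative alternative.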

Ethical Considerations

  • Purpose: This model is intended as a tool to support media literacy and critical thinking, not as an arbiter of truth.
  • Human Oversight: Model outputs should be interpreted with human judgment and a full understanding of the context. It should not be used to automatically censor content.
  • Potential Biases: The model may reflect biases present in the training data.

Citation

If you use this model in your research, please cite the following:

@misc{ukrainian-russian-manipulation-xlm-roberta-2025,
  author = {Oleh Mell},
  title = {Ukrainian/Russian Manipulation Detector - XLM-RoBERTa},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/olehmell/ukr-rus-manipulation-detector-xlm-roberta}
}
@inproceedings{unlp2025shared,
  title={UNLP 2025 Shared Task on Techniques Classification},
  author={UNLP Workshop Organizers},
  booktitle={UNLP 2025 Workshop},
  year={2025},
  url={https://github.com/unlp-workshop/unlp-2025-shared-task}
}

License

This model is licensed under the Apache 2.0 License.

Acknowledgments

  • The organizers of the UNLP 2025 Workshop for providing the dataset.