Ukrainian/Russian Manipulation Detector - XLM-RoBERTa
Model Description
This model detects propaganda and manipulation techniques in Ukrainian and Russian text. It is a fine-tuned version of FacebookAI/xlm-roberta-large trained on a bilingual subset of the UNLP 2025 Shared Task dataset for multi-label classification of manipulation techniques.
Because the base model is pretrained on multilingual data, the classifier handles both Ukrainian and Russian input, including code-mixed text.
Task: Manipulation Technique Classification
The model performs multi-label text classification, identifying 5 major manipulation categories. A single text can contain multiple techniques.
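To make the multi-label setup concrete, the sketch below scores each label independently with a sigmoid rather than normalizing across labels with a softmax, so any number of labels can pass the threshold for one text. The logits, the placeholder label names, and the 0.5 threshold are illustrative assumptions, not outputs of this model.

```python
import math

def sigmoid(x: float) -> float:
    """Map a raw logit to an independent probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def decode_multilabel(logits, labels, threshold=0.5):
    """Return every label whose sigmoid score exceeds the threshold."""
    scores = {label: sigmoid(z) for label, z in zip(labels, logits)}
    return {label: round(p, 2) for label, p in scores.items() if p > threshold}

# Hypothetical logits for one text; the label names are placeholders.
labels = ["label_a", "label_b", "label_c"]
logits = [2.0, -1.5, 0.3]

detected = decode_multilabel(logits, labels)
print(detected)
```

Two of the three placeholder labels clear the threshold here, which a softmax-based single-label classifier could not express.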
Manipulation Categories
- Loaded Language: The use of words and phrases with a strong emotional connotation (positive or negative) to influence the audience.
- Glittering Generalities: Exploitation of people's positive attitude towards abstract concepts such as “justice,” “freedom,” “democracy,” “patriotism,” “peace,” “happiness,” “love,” “truth,” “order,” etc. These words and phrases are intended to provoke strong emotional reactions and feelings of solidarity without providing specific information or arguments.
- Euphoria: Using an event that causes euphoria or happiness, or any positive event, to boost morale. This technique is often used to mobilize the population.
- Appeal to Fear: The misuse of fear (often based on stereotypes or prejudices) to support a particular proposal.
- FUD (Fear, Uncertainty, Doubt): Presenting information in a way that sows uncertainty and doubt, causing fear. This technique is a subtype of the appeal to fear.
- Bandwagon/Appeal to People: An attempt to persuade the audience to join and take action because “others are doing the same thing.”
- Thought-Terminating Cliché: Commonly used phrases that mitigate cognitive dissonance and block critical thinking.
- Whataboutism: Discrediting the opponent's position by accusing them of hypocrisy without directly refuting their arguments.
- Cherry Picking: Selective use of data or facts that support a hypothesis while ignoring counterarguments.
- Straw Man: Distorting the opponent's position by replacing it with a weaker or outwardly similar one and refuting it instead.
Training Data
The model was trained on the dataset from the UNLP 2025 Shared Task on manipulation technique classification.
- Dataset: UNLP 2025 Techniques Classification
- Source Texts: Ukrainian and Russian texts from a larger multilingual dataset.
- Task: Multi-label classification.
Training Configuration
The model was fine-tuned using the following hyperparameters:
| Parameter | Value |
|---|---|
| Base Model | FacebookAI/xlm-roberta-large |
| Learning Rate | 2e-5 |
| Train Batch Size | 16 |
| Eval Batch Size | 32 |
| Epochs | 10 |
| Max Sequence Length | 512 |
| Optimizer | AdamW |
| Loss Function | BCEWithLogitsLoss (with class weights) |
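The card does not state how the class weights were derived. The following is a minimal sketch, assuming per-label positive weights computed as the ratio of negative to positive examples, which is the convention of the `pos_weight` argument of PyTorch's `BCEWithLogitsLoss`. It is written in plain Python so the arithmetic is visible; the label matrix and logits are toy values.

```python
import math

def pos_weights(label_matrix):
    """Per-label weight = (# negatives) / (# positives); an assumed scheme."""
    n = len(label_matrix)
    weights = []
    for col in zip(*label_matrix):
        positives = sum(col)
        # Fall back to 1.0 for labels with no positive examples.
        weights.append((n - positives) / positives if positives else 1.0)
    return weights

def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    return -math.log1p(math.exp(-x)) if x >= 0 else x - math.log1p(math.exp(x))

def weighted_bce_with_logits(logits, targets, weights):
    """Mean over labels of -[w * y * log s(x) + (1 - y) * log(1 - s(x))]."""
    total = 0.0
    for x, y, w in zip(logits, targets, weights):
        # log(1 - sigmoid(x)) equals log(sigmoid(-x)).
        total += -(w * y * log_sigmoid(x) + (1 - y) * log_sigmoid(-x))
    return total / len(logits)

# Toy data: 4 examples, 2 labels; the second label is rare.
train_labels = [[1, 0], [1, 0], [0, 1], [1, 0]]
weights = pos_weights(train_labels)

# One example with uninformative logits, true labels [0, 1].
loss = weighted_bce_with_logits([0.0, 0.0], [0, 1], weights)
print(weights, round(loss, 4))
```

Under this scheme a missed positive for the rare label costs three times as much as a false positive, which pushes the model toward higher recall on underrepresented techniques.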
Usage
Installation
First, install the necessary libraries:
```shell
pip install transformers torch sentencepiece
```
Quick Start
Here is how to use the model to classify a single piece of text:
```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Define model and label names
model_name = "olehmell/ukr-rus-manipulation-detector-xlm-roberta"  # Hypothetical model name
labels = [
    'emotional_manipulation',
    'fear_appeals',
    'bandwagon_effect',
    'selective_truth',
    'cliche'
]

# Load pretrained model and tokenizer
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model.eval()  # make inference mode explicit (disables dropout)

# Prepare text (can be Ukrainian or Russian)
# English: "All the experts confirmed this long ago; only you fail to
# understand what is really going on."
text = "Все эксперты уже давно это подтвердили, только вы не понимаете, что происходит на самом деле."

# Tokenize and predict
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    outputs = model(**inputs)
    # Sigmoid rather than softmax: each label is scored independently
    predictions = torch.sigmoid(outputs.logits)

# Get detected techniques
threshold = 0.5
detected_techniques = {}
for i, score in enumerate(predictions[0]):
    if score > threshold:
        detected_techniques[labels[i]] = f"{score.item():.2f}"

if detected_techniques:
    print("Detected techniques:")
    for technique, score in detected_techniques.items():
        print(f"- {technique} (Score: {score})")
else:
    print("No manipulation techniques detected.")
```
Performance
The model achieves the following performance on the evaluation set:
| Metric | Value |
|---|---|
| F1 Macro | 0.44 |
| F1 Micro | TBD |
| Hamming Loss | TBD |
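For reference, the three metrics in the table can be computed as sketched below. Macro F1 averages the per-label F1 scores with equal weight, micro F1 pools true and false positives across all labels, and Hamming loss is the fraction of individual label slots predicted incorrectly. The toy labels and predictions are invented for illustration and are unrelated to the reported 0.44.

```python
def f1(tp, fp, fn):
    """F1 from counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def multilabel_metrics(y_true, y_pred):
    """Return (macro F1, micro F1, Hamming loss) for binary label matrices."""
    n_labels = len(y_true[0])
    counts = [[0, 0, 0] for _ in range(n_labels)]  # tp, fp, fn per label
    wrong = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for j, (t, p) in enumerate(zip(true_row, pred_row)):
            if t and p:
                counts[j][0] += 1
            elif p:
                counts[j][1] += 1
            elif t:
                counts[j][2] += 1
            wrong += t != p
    macro = sum(f1(*c) for c in counts) / n_labels
    micro = f1(*[sum(c[k] for c in counts) for k in range(3)])
    hamming = wrong / (len(y_true) * n_labels)
    return macro, micro, hamming

# Toy data: 4 texts, 3 labels.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 1, 1]]

macro, micro, hamming = multilabel_metrics(y_true, y_pred)
print(f"macro={macro:.3f} micro={micro:.3f} hamming={hamming:.3f}")
```

Because macro F1 weights rare and frequent labels equally, it tends to be lower than micro F1 when underrepresented classes are hard to detect. scikit-learn's `f1_score` and `hamming_loss` compute the same quantities for real evaluations.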
Limitations
- Language Specificity: The model is optimized for Ukrainian and Russian. Performance on other languages is not guaranteed.
- Domain Sensitivity: The model was trained primarily on political and social media discourse, so its performance may vary on other text domains (e.g., scientific, literary).
- Context Length: The model is limited to texts up to 512 tokens. Longer documents must be chunked or truncated.
- Class Imbalance: Some manipulation techniques are underrepresented in the training data, which may affect their detection accuracy.
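One way to handle the context-length limitation is to split a long token sequence into overlapping windows, score each window, and aggregate the scores per label. The sketch below shows only the windowing and a max-aggregation step in plain Python. The window size, the overlap, and the choice of taking the maximum over chunks are assumptions for illustration, not part of the released model.

```python
def sliding_windows(token_ids, max_len=512, stride=128):
    """Split token_ids into windows of at most max_len, overlapping by stride."""
    if max_len <= stride:
        raise ValueError("max_len must be larger than stride")
    step = max_len - stride
    windows = []
    start = 0
    while True:
        windows.append(token_ids[start:start + max_len])
        if start + max_len >= len(token_ids):
            break  # this window already reaches the end of the text
        start += step
    return windows

def aggregate_max(chunk_scores):
    """Per-label maximum across chunks: a label fires if any chunk triggers it."""
    return [max(col) for col in zip(*chunk_scores)]

# Toy example with small numbers so the windows are easy to inspect.
tokens = list(range(10))
windows = sliding_windows(tokens, max_len=4, stride=1)
print(windows)

# Hypothetical per-chunk probabilities for two labels.
document_scores = aggregate_max([[0.1, 0.7], [0.6, 0.2], [0.3, 0.4]])
print(document_scores)
```

In practice each window must also leave room for the special tokens the tokenizer adds. Hugging Face fast tokenizers can produce overlapping windows directly through `return_overflowing_tokens=True` together with `stride`.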
Ethical Considerations
- Purpose: This model is intended as a tool to support media literacy and critical thinking, not as an arbiter of truth.
- Human Oversight: Model outputs should be interpreted with human judgment and a full understanding of the context. The model should not be used to automatically censor content.
- Potential Biases: The model may reflect biases present in the training data.
Citation
If you use this model in your research, please cite the following:
```bibtex
@misc{ukrainian-russian-manipulation-xlm-roberta-2025,
  author = {Oleh Mell},
  title = {Ukrainian/Russian Manipulation Detector - XLM-RoBERTa},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/olehmell/ukr-rus-manipulation-detector-xlm-roberta}
}

@inproceedings{unlp2025shared,
  title = {UNLP 2025 Shared Task on Techniques Classification},
  author = {UNLP Workshop Organizers},
  booktitle = {UNLP 2025 Workshop},
  year = {2025},
  url = {https://github.com/unlp-workshop/unlp-2025-shared-task}
}
```
License
This model is licensed under the Apache 2.0 License.
Acknowledgments
- The organizers of the UNLP 2025 Workshop for providing the dataset.
Model tree for olehmell/xlm-roberta-posts-manipulation-classifier
Base model
FacebookAI/xlm-roberta-large