EuroBERT Geopolitical Classifier (Multiclass)
Fine-tuned EuroBERT/EuroBERT-210m for detecting and categorizing geopolitical themes in (European) news text.
- Task: Sequence classification (single-label multiclass)
- Labels: 11 geopolitical topics
- Intended use: Topic categorization of news on geopolitical tensions (best performance on full article-level text)
- Languages: English, German, French, Spanish, Italian
- Framework: 🤗 Transformers (PyTorch)
Quick start
Inference with transformers
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_id = "durrani95/eurobert-geopolitical-multiclass"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

texts = [
    "Russia cut off gas supplies to Europe amid rising tensions.",
    "Terrorist activity has increased along the southern border.",
    "New sanctions were imposed on financial institutions.",
    "Talks at the UN Security Council failed to reach consensus.",
    "Tariffs on soybeans are applied to pressure China into a deal with the US.",
    "Tom and Jerry have a fight! The mouse finally had enough.",
]

# Tokenize the batch; inputs longer than 512 tokens are truncated.
inputs = tokenizer(texts, padding=True, truncation=True, max_length=512, return_tensors="pt")

# Run inference without tracking gradients.
with torch.no_grad():
    logits = model(**inputs).logits

# Convert logits to per-class probabilities.
probs = torch.softmax(logits, dim=1)

# Print the top label and its probability for each text.
for text, p in zip(texts, probs):
    label_id = int(p.argmax())
    label = model.config.id2label[label_id]
    confidence = float(p[label_id])
    print(f"{label:>28} {confidence:6.2%} | {text}")
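The loop above reports only the single most probable label. When a text sits between two topics, it can help to look at the runner-up labels and to flag low-confidence predictions. The following is a minimal sketch, not part of the released model: `top_k_labels` and the `min_confidence` cutoff of 0.5 are hypothetical, and `example_id2label` is a made-up three-label mapping so the snippet runs without downloading the model. In practice you would pass `p.tolist()` and `model.config.id2label`.

```python
from typing import Dict, List, Tuple


def top_k_labels(
    probs: List[float],
    id2label: Dict[int, str],
    k: int = 3,
    min_confidence: float = 0.5,
) -> Tuple[List[Tuple[str, float]], bool]:
    """Return the k most probable (label, probability) pairs and a flag
    that is True when the best probability is below min_confidence."""
    # Sort class indices by probability, highest first.
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    top = [(id2label[i], p) for i, p in ranked[:k]]
    uncertain = top[0][1] < min_confidence
    return top, uncertain


# Illustrative values only; the real mapping comes from model.config.id2label.
example_id2label = {0: "trade_disputes", 1: "financial_sanctions", 2: "non_geopol"}
example_probs = [0.46, 0.41, 0.13]

top, uncertain = top_k_labels(example_probs, example_id2label, k=2)
for label, p in top:
    print(f"{label:>28} {p:6.2%}")
print("low confidence:", uncertain)
```

The cutoff is a judgment call; a reasonable value depends on your data and should be chosen on a labeled validation sample.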
Category Definitions
| Category | Description | Example |
|---|---|---|
| war_military_conflict | Armed conflicts, military operations, or war-related issues involving states or armed groups. | Russia’s invasion of Ukraine |
| terrorism_insurgency | Terrorist attacks, counter-terrorism operations, or insurgent activity. | 9/11 attacks |
| cyber_warfare | Cyberattacks or hacking by foreign states or international actors with strategic motives. | North Korea’s Sony hack |
| trade_disputes | Tensions between states over trade policy, tariffs, or retaliation. | U.S.–China trade wars |
| financial_sanctions | Economic penalties imposed by countries against targeted states, entities, or individuals. | U.S. sanctions on Iran’s banking sector |
| regional_disintegration | Political developments that threaten the cohesion of existing regional entities. | Brexit |
| energy_resource_conflicts | Disputes over energy access, distribution, or natural resource control. | OPEC oil disputes |
| global_governance | Tensions involving international institutions or multilateral diplomacy. | NATO expansion |
| nuclear_proliferation | Issues concerning the spread or control of nuclear weapons. | Iran nuclear deal |
| territorial_disputes | Conflicts over land or maritime boundaries. | South China Sea tensions |
| non_geopol | Texts without geopolitical relevance. | Domestic politics or economic updates |
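Downstream code often branches on these category names, so it can be worth checking them against the loaded model's label mapping. The sketch below assumes that the strings in `model.config.id2label` match the names in the table exactly, which is not confirmed here. `check_label_mapping` and `is_geopolitical` are hypothetical helpers, and `demo_id2label` uses an arbitrary order for illustration.

```python
from typing import Dict, Set, Tuple

# Category names copied from the table above.
EXPECTED_LABELS: Set[str] = {
    "war_military_conflict",
    "terrorism_insurgency",
    "cyber_warfare",
    "trade_disputes",
    "financial_sanctions",
    "regional_disintegration",
    "energy_resource_conflicts",
    "global_governance",
    "nuclear_proliferation",
    "territorial_disputes",
    "non_geopol",
}


def check_label_mapping(id2label: Dict[int, str]) -> Tuple[Set[str], Set[str]]:
    """Return (missing, unexpected) label names relative to the table."""
    found = set(id2label.values())
    return EXPECTED_LABELS - found, found - EXPECTED_LABELS


def is_geopolitical(label: str) -> bool:
    """True for every category except the non-geopolitical one."""
    return label != "non_geopol"


# Stand-in for model.config.id2label; the index order here is arbitrary.
demo_id2label = dict(enumerate(sorted(EXPECTED_LABELS)))
missing, unexpected = check_label_mapping(demo_id2label)
print("missing:", missing, "unexpected:", unexpected)
```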
Training & Configuration
- Base model: EuroBERT/EuroBERT-210m
- Objective: Cross-entropy (single-label multiclass)
- Number of labels: 11
- Data: European news text labeled across geopolitical topics
- Hardware: A100 GPU
- Epochs: 1
- Optimizer: AdamW with linear scheduler
Training setup
| Parameter | Value |
|---|---|
| Learning rate | 3e-5 |
| Effective batch size | 64 |
| Per-step GPU batch size | 16 |
| Gradient accumulation | 4 steps |
| Weight decay | 1e-5 |
| Betas | (0.9, 0.95) |
| Epsilon | 1e-8 |
| Max epochs | 1 |
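The batch-size rows in the table are related by gradient accumulation: a GPU batch of 16 accumulated over 4 steps gives the effective batch size of 64. The sketch below spells out that arithmetic and a linear learning-rate decay. It assumes a single GPU and no warmup, neither of which is stated in the table, and `total_steps = 1000` is a placeholder because the dataset size is not given. The function names are hypothetical.

```python
def effective_batch_size(per_device_batch: int, accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Number of examples contributing to each optimizer update."""
    return per_device_batch * accumulation_steps * num_devices


def linear_schedule_lr(step: int, total_steps: int, base_lr: float = 3e-5,
                       warmup_steps: int = 0) -> float:
    """Learning rate at a given optimizer step under linear warmup and decay.

    warmup_steps=0 is an assumption; the table does not mention warmup.
    """
    if warmup_steps > 0 and step < warmup_steps:
        return base_lr * (step / warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


total_steps = 1000  # placeholder; depends on the (unstated) dataset size
print("effective batch size:", effective_batch_size(16, 4))
for step in (0, total_steps // 2, total_steps):
    print(step, linear_schedule_lr(step, total_steps))
```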
Limitations & Risks
- Performance may degrade under domain shift, for example on non-news or social media text.
- The model predicts one dominant label per text; it is not multi-label.
- Multilingual performance can vary across languages and registers.
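Because performance can differ by language, it may be worth measuring accuracy per language on a small labeled sample of your own before relying on the model. The following is a minimal sketch: `accuracy_by_language` is a hypothetical helper, and the records are placeholder values for illustration, not results from this model.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def accuracy_by_language(
    records: Iterable[Tuple[str, str, str]]
) -> Dict[str, float]:
    """Compute accuracy per language.

    Each record is (language_code, gold_label, predicted_label).
    """
    correct: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for lang, gold, pred in records:
        total[lang] += 1
        correct[lang] += int(gold == pred)
    return {lang: correct[lang] / total[lang] for lang in total}


# Placeholder records only; replace with your own gold labels and predictions.
sample_records = [
    ("en", "trade_disputes", "trade_disputes"),
    ("en", "non_geopol", "non_geopol"),
    ("de", "cyber_warfare", "cyber_warfare"),
    ("de", "global_governance", "financial_sanctions"),
]

per_language = accuracy_by_language(sample_records)
for lang, acc in sorted(per_language.items()):
    print(f"{lang}: {acc:.0%}")
```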
How to cite
If you use this model, please cite this repository and the EuroBERT base model.