EuroBERT Geopolitical Classifier (Multiclass)

Fine-tuned EuroBERT/EuroBERT-210m for detecting and categorizing geopolitical themes in (European) news text.

Task: Sequence classification (single-label multiclass)
Labels: 11 geopolitical topics
Intended use: Topic categorization of news on geopolitical tensions (best performance on full article-level text)
Languages: English, German, French, Spanish, Italian
Framework: 🤗 Transformers (PyTorch)

Quick start

Inference with `transformers`

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_id = "durrani95/eurobert-geopolitical-multiclass"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)


texts = [
    "Russia cut off gas supplies to Europe amid rising tensions.",
    "Terrorist activity has increased along the southern border.",
    "New sanctions were imposed on financial institutions.",
    "Talks at the UN Security Council failed to reach consensus.",
    "Tariffs on soybeans are applied to pressure China into a deal with the US.",
    "Tom and Jerry have a fight! The mouse finally had enough.",
]

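# Tokenize the batch; inputs longer than 512 tokens are truncated.
inputs = tokenizer(texts, padding=True, truncation=True, max_length=512, return_tensors="pt")

# Run inference without tracking gradients, then convert logits to probabilities.
with torch.no_grad():
    logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=1)

# Print the most probable label and its probability for each text.
for text, p in zip(texts, probs):
    label_id = int(p.argmax())
    label = model.config.id2label[label_id]
    confidence = float(p[label_id])
    print(f"{label:>28}  {confidence:6.2%}  | {text}")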
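
Since the model returns a full probability distribution over the labels, it can be informative to look at more than the single argmax label. The following is a minimal sketch, assuming one row of `probs` has been converted to a plain list (for example with `p.tolist()`). The helper name `top_k_labels` and the demo values are hypothetical and not part of the model repository.

```python
from typing import Dict, List, Tuple


def top_k_labels(
    probs: List[float], id2label: Dict[int, str], k: int = 3
) -> List[Tuple[str, float]]:
    """Return the k most probable (label, probability) pairs, highest first."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return [(id2label[i], p) for i, p in ranked[:k]]


# Illustrative values only: a made-up 3-class distribution, not model output.
demo_id2label = {0: "trade_disputes", 1: "financial_sanctions", 2: "non_geopol"}
demo_probs = [0.55, 0.35, 0.10]

for label, p in top_k_labels(demo_probs, demo_id2label, k=2):
    print(f"{label:>28}  {p:6.2%}")
```

With the quick-start variables, a call such as `top_k_labels(p.tolist(), model.config.id2label)` inside the loop should give the ranked labels for one text.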

Category Definitions

Category	Description	Example
war_military_conflict	Armed conflicts, military operations, or war-related issues involving states or armed groups.	Russia’s invasion of Ukraine
terrorism_insurgency	Terrorist attacks, counter-terrorism operations, or insurgent activity.	9/11 attacks
cyber_warfare	Cyberattacks or hacking by foreign states or international actors with strategic motives.	North Korea’s Sony hack
trade_disputes	Tensions between states over trade policy, tariffs, or retaliation.	U.S.–China trade wars
financial_sanctions	Economic penalties imposed by countries against targeted states, entities, or individuals.	U.S. sanctions on Iran’s banking sector
regional_disintegration	Political developments that threaten the cohesion of existing regional entities.	Brexit
energy_resource_conflicts	Disputes over energy access, distribution, or natural resource control.	OPEC oil disputes
global_governance	Tensions involving international institutions or multilateral diplomacy.	NATO expansion
nuclear_proliferation	Issues concerning the spread or control of nuclear weapons.	Iran nuclear deal
territorial_disputes	Conflicts over land or maritime boundaries.	South China Sea tensions
non_geopol	Texts without geopolitical relevance.	Domestic politics or economic updates
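
The label names in the table can be used to sanity-check a loaded configuration and to separate geopolitical predictions from the `non_geopol` catch-all. This is a small sketch under stated assumptions: the index order of the labels is not given in this card, so it should be read from `model.config.id2label` rather than inferred from the table order. The names `EXPECTED_LABELS`, `label_mismatch`, and `is_geopolitical` are hypothetical helpers.

```python
from typing import Dict, Set

# The 11 label names listed in the category table.
EXPECTED_LABELS: Set[str] = {
    "war_military_conflict",
    "terrorism_insurgency",
    "cyber_warfare",
    "trade_disputes",
    "financial_sanctions",
    "regional_disintegration",
    "energy_resource_conflicts",
    "global_governance",
    "nuclear_proliferation",
    "territorial_disputes",
    "non_geopol",
}


def label_mismatch(id2label: Dict[int, str]) -> Set[str]:
    """Return label names present in only one of the mapping and the table."""
    return set(id2label.values()) ^ EXPECTED_LABELS


def is_geopolitical(label: str) -> bool:
    """True for every known category except the non_geopol catch-all."""
    return label in EXPECTED_LABELS and label != "non_geopol"
```

An empty result from `label_mismatch(model.config.id2label)` would indicate that the loaded model exposes exactly the labels documented here.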

Training & Configuration

Base model: EuroBERT/EuroBERT-210m
Objective: Cross-entropy (single-label multiclass)
Number of labels: 11
Data: European news text labeled across geopolitical topics
Hardware: A100 GPU
Epochs: 1
Optimizer: AdamW with linear scheduler

Training setup

Parameter	Value
Learning rate	3e-5
Desired (effective) batch size	64
Actual GPU batch size	16
Gradient accumulation	4 steps
Weight decay	1e-5
Betas	(0.9, 0.95)
Epsilon	1e-8
Max epochs	1
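
The effective batch size in the table follows from the per-GPU batch size multiplied by the gradient accumulation steps: 16 × 4 = 64. The sketch below reproduces that arithmetic and illustrates a linear learning-rate decay. It is not the training script: a single GPU and zero warmup steps are assumptions (the card does not state a warmup setting), and `TrainConfig` and `linear_lr` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    learning_rate: float = 3e-5
    per_device_batch_size: int = 16
    gradient_accumulation_steps: int = 4
    num_devices: int = 1  # assumption: a single A100

    @property
    def effective_batch_size(self) -> int:
        """Samples contributing to each optimizer step."""
        return (
            self.per_device_batch_size
            * self.gradient_accumulation_steps
            * self.num_devices
        )


def linear_lr(step: int, total_steps: int, base_lr: float, warmup_steps: int = 0) -> float:
    """Optional linear warmup, then linear decay to zero at total_steps."""
    if warmup_steps > 0 and step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


cfg = TrainConfig()
print(cfg.effective_batch_size)  # 16 * 4 * 1
```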

Limitations & Risks

May be sensitive to domain shift (for example, non-news or social media text).
Predicts one dominant label per text; it is not multi-label.
Multilingual performance can vary across languages and registers.
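
Because only one label is predicted per text, an article that spans two themes may yield two close top probabilities. One possible way to surface such cases is to flag predictions whose top-two margin is small. This is a sketch, not part of the released model: the helper name `is_ambiguous` is hypothetical and the default threshold of 0.2 is an arbitrary, untuned assumption.

```python
from typing import List


def is_ambiguous(probs: List[float], min_margin: float = 0.2) -> bool:
    """True when the gap between the two highest probabilities is below min_margin."""
    top_two = sorted(probs, reverse=True)[:2]
    if len(top_two) < 2:
        return False  # a single class cannot be ambiguous
    return (top_two[0] - top_two[1]) < min_margin
```

Flagged texts could then be routed to manual review or inspected with the full probability vector instead of the argmax alone.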

How to cite

If you use this model, please cite this repository and the EuroBERT base model.

Model size: 0.2B params
Tensor type: F32
Weights format: Safetensors
