Image-Guard-2.0: A SigLIP 2 Based Image Safety Classification Model
[1.] Introduction
Image-Guard-2.0 is an experimental, lightweight vision-language encoder model with roughly 0.1B (under 100M) parameters, built on SigLIP 2 (siglip2-base-patch16-224). Designed for multi-label image classification, it functions as an image safety system, acting as an image guard or moderator across a wide range of categories, from anime to realistic imagery. It also performs strict moderation and filtering of artificially synthesized content, with particularly strong detection and handling of explicit images. Image-Guard-2.0 exhibits robust performance in streamlined scenarios, ensuring reliable and effective classification across diverse visual inputs.
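For a quick first pass, the model can be loaded through the standard Transformers image-classification pipeline. The snippet below is a minimal sketch: the repository id is assumed from the acknowledgements table (StrangerGuardHF/Image-Guard-2.0-Post0.1) and your_image.jpg is a placeholder path, so substitute your actual model path and image.
from transformers import pipeline

# Repo id assumed from the acknowledgements below; replace with the actual model path if it differs.
guard = pipeline("image-classification", model="StrangerGuardHF/Image-Guard-2.0-Post0.1")

# "your_image.jpg" is a placeholder; the pipeline returns a list of {label, score} dicts.
print(guard("your_image.jpg", top_k=5))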
[2.] Image-Guard 2.0 and Its Dataset Composition Sources
Image-Guard 2.0 is trained on a diverse collection of datasets. For Anime-SFW (Safe-for-Work) images, the dataset includes curated safe entries from anime series, manga, and various artworks available on the web. For Normal-SFW images, the collection encompasses safe realism, portraits, black-and-white images, realistic sketches, AI-generated realistic images, doodles, paintings, and more. Explicit content, covering the Hentai, Enticing or Sensual, and Pornography categories, is carefully sourced from publicly available datasets on the Hugging Face Hub, Kaggle, and other open-access repositories. Together, these datasets provide a comprehensive foundation for training Image-Guard 2.0 to classify and moderate diverse image types effectively.
[3.] What Sets Image-Guard 2.0 Apart
The most downloaded open-source model, FalconsAI/nsfw_image_detection, provides a solid foundation for NSFW image classification, but it offers less nuanced handling of publicly sensual images, portraits, and the wider range of image categories. FalconsAI’s model, built on the Vision Transformer (ViT) architecture, is fine-tuned on a proprietary dataset of approximately 80,000 images and classifies them simply as “normal” or “nsfw”, making it effective for general content moderation. In contrast, Image-Guard-2.0 distinguishes itself through its experimental vision-language encoder design based on SigLIP 2 and its multi-label classification capabilities. Trained on a diverse and extensive dataset spanning anime, realistic, synthetic, and AI-generated images, it allows for nuanced moderation across multiple content types. Its architecture and training methodology are optimized for lightweight deployment, offering efficient performance without compromising accuracy and making Image-Guard 2.0 particularly well suited to real-time content moderation across complex and diverse image scenarios.
[4.] Sample Inferences for Image-Guard-2.0
The following set of images showcases examples of demo inferences performed by Image-Guard-2.0, highlighting the model’s ability to classify and moderate content across various categories.
[5.] Fine-tune SigLIP2 (Domain-Specific Downstream Tasks)
The script below fine-tunes SigLIP 2 foundation models on single- or multi-label image classification tasks.
Install the packages
%%capture
!pip install evaluate datasets accelerate
!pip install transformers==4.50.0 torchvision
!pip install huggingface-hub hf_xet
#Hold tight, this will take around 2-3 minutes.
Code (run_finetune_siglip2.py)
# Fine-Tuning SigLIP2 for Image Classification | Script prepared by: hf.co/prithivMLmods
#
# Dataset with Train & Test Splits
#
# In this configuration, the dataset is already organized into separate training and testing splits. This setup is ideal for straightforward supervised learning workflows.
#
# Training Phase:
# The model is fine-tuned exclusively on the train split, where each image is paired with its corresponding class label.
#
# Evaluation Phase:
# After training, the model's performance is assessed on the test split to measure generalization accuracy.
# 1. Install the packages
# %%capture
# Install required libraries for fine-tuning SigLIP 2
# 'evaluate' - for computing evaluation metrics like accuracy
# 'datasets' - for handling train/test splits and loading image datasets
# 'accelerate' - for efficient multi-GPU / multi-CPU training
# 'transformers==4.50.0' - for SigLIP 2 and other transformer-based models
# 'torchvision' - for image preprocessing and augmentations
# 'huggingface-hub' - for model and dataset uploads/downloads
# 'hf_xet' - optional helper for versioned dataset/model storage on Hugging Face Hub
# Hold tight, this will take around 2-3 minutes.
# If you are performing the training process outside of Google Colaboratory, install: imbalanced-learn
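# !pip install imbalanced-learn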
# --------------------------------------------------------------------------
# To demonstrate the fine-tuning process, we will use the MNIST dataset — a classic benchmark for image classification.
# MNIST consists of 28x28 grayscale images of handwritten digits (0–9), making it ideal for testing model training pipelines.
# We will load the dataset directly from the Hugging Face Hub using the 'datasets' library.
# Dataset link: https://huggingface.co/datasets/ylecun/mnist
# --------------------------------------------------------------------------
# 2. Import modules required for data manipulation, model training, and image preprocessing.
import warnings
warnings.filterwarnings("ignore")
import gc
import numpy as np
import pandas as pd
import itertools
from collections import Counter
import matplotlib.pyplot as plt
from sklearn.metrics import accuracy_score, roc_auc_score, confusion_matrix, classification_report, f1_score
from imblearn.over_sampling import RandomOverSampler
import evaluate
from datasets import Dataset, Image, ClassLabel
from transformers import (
    TrainingArguments,
    Trainer,
    DefaultDataCollator
)
from transformers import AutoImageProcessor
from transformers import SiglipForImageClassification
from transformers.image_utils import load_image
import torch
from torch.utils.data import DataLoader
from torchvision.transforms import (
    CenterCrop,
    Compose,
    Normalize,
    RandomRotation,
    RandomResizedCrop,
    RandomHorizontalFlip,
    RandomAdjustSharpness,
    Resize,
    ToTensor
)
from PIL import ExifTags, ImageFile
from PIL import Image as PILImage  # aliased to avoid shadowing the datasets.Image feature imported above
# Enable loading truncated images
ImageFile.LOAD_TRUNCATED_IMAGES = True
# 3. Loading and Preparing the Dataset
from datasets import load_dataset
dataset = load_dataset("ylecun/mnist", split="train")
from pathlib import Path
file_names = []
labels = []
for example in dataset:
    file_path = str(example['image'])
    label = example['label']
    file_names.append(file_path)
    labels.append(label)
print(len(file_names), len(labels))
# 4. Creating a DataFrame and Balancing the Dataset & Working with a Subset of Labels
df = pd.DataFrame.from_dict({"image": file_names, "label": labels})
print(df.shape)
df.head()
df['label'].unique()
y = df[['label']]
df = df.drop(['label'], axis=1)
ros = RandomOverSampler(random_state=83)
df, y_resampled = ros.fit_resample(df, y)
del y
df['label'] = y_resampled
del y_resampled
gc.collect()
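# Note: in this walkthrough the oversampled DataFrame is not fed back into training; the Hugging Face `dataset` object loaded earlier is what is mapped, split, and trained on below.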
labels_subset = labels[:5]
print(labels_subset)
# labels_list = ['example_label_0', 'example_label_1', ..., 'example_label_n-1']
labels_list = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
label2id, id2label = {}, {}
for i, label in enumerate(labels_list):
    label2id[label] = i
    id2label[i] = label
ClassLabels = ClassLabel(num_classes=len(labels_list), names=labels_list)
print("Mapping of IDs to Labels:", id2label, '\n')
print("Mapping of Labels to IDs:", label2id)
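# For this MNIST example, id2label == {0: '0', 1: '1', ..., 9: '9'} and label2id is the inverse mapping.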
# 5. Mapping and Casting Labels
def map_label2id(example):
    example['label'] = ClassLabels.str2int(example['label'])
    return example
# 6. Splitting the Dataset
dataset = dataset.map(map_label2id, batched=True)
dataset = dataset.cast_column('label', ClassLabels)
dataset = dataset.train_test_split(test_size=0.4, shuffle=True, stratify_by_column="label")
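# Stratified 60/40 split: label proportions are preserved in both the train and test subsets.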
train_data = dataset['train']
test_data = dataset['test']
# 7. Setting Up the Model and Processor
model_str = "google/siglip2-base-patch16-224"
processor = AutoImageProcessor.from_pretrained(model_str)
# Extract preprocessing parameters
image_mean, image_std = processor.image_mean, processor.image_std
size = processor.size["height"]
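# For siglip2-base-patch16-224 this resolves to 224, so images are resized to 224x224 below.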
# 8. Defining Data Transformations
# Define training transformations
_train_transforms = Compose([
    Resize((size, size)),
    RandomRotation(90),
    RandomAdjustSharpness(2),
    ToTensor(),
    Normalize(mean=image_mean, std=image_std)
])
# Define validation transformations
_val_transforms = Compose([
    Resize((size, size)),
    ToTensor(),
    Normalize(mean=image_mean, std=image_std)
])
# 9. Applying Transformations to the Dataset
# Apply transformations to dataset
def train_transforms(examples):
    examples['pixel_values'] = [_train_transforms(image.convert("RGB")) for image in examples['image']]
    return examples

def val_transforms(examples):
    examples['pixel_values'] = [_val_transforms(image.convert("RGB")) for image in examples['image']]
    return examples
# Assuming train_data and test_data are loaded datasets
train_data.set_transform(train_transforms)
test_data.set_transform(val_transforms)
# 10. Creating a Data Collator
def collate_fn(examples):
    pixel_values = torch.stack([example["pixel_values"] for example in examples])
    labels = torch.tensor([example['label'] for example in examples])
    return {"pixel_values": pixel_values, "labels": labels}
# 11. Initializing the Model
model = SiglipForImageClassification.from_pretrained(model_str, num_labels=len(labels_list))
model.config.id2label = id2label
model.config.label2id = label2id
print(model.num_parameters(only_trainable=True) / 1e6)
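# Prints the trainable parameter count in millions (SigLIP 2 vision tower plus the newly initialized classification head).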
# 12. Defining Metrics and the Compute Function
accuracy = evaluate.load("accuracy")
def compute_metrics(eval_pred):
    predictions = eval_pred.predictions
    label_ids = eval_pred.label_ids
    predicted_labels = predictions.argmax(axis=1)
    acc_score = accuracy.compute(predictions=predicted_labels, references=label_ids)['accuracy']
    return {
        "accuracy": acc_score
    }
# 13. Setting Up Training Arguments
args = TrainingArguments(
    output_dir="siglip2-image-classification/",
    logging_dir='./logs',
    eval_strategy="epoch",  # formerly `evaluation_strategy`, renamed in recent transformers releases
    learning_rate=2e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=8,
    num_train_epochs=2,
    weight_decay=0.02,
    warmup_steps=50,
    remove_unused_columns=False,
    save_strategy='epoch',
    load_best_model_at_end=True,
    save_total_limit=4,
    report_to="none"
)
# 14. Initializing the Trainer
trainer = Trainer(
    model,
    args,
    train_dataset=train_data,
    eval_dataset=test_data,
    data_collator=collate_fn,
    compute_metrics=compute_metrics,
    tokenizer=processor,
)
# 15. Evaluating, Training, and Predicting
trainer.evaluate()
trainer.train()
trainer.evaluate()
outputs = trainer.predict(test_data)
print(outputs.metrics)
# 16. Computing Additional Metrics and Plotting the Confusion Matrix
y_true = outputs.label_ids
y_pred = outputs.predictions.argmax(1)
def plot_confusion_matrix(cm, classes, title='Confusion Matrix', cmap=plt.cm.Reds, figsize=(10, 8)):
    plt.figure(figsize=figsize)
    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    plt.colorbar()
    tick_marks = np.arange(len(classes))
    plt.xticks(tick_marks, classes, rotation=90)
    plt.yticks(tick_marks, classes)
    fmt = '.0f'
    thresh = cm.max() / 2.0
    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
        plt.text(j, i, format(cm[i, j], fmt), horizontalalignment="center", color="white" if cm[i, j] > thresh else "black")
    plt.ylabel('True label')
    plt.xlabel('Predicted label')
    plt.tight_layout()
    plt.show()
accuracy = accuracy_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred, average='macro')
print(f"Accuracy: {accuracy:.4f}")
print(f"F1 Score: {f1:.4f}")
if len(labels_list) <= 150:
    cm = confusion_matrix(y_true, y_pred)
    plot_confusion_matrix(cm, labels_list, figsize=(8, 6))
print()
print("Classification report:")
print()
print(classification_report(y_true, y_pred, target_names=labels_list, digits=4))
# 17. Saving the Model and Uploading to Hugging Face Hub
trainer.save_model() # Trained Model Uploaded to Hugging Face: https://huggingface.co/prithivMLmods/Mnist-Digits-SigLIP2
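# (Optional) Sanity check: reload the fine-tuned checkpoint from the output directory
# configured above ("siglip2-image-classification/"). Because the processor was passed to
# the Trainer, it is saved alongside the model weights.
reloaded_model = SiglipForImageClassification.from_pretrained("siglip2-image-classification/")
reloaded_processor = AutoImageProcessor.from_pretrained("siglip2-image-classification/")
print(reloaded_model.config.id2label)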
# 18. Login to Hugging Face Hub
#
# Use the Hugging Face Hub API to authenticate your session in the notebook.
# This allows you to push models, datasets, and other assets directly to your account.
#from huggingface_hub import notebook_login, HfApi
#notebook_login()
# 19. Upload the Model to Hugging Face Hub
#
# Once fine-tuning is complete, you can upload the trained SigLIP 2 model to the Hugging Face Hub.
# This enables sharing, versioning, and easy access for future inference or collaboration.
#api = HfApi()
#repo_id = f"prithivMLmods/Mnist-Digits-SigLIP2"
#api.upload_folder(
# folder_path="siglip2-image-classification/", # Local folder containing the fine-tuned model
# path_in_repo=".", # Path inside the Hugging Face repository
# repo_id=repo_id, # Repository ID (username/repo_name)
# repo_type="model", # Specify this is a model repository
# revision="main" # Branch or revision to push to
#)
#print(f"Model uploaded to https://huggingface.co/{repo_id}")
GitHub Gist (Code): https://gist.github.com/PRITHIVSAKTHIUR/e3c67b9fbcaf397b6639b018d457fd08
[6.] Quick Start with Transformers
Install the packages
%%capture
!pip install gradio torch
!pip install transformers
!pip install huggingface_hub
Inference Code
import gradio as gr
from transformers import AutoImageProcessor, SiglipForImageClassification
from PIL import Image
import torch
# Load model and processor
model_name = "your-model-name-here" # Replace with your model path if different
model = SiglipForImageClassification.from_pretrained(model_name)
processor = AutoImageProcessor.from_pretrained(model_name)
# Label mapping
id2label = {  # Replace with your own label configuration
    "0": "map-label-1",
    "1": "map-label-2",
    "2": "map-label-3",
    "3": "map-label-4",
    "4": "map-label-5"
}
def classify_confidence(image):
    image = Image.fromarray(image).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    logits = outputs.logits
    probs = torch.nn.functional.softmax(logits, dim=1).squeeze().tolist()
    prediction = {
        id2label[str(i)]: round(probs[i], 3) for i in range(len(probs))
    }
    return prediction
# Gradio Interface
iface = gr.Interface(
    fn=classify_confidence,
    inputs=gr.Image(type="numpy"),
    outputs=gr.Label(num_top_classes=2, label="Predicted Content Type"),
    title="Your-App-Name-Here"
)
if __name__ == "__main__":
    iface.launch()
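Because classify_confidence returns a full label-to-probability dictionary, a small post-processing step can turn those scores into an allow/flag moderation decision. The helper below is only a sketch: the unsafe label names and the 0.5 threshold are illustrative placeholders rather than the model's actual configuration, so adapt both to your own label mapping and moderation policy.
# Hypothetical moderation policy: label names and threshold are placeholders.
UNSAFE_LABELS = {"map-label-3", "map-label-4"}  # replace with the categories you treat as unsafe
THRESHOLD = 0.5

def moderate(prediction):
    # prediction is the {label: probability} dict returned by classify_confidence()
    flagged = {label: score for label, score in prediction.items()
               if label in UNSAFE_LABELS and score >= THRESHOLD}
    return {"allowed": not flagged, "flagged_labels": flagged}

# Example usage:
# scores = classify_confidence(image_array)  # image_array: a NumPy image, as Gradio passes it
# print(moderate(scores))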
[7.] Acknowledgements
| Resource / Reference | Description | Link |
|---|---|---|
| SigLIP 2 | Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features | arXiv:2502.14786 |
| Hugging Face Transformers | Open-source library for state-of-the-art NLP and vision models | GitHub |
| PyTorch | Deep learning framework used for training and model development | pytorch.org |
| Organization | StrangerGuardHF on Hugging Face | Visit |
| Model Page | Image-Guard-2.0-Post0.1 | Visit |
| Page | PrithivMLmods profile on Hugging Face | Visit |
[8.] Conclusion
Image-Guard-2.0 represents a significant step toward developing efficient, open-source image safety and content moderation systems. Built upon the SigLIP 2 vision-language architecture, it demonstrates strong multi-label classification capabilities across diverse visual domains — ranging from anime and realism to AI-generated and explicit content. Its lightweight design enables effective deployment in real-time scenarios, offering a balanced trade-off between performance, interpretability, and computational efficiency.
While the model performs robustly in most classification and moderation scenarios, it is important to note that Image-Guard-2.0 is an experimental first step toward content moderation models and may therefore exhibit sub-optimal performance in certain complex or ambiguous cases. Nevertheless, its adaptable framework, modular fine-tuning approach, and integration-ready design make it a valuable foundation for future advancements in AI-driven visual safety and ethical content filtering systems.