Image-Guard-2.0: A SigLIP 2 Based Image Safety Classification Model

Community Article Published October 14, 2025


[1.] Introduction

Image-Guard-2.0 is an experimental, lightweight vision-language encoder model with a size of roughly 0.1B (just under 100M parameters), fine-tuned from SigLIP 2 (siglip2-base-patch16-224). Designed for multi-label image classification, it functions as an image safety system, acting as an image guard or moderator across a wide range of categories, from anime to realistic imagery. It also performs strict moderation and filtering of artificially synthesized content, with particularly strong detection and handling of explicit images. Image-Guard-2.0 performs robustly in streamlined moderation scenarios, providing reliable and effective classification across diverse visual inputs.
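For quick experimentation, the model can be loaded with the Transformers image-classification pipeline. The sketch below is illustrative only: the repository id is assumed from the model page linked in the Acknowledgements, and the image path is a placeholder, so adjust both to your setup.

from transformers import pipeline

# Minimal sketch: score a local image with Image-Guard-2.0.
# The repo id below is assumed from the model page in the Acknowledgements.
guard = pipeline(
    "image-classification",
    model="StrangerGuardHF/Image-Guard-2.0-Post0.1",
)

# Returns the top predicted categories with confidence scores.
predictions = guard("example.jpg")  # placeholder path to a local image
for p in predictions:
    print(f"{p['label']}: {p['score']:.3f}")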

[Figure: Model Evaluation Report visual overview]


[2.] Dataset Composition and Sources for Image-Guard 2.0

Image-Guard 2.0 is trained on a diverse collection of datasets. For Anime-SFW (Safe-for-Work) images, the dataset includes curated safe entries from anime series, manga, and various artworks available on the web. For Normal-SFW images, the collection encompasses safe realism, portraits, black-and-white images, realistic sketches, AI-generated realistic images, doodles, paintings, and more. Explicit content, covering the Hentai, Enticing or Sensual, and Pornography categories, is carefully sourced from publicly available datasets on the Hugging Face Hub, Kaggle, and other open-access repositories. Together, these datasets provide a comprehensive foundation for training Image-Guard 2.0 to classify and moderate diverse image types effectively.
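To make the category structure concrete, the snippet below sketches how such a taxonomy could be expressed as an id2label configuration. The label names are paraphrased from the categories described above and are purely illustrative; the label set shipped with Image-Guard-2.0 may be named or ordered differently.

# Illustrative label taxonomy derived from the dataset description above.
# These names and ids are assumptions, not the model's actual configuration.
id2label = {
    0: "Anime-SFW",
    1: "Normal-SFW",
    2: "Hentai",
    3: "Enticing or Sensual",
    4: "Pornography",
}
label2id = {name: idx for idx, name in id2label.items()}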

[3.] What Sets Image-Guard 2.0 Apart

The most downloaded open-source model in this space, FalconsAI/nsfw_image_detection, provides a solid foundation for NSFW image classification, but it handles suggestive or sensual imagery, portraits, and the broader range of image categories with less nuance. FalconsAI’s model, built on the Vision Transformer (ViT) architecture, is fine-tuned on a proprietary dataset of approximately 80,000 images and classifies them simply into “normal” and “nsfw” categories, which makes it effective for general content moderation. In contrast, Image-Guard-2.0 distinguishes itself through its experimental vision-language encoder design based on SigLIP 2 and its multi-label classification capabilities. Trained on a diverse and extensive dataset spanning anime, realism, synthetic, and AI-generated images, it enables nuanced moderation across multiple content types. Its architecture and training methodology are optimized for lightweight deployment, offering efficient performance without compromising accuracy and making Image-Guard 2.0 well suited to real-time content moderation across complex and diverse image scenarios.
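The practical difference between the two approaches can be shown with a small scoring sketch: a binary classifier collapses everything into a single normal-vs-nsfw decision via softmax, whereas a multi-label setup scores each category independently with a sigmoid and applies per-category thresholds. The logits below are made up for illustration only.

import torch

# Hypothetical logits for one image.
binary_logits = torch.tensor([1.2, -0.4])                       # [normal, nsfw]
multilabel_logits = torch.tensor([2.1, -1.3, -0.8, 0.9, -2.0])  # one score per category

# Binary model: softmax forces a single normal-vs-nsfw decision.
binary_probs = torch.softmax(binary_logits, dim=-1)

# Multi-label model: sigmoid scores each category independently,
# so several categories can exceed a moderation threshold at once.
multilabel_probs = torch.sigmoid(multilabel_logits)
flagged = (multilabel_probs > 0.5).nonzero(as_tuple=True)[0]

print(binary_probs)
print(multilabel_probs, flagged)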

[4.] Sample Inferences for Image-Guard-2.0

The following set of images showcases demo inferences performed by Image-Guard-2.0, highlighting the model’s ability to classify and moderate content across various categories.

Example Inference (Image-Guard-2.0)

Note: Explicit or unsafe images have been blurred/pixelated for safety purposes.

[Screenshots: sample inference results from the Image-Guard-2.0 demo]

[5.] Fine-tune SigLIP2 (Domain-Specific Downstream Tasks)

The script below fine-tunes SigLIP 2 foundation models on single- or multi-label image classification tasks.


Install the packages

%%capture
!pip install evaluate datasets accelerate
!pip install transformers==4.50.0 torchvision
!pip install huggingface-hub hf_xet
#Hold tight, this will take around 2-3 minutes.

Code (run_finetune_siglip2.py)

# Fine-Tuning SigLIP2 for Image Classification | Script prepared by: hf.co/prithivMLmods
#
# Dataset with Train & Test Splits
#
# In this configuration, the dataset is already organized into separate training and testing splits. This setup is ideal for straightforward supervised learning workflows.
#
# Training Phase:
# The model is fine-tuned exclusively on the train split, where each image is paired with its corresponding class label.
#
# Evaluation Phase:
# After training, the model's performance is assessed on the test split to measure generalization accuracy.

# 1. Install the packages

# %%capture
# Install required libraries for fine-tuning SigLIP 2
# 'evaluate' - for computing evaluation metrics like accuracy
# 'datasets' - for handling train/test splits and loading image datasets
# 'accelerate' - for efficient multi-GPU / multi-CPU training
# 'transformers==4.50.0' - for SigLIP 2 and other transformer-based models
# 'torchvision' - for image preprocessing and augmentations
# 'huggingface-hub' - for model and dataset uploads/downloads
# 'hf_xet' - optional helper for versioned dataset/model storage on Hugging Face Hub
#  Hold tight, this will take around 2-3 minutes.

# If you are performing the training process outside of Google Colaboratory, also install: imbalanced-learn (pip install imbalanced-learn)

# --------------------------------------------------------------------------

# To demonstrate the fine-tuning process, we will use the MNIST dataset — a classic benchmark for image classification.
# MNIST consists of 28x28 grayscale images of handwritten digits (0–9), making it ideal for testing model training pipelines.
# We will load the dataset directly from the Hugging Face Hub using the 'datasets' library.
# Dataset link: https://huggingface.co/datasets/ylecun/mnist

# --------------------------------------------------------------------------

# 2. Import modules required for data manipulation, model training, and image preprocessing.

import warnings
warnings.filterwarnings("ignore")

import gc
import numpy as np
import pandas as pd
import itertools
from collections import Counter
import matplotlib.pyplot as plt
from sklearn.metrics import accuracy_score, roc_auc_score, confusion_matrix, classification_report, f1_score
from imblearn.over_sampling import RandomOverSampler
import evaluate
from datasets import Dataset, Image, ClassLabel
from transformers import (
    TrainingArguments,
    Trainer,
    DefaultDataCollator
)

from transformers import AutoImageProcessor
from transformers import SiglipForImageClassification
from transformers.image_utils import load_image

import torch
from torch.utils.data import DataLoader
from torchvision.transforms import (
    CenterCrop,
    Compose,
    Normalize,
    RandomRotation,
    RandomResizedCrop,
    RandomHorizontalFlip,
    RandomAdjustSharpness,
    Resize,
    ToTensor
)

from PIL import Image, ExifTags
from PIL import Image as PILImage
from PIL import ImageFile
# Enable loading truncated images
ImageFile.LOAD_TRUNCATED_IMAGES = True

# 3. Loading and Preparing the Dataset

from datasets import load_dataset
dataset = load_dataset("ylecun/mnist", split="train")

from pathlib import Path

# Collect image references and labels into parallel lists. For path-based
# datasets these would be file paths; for MNIST the entries are in-memory PIL
# objects, so this DataFrame is only used to demonstrate the balancing step below.
file_names = []
labels = []

for example in dataset:
    file_path = str(example['image'])
    label = example['label']

    file_names.append(file_path)
    labels.append(label)

print(len(file_names), len(labels))

# 4. Creating a DataFrame and Balancing the Dataset & Working with a Subset of Labels

df = pd.DataFrame.from_dict({"image": file_names, "label": labels})
print(df.shape)

df.head()
df['label'].unique()

# Balance the label distribution with random oversampling. MNIST is already
# nearly balanced, so this mainly demonstrates the workflow for imbalanced,
# domain-specific datasets.
y = df[['label']]
df = df.drop(['label'], axis=1)
ros = RandomOverSampler(random_state=83)
df, y_resampled = ros.fit_resample(df, y)
del y
df['label'] = y_resampled
del y_resampled
gc.collect()

labels_subset = labels[:5]
print(labels_subset)

#labels_list = ['example_label_0', 'example_label_1'................,'example_label_n-1']
labels_list = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']

label2id, id2label = {}, {}
for i, label in enumerate(labels_list):
    label2id[label] = i
    id2label[i] = label

ClassLabels = ClassLabel(num_classes=len(labels_list), names=labels_list)

print("Mapping of IDs to Labels:", id2label, '\n')
print("Mapping of Labels to IDs:", label2id)
     
# 5. Mapping and Casting Labels

def map_label2id(example):
    # Convert string class names to integer IDs. Labels that are already
    # integers (as in MNIST) are passed through unchanged.
    example['label'] = [
        label if isinstance(label, int) else ClassLabels.str2int(label)
        for label in example['label']
    ]
    return example

# 6. Splitting the Dataset

dataset = dataset.map(map_label2id, batched=True)
dataset = dataset.cast_column('label', ClassLabels)
dataset = dataset.train_test_split(test_size=0.4, shuffle=True, stratify_by_column="label")

train_data = dataset['train']
test_data = dataset['test']

# 7. Setting Up the Model and Processor


model_str = "google/siglip2-base-patch16-224"
processor = AutoImageProcessor.from_pretrained(model_str)

# Extract preprocessing parameters
image_mean, image_std = processor.image_mean, processor.image_std
size = processor.size["height"]

# 8. Defining Data Transformations

# Define training transformations
_train_transforms = Compose([
    Resize((size, size)),
    RandomRotation(90),
    RandomAdjustSharpness(2),
    ToTensor(),
    Normalize(mean=image_mean, std=image_std)
])

# Define validation transformations
_val_transforms = Compose([
    Resize((size, size)),
    ToTensor(),
    Normalize(mean=image_mean, std=image_std)
])

# 9. Applying Transformations to the Dataset


# Apply transformations to dataset
def train_transforms(examples):
    examples['pixel_values'] = [_train_transforms(image.convert("RGB")) for image in examples['image']]
    return examples

def val_transforms(examples):
    examples['pixel_values'] = [_val_transforms(image.convert("RGB")) for image in examples['image']]
    return examples

# Assuming train_data and test_data are loaded datasets
train_data.set_transform(train_transforms)
test_data.set_transform(val_transforms)

# 10. Creating a Data Collator

def collate_fn(examples):
    pixel_values = torch.stack([example["pixel_values"] for example in examples])
    labels = torch.tensor([example['label'] for example in examples])
    return {"pixel_values": pixel_values, "labels": labels}

# 11. Initializing the Model

model = SiglipForImageClassification.from_pretrained(model_str, num_labels=len(labels_list))
model.config.id2label = id2label
model.config.label2id = label2id

print(model.num_parameters(only_trainable=True) / 1e6)
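
# The line above prints the trainable parameter count in millions. For
# google/siglip2-base-patch16-224, SiglipForImageClassification trains the
# vision encoder plus a small classification head, so the value should be
# roughly 90-95, consistent with the <100M size noted in the introduction.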

# 12. Defining Metrics and the Compute Function

accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    predictions = eval_pred.predictions
    label_ids = eval_pred.label_ids

    predicted_labels = predictions.argmax(axis=1)
    acc_score = accuracy.compute(predictions=predicted_labels, references=label_ids)['accuracy']

    return {
        "accuracy": acc_score
    }

# 13. Setting Up Training Arguments

args = TrainingArguments(
    output_dir="siglip2-image-classification/",
    logging_dir='./logs',
    eval_strategy="epoch",  # renamed from the deprecated `evaluation_strategy` argument
    learning_rate=2e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=8,
    num_train_epochs=2,
    weight_decay=0.02,
    warmup_steps=50,
    remove_unused_columns=False,
    save_strategy='epoch',
    load_best_model_at_end=True,
    save_total_limit=4,
    report_to="none"
)

# 14. Initializing the Trainer

trainer = Trainer(
    model,
    args,
    train_dataset=train_data,
    eval_dataset=test_data,
    data_collator=collate_fn,
    compute_metrics=compute_metrics,
    processing_class=processor,  # replaces the deprecated `tokenizer` argument
)
     
# 15. Evaluating, Training, and Predicting


trainer.evaluate()

trainer.train()

trainer.evaluate()

outputs = trainer.predict(test_data)
print(outputs.metrics)

# 16. Computing Additional Metrics and Plotting the Confusion Matrix

y_true = outputs.label_ids
y_pred = outputs.predictions.argmax(1)

def plot_confusion_matrix(cm, classes, title='Confusion Matrix', cmap=plt.cm.Reds, figsize=(10, 8)):

    plt.figure(figsize=figsize)

    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    plt.colorbar()

    tick_marks = np.arange(len(classes))
    plt.xticks(tick_marks, classes, rotation=90)
    plt.yticks(tick_marks, classes)

    fmt = '.0f'
    thresh = cm.max() / 2.0
    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
        plt.text(j, i, format(cm[i, j], fmt), horizontalalignment="center", color="white" if cm[i, j] > thresh else "black")

    plt.ylabel('True label')
    plt.xlabel('Predicted label')
    plt.tight_layout()
    plt.show()

accuracy = accuracy_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred, average='macro')

print(f"Accuracy: {accuracy:.4f}")
print(f"F1 Score: {f1:.4f}")

if len(labels_list) <= 150:
    cm = confusion_matrix(y_true, y_pred)
    plot_confusion_matrix(cm, labels_list, figsize=(8, 6))

print()
print("Classification report:")
print()
print(classification_report(y_true, y_pred, target_names=labels_list, digits=4))

# 17. Saving the Model and Uploading to Hugging Face Hub

trainer.save_model() # Trained Model Uploaded to Hugging Face: https://huggingface.co/prithivMLmods/Mnist-Digits-SigLIP2
     
# 18. Login to Hugging Face Hub
#
# Use the Hugging Face Hub API to authenticate your session in the notebook.
# This allows you to push models, datasets, and other assets directly to your account.

#from huggingface_hub import notebook_login, HfApi
#notebook_login()

# 19. Upload the Model to Hugging Face Hub
#
# Once fine-tuning is complete, you can upload the trained SigLIP 2 model to the Hugging Face Hub.
# This enables sharing, versioning, and easy access for future inference or collaboration.

#api = HfApi()
#repo_id = f"prithivMLmods/Mnist-Digits-SigLIP2"

#api.upload_folder(
#    folder_path="siglip2-image-classification/",  # Local folder containing the fine-tuned model
#    path_in_repo=".",                             # Path inside the Hugging Face repository
#    repo_id=repo_id,                              # Repository ID (username/repo_name)
#    repo_type="model",                            # Specify this is a model repository
#    revision="main"                               # Branch or revision to push to
#)

#print(f"Model uploaded to https://huggingface.co/{repo_id}")


GitHub Gist (Code) : https://gist.github.com/PRITHIVSAKTHIUR/e3c67b9fbcaf397b6639b018d457fd08

[6.] Quick Start with Transformers

Install the packages

%%capture
!pip install gradio torch
!pip install transformers
!pip install huggingface_hub

Inference Code

import gradio as gr
from transformers import AutoImageProcessor, SiglipForImageClassification
from PIL import Image
import torch

# Load model and processor
model_name = "your-model-name-here"  # Replace with your model path if different
model = SiglipForImageClassification.from_pretrained(model_name)
processor = AutoImageProcessor.from_pretrained(model_name)

# Label mapping
id2label = {  # Replace with your model's label configuration
    "0": "map-label-1",
    "1": "map-label-2",
    "2": "map-label-3",
    "3": "map-label-4",
    "4": "map-label-5"
}

def classify_confidence(image):
    image = Image.fromarray(image).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")

    with torch.no_grad():
        outputs = model(**inputs)
        logits = outputs.logits
        probs = torch.nn.functional.softmax(logits, dim=1).squeeze().tolist()

    prediction = {
        id2label[str(i)]: round(probs[i], 3) for i in range(len(probs))
    }

    return prediction

# Gradio Interface
iface = gr.Interface(
    fn=classify_confidence,
    inputs=gr.Image(type="numpy"),
    outputs=gr.Label(num_top_classes=2, label="Predicted Content Type"),
    title="Your-App-Name-Here"
)

if __name__ == "__main__":
    iface.launch()

[7.] Acknowledgements

SigLIP 2 - Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features (arXiv:2502.14786)
Hugging Face Transformers - Open-source library for state-of-the-art NLP and vision models (GitHub)
PyTorch - Deep learning framework used for training and model development (pytorch.org)
Organization - StrangerGuardHF on Hugging Face (Visit)
Model Page - Image-Guard-2.0-Post0.1 (Visit)
Page - PrithivMLmods profile on Hugging Face (Visit)

[8.] Conclusion


Image-Guard-2.0 represents a significant step toward developing efficient, open-source image safety and content moderation systems. Built upon the SigLIP 2 vision-language architecture, it demonstrates strong multi-label classification capabilities across diverse visual domains — ranging from anime and realism to AI-generated and explicit content. Its lightweight design enables effective deployment in real-time scenarios, offering a balanced trade-off between performance, interpretability, and computational efficiency.

While the model performs robustly in most classification and moderation scenarios, it is important to note that Image-Guard-2.0 is an experimental first step toward dedicated content moderation models and may therefore exhibit sub-optimal performance in certain complex or ambiguous cases. Nevertheless, its adaptable framework, modular fine-tuning approach, and integration-ready design make it a valuable foundation for future advancements in AI-driven visual safety and ethical content filtering systems.
