arxiv:2412.00064

DiffGuard: Text-Based Safety Checker for Diffusion Models

Published on Nov 25, 2024

Abstract

AI-generated summary: A novel text-based safety filter, DiffGuard, improves upon existing solutions for filtering AI-generated images, enhancing efficacy by over 14%.

Recent advances in diffusion models have enabled the generation of images from text, with powerful closed-source models like DALL-E and Midjourney leading the way. Open-source alternatives, such as StabilityAI's Stable Diffusion, offer comparable capabilities. These open-source models, hosted on Hugging Face, come equipped with ethical filters designed to prevent the generation of explicit images. This paper first reveals the limitations of those filters and then presents DiffGuard, a novel text-based safety filter that outperforms existing solutions. Our research is driven by the critical need to address the misuse of AI-generated content, especially in the context of information warfare. DiffGuard enhances filtering efficacy, surpassing the best existing filters by over 14%.
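To make the idea of a text-based filter concrete, the following minimal sketch shows where such a check sits: it inspects the prompt before any image is generated, rather than inspecting the output image. This is not DiffGuard's classifier. The blocklist `BLOCKED_TERMS` and the functions `prompt_is_unsafe` and `guarded_generate` are hypothetical names introduced only for illustration, and the keyword match is a stand-in for whatever learned text model a real filter would use.

```python
from typing import Callable, Optional

# Hypothetical blocklist; a stand-in for a learned text classifier.
BLOCKED_TERMS = {"nude", "gore"}


def prompt_is_unsafe(prompt: str) -> bool:
    """Return True if any blocked term appears as a word in the prompt."""
    tokens = {word.strip(".,!?;:\"'()").lower() for word in prompt.split()}
    return not tokens.isdisjoint(BLOCKED_TERMS)


def guarded_generate(
    prompt: str, generate: Callable[[str], str]
) -> Optional[str]:
    """Run `generate` only when the prompt passes the text check.

    `generate` is any callable that maps a prompt to an image (here a
    string placeholder); it is never called for a rejected prompt.
    """
    if prompt_is_unsafe(prompt):
        return None
    return generate(prompt)
```

Because the check runs on text alone, a rejected prompt costs no image-generation compute. A usage example would pass a real text-to-image call as `generate`; here any function from prompt to result works.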

