SliderSpace: Decomposing the Visual Capabilities of Diffusion Models
Abstract
SliderSpace is a framework that automatically finds multiple interpretable directions in diffusion models from a single text prompt, enabling better control and discovery of model capabilities across various applications.
We present SliderSpace, a framework for automatically decomposing the visual capabilities of diffusion models into controllable and human-understandable directions. Unlike existing control methods that require a user to specify attributes for each edit direction individually, SliderSpace discovers multiple interpretable and diverse directions simultaneously from a single text prompt. Each direction is trained as a low-rank adaptor, enabling compositional control and the discovery of surprising possibilities in the model's latent space. Through extensive experiments on state-of-the-art diffusion models, we demonstrate SliderSpace's effectiveness across three applications: concept decomposition, artistic style exploration, and diversity enhancement. Our quantitative evaluation shows that SliderSpace-discovered directions effectively decompose the visual structure of the model's knowledge, offering insights into the latent capabilities encoded within diffusion models. User studies further validate that our method produces more diverse and useful variations than the baselines. Our code, data, and trained weights are available at https://sliderspace.baulab.info
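As a rough illustration of what "compositional control" over low-rank adaptors can mean, the sketch below adds several scaled low-rank updates to one base weight matrix, i.e. W' = W + sum_i s_i * (B_i A_i). This is a toy example in plain Python, not the released SliderSpace code: the function names `matmul` and `compose_sliders`, the 2x2 weights, and the rank-1 factors are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]
# A hypothetical "slider": (A, B) with A of shape r x n and B of shape m x r.
Slider = Tuple[Matrix, Matrix]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two matrices stored as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def compose_sliders(base: Matrix, sliders: Sequence[Slider], scales: Sequence[float]) -> Matrix:
    """Return base + sum_i scales[i] * (B_i @ A_i) without modifying base."""
    if len(sliders) != len(scales):
        raise ValueError("need exactly one scale per slider")
    out = [row[:] for row in base]
    for (down_a, up_b), scale in zip(sliders, scales):
        delta = matmul(up_b, down_a)  # low-rank update with the same shape as base
        for i, row in enumerate(delta):
            for j, value in enumerate(row):
                out[i][j] += scale * value
    return out


# Toy data (assumed for illustration): identity base weight and two rank-1 sliders.
base = [[1.0, 0.0], [0.0, 1.0]]
slider_a = ([[1.0, 0.0]], [[1.0], [0.0]])  # B @ A = [[1, 0], [0, 0]]
slider_b = ([[0.0, 1.0]], [[0.0], [1.0]])  # B @ A = [[0, 0], [0, 1]]

# Push slider_a forward by 0.5 and slider_b backward by 1.0 at the same time.
edited = compose_sliders(base, [slider_a, slider_b], [0.5, -1.0])
print(edited)
```

Setting every scale to zero returns a copy of the base weights, so each slider can be dialed in or out independently of the others.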
Community
We propose an unsupervised method to discover creative slider directions by unlocking the model's knowledge about a concept.
All you need to provide is a single prompt, such as "toy", and you can discover hundreds of sliders.
These are the directions that the diffusion model thinks are creative and interesting about a "toy".
You can also use this to explore the model's knowledge. For instance, here are some art-style directions from SDXL that we explored using the single prompt "art in the style of a famous artist".
You can discover directions in any model using SliderSpace: SDv1.4, SDXL, SDXL-Turbo (DMD, Lightning, LCM), FLUX, Pix-art, and many more. It's time to unlock the model's creativity and explore!
This is awesome!
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- DebiasDiff: Debiasing Text-to-image Diffusion Models with Self-discovering Latent Attribute Directions (2024)
- OmniPrism: Learning Disentangled Visual Concept for Image Generation (2024)
- DECOR:Decomposition and Projection of Text Embeddings for Text-to-Image Customization (2024)
- Exploring the latent space of diffusion models directly through singular value decomposition (2025)
- TokenVerse: Versatile Multi-concept Personalization in Token Modulation Space (2025)
- Hierarchical Vision-Language Alignment for Text-to-Image Generation via Diffusion Models (2025)
- A LoRA is Worth a Thousand Pictures (2024)
We are excited to announce the official SliderSpace code release!!
An obvious optimization is decoding latents with taesdxl/taef1, which are two orders of magnitude smaller than the original VAEs.
Yes! We are using taesdxl for training SDXL!
But thanks for pointing to taef1!!
We will update and optimize FLUX training too!