Collections
Discover the best community collections!
Collections including paper arxiv:2408.00760

- Adding Conditional Control to Text-to-Image Diffusion Models
  Paper • 2302.05543 • Published • 57
- Smoothed Energy Guidance: Guiding Diffusion Models with Reduced Energy Curvature of Attention
  Paper • 2408.00760 • Published • 8
- MagicQuill: An Intelligent Interactive Image Editing System
  Paper • 2411.09703 • Published • 78
- BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion
  Paper • 2403.06976 • Published • 2

- Dynamic Typography: Bringing Words to Life
  Paper • 2404.11614 • Published • 46
- Scene Coordinate Reconstruction: Posing of Image Collections via Incremental Learning of a Relocalizer
  Paper • 2404.14351 • Published • 6
- BlenderAlchemy: Editing 3D Graphics with Vision-Language Models
  Paper • 2404.17672 • Published • 19
- Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation
  Paper • 2406.06525 • Published • 71

- Kandinsky: an Improved Text-to-Image Synthesis with Image Prior and Latent Diffusion
  Paper • 2310.03502 • Published • 78
- Transferable and Principled Efficiency for Open-Vocabulary Segmentation
  Paper • 2404.07448 • Published • 12
- Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models
  Paper • 2404.07973 • Published • 32
- COCONut: Modernizing COCO Segmentation
  Paper • 2404.08639 • Published • 30

- FaceChain-SuDe: Building Derived Class to Inherit Category Attributes for One-shot Subject-Driven Generation
  Paper • 2403.06775 • Published • 5
- An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
  Paper • 2010.11929 • Published • 15
- Data Incubation -- Synthesizing Missing Data for Handwriting Recognition
  Paper • 2110.07040 • Published • 2
- A Mixture of Expert Approach for Low-Cost Customization of Deep Neural Networks
  Paper • 1811.00056 • Published • 2

- Classifier-Free Diffusion Guidance
  Paper • 2207.12598 • Published • 4
- Adding Conditional Control to Text-to-Image Diffusion Models
  Paper • 2302.05543 • Published • 57
- Applying Guidance in a Limited Interval Improves Sample and Distribution Quality in Diffusion Models
  Paper • 2404.07724 • Published • 14
- Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation
  Paper • 2406.06525 • Published • 71

- Kandinsky: an Improved Text-to-Image Synthesis with Image Prior and Latent Diffusion
  Paper • 2310.03502 • Published • 78
- GLIGEN: Open-Set Grounded Text-to-Image Generation
  Paper • 2301.07093 • Published • 4
- Music Consistency Models
  Paper • 2404.13358 • Published • 14
- Align Your Steps: Optimizing Sampling Schedules in Diffusion Models
  Paper • 2404.14507 • Published • 23

- On the Scalability of Diffusion-based Text-to-Image Generation
  Paper • 2404.02883 • Published • 19
- InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation
  Paper • 2404.02733 • Published • 22
- CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching
  Paper • 2404.03653 • Published • 36
- ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback
  Paper • 2404.07987 • Published • 48

- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 151
- ReFT: Reasoning with Reinforced Fine-Tuning
  Paper • 2401.08967 • Published • 31
- Tuning Language Models by Proxy
  Paper • 2401.08565 • Published • 22
- TrustLLM: Trustworthiness in Large Language Models
  Paper • 2401.05561 • Published • 69