prithivMLmods committed on
Commit b47dcde · verified · 1 Parent(s): 9f14ba2

Update README.md

Files changed (1)
  1. README.md +571 -3
README.md CHANGED

@@ -11,9 +11,9 @@ base_model:
  base_model_relation: merge
  pipeline_tag: text-to-image
  ---
- # **Flux.1-krea-Merge-Transformer (Flux.1-Dev + Flux.1-Krea-Dev)**
+ # **Flux.1-Krea-Merged-Dev (Flux.1-Dev + Flux.1-Krea-Dev)**
 
- > The Flux.1-krea-Merge-Transformer repository contains merged parameters combining two advanced image generation models: black-forest-labs/FLUX.1-dev and black-forest-labs/FLUX.1-Krea-dev. This merged model integrates the capabilities of the rectified flow transformer FLUX.1-dev, known for competitive prompt following and high-quality outputs, with FLUX.1-Krea-dev, a guidance distilled model emphasizing aesthetics and photorealism. The result is a unified model that balances quality, aesthetic control, and efficiency for text-to-image generation tasks. The repository includes instructions for loading, merging, and using the fused parameters via the Diffusers library, enabling users to generate images from text prompts through the FluxPipeline with enhanced performance and visual quality. This merge facilitates leveraging strengths from both base models in a single, accessible implementation for research and creative workflows.
+ > The Flux.1-Krea-Merged-Dev repository contains merged parameters combining two advanced image generation models: black-forest-labs/FLUX.1-dev and black-forest-labs/FLUX.1-Krea-dev. This merged model integrates the capabilities of the rectified flow transformer FLUX.1-dev, known for competitive prompt following and high-quality outputs, with FLUX.1-Krea-dev, a guidance distilled model emphasizing aesthetics and photorealism. The result is a unified model that balances quality, aesthetic control, and efficiency for text-to-image generation tasks. The repository includes instructions for loading, merging, and using the fused parameters via the Diffusers library, enabling users to generate images from text prompts through the FluxPipeline with enhanced performance and visual quality. This merge facilitates leveraging strengths from both base models in a single, accessible implementation for research and creative workflows.
 
  ## **Sub-Memory-efficient merging code (Flux.1-Dev + Flux.1-Krea-Dev)**
 
@@ -106,7 +106,7 @@ model.to(torch.bfloat16).save_pretrained("merged/transformer")
 
  ```py
  api = HfApi()
- repo_id = "prithivMLmods/Flux.1-krea-Merge-Transformer"
+ repo_id = "prithivMLmods/Flux.1-Krea-Merged-Dev"
 
  api.upload_folder(
      folder_path="merged/",
@@ -117,6 +117,574 @@ api.upload_folder(
  )
  ```
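Once the fused transformer has been saved to `merged/transformer` (and optionally uploaded as shown above), it can be slotted into a standard FluxPipeline. A minimal sketch, assuming the local `merged/` output from the merging code and a CUDA device; everything other than the repo ids and the `merged/transformer` path is illustrative:

```py
import torch
from diffusers import FluxPipeline, FluxTransformer2DModel

# Load the fused transformer saved by the merging step, then attach it to the FLUX.1-dev pipeline.
transformer = FluxTransformer2DModel.from_pretrained(
    "merged/transformer", torch_dtype=torch.bfloat16
)
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
).to("cuda")

image = pipe("a cinematic photo of a lighthouse at dusk", guidance_scale=3.5).images[0]
image.save("merged_preview.png")
```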
+ ## Quick Start with Transformers and Gradio
+
+ > [!NOTE]
+ > COMPARATOR: FLUX.1-Dev (Realism) and FLUX.1-Krea-Merged-Dev (Flux.1-Dev + Flux.1-Krea-Dev)
+
+ **Installing Required Packages**
+
+ ```py
+ %%capture
+ !pip install git+https://github.com/huggingface/transformers.git
+ !pip install git+https://github.com/huggingface/diffusers.git
+ !pip install git+https://github.com/huggingface/peft.git
+ !pip install git+https://github.com/huggingface/accelerate.git
+ !pip install safetensors huggingface_hub hf_xet
+ ```
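If the nightly GitHub builds are not required, recent stable releases should also work; a hedged alternative install cell (unpinned, with `gradio` added for the demo below):

```py
# Assumption: current PyPI releases are sufficient for FluxPipeline and the Gradio comparator.
!pip install -U transformers diffusers peft accelerate gradio safetensors huggingface_hub hf_xet
```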
+
+ **hf-login**
+
+ ```py
+ from huggingface_hub import notebook_login, HfApi
+ notebook_login()
+ ```
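Outside a notebook, the same authentication can be done non-interactively; a minimal sketch, assuming the access token is exposed through an environment variable (the `HF_TOKEN` name here is an assumption):

```py
import os
from huggingface_hub import login

# Assumption: the token was exported beforehand, e.g. `export HF_TOKEN=hf_xxx`.
login(token=os.environ["HF_TOKEN"])
```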
143
+
144
+ ---
145
+
146
+ `@hardware-accelerator : H200`
147
+
148
+ <details>
149
+ <summary>app.py</summary>
150
+
151
+ ```py
152
+ import spaces
153
+ import gradio as gr
154
+ import torch
155
+ from PIL import Image
156
+ from diffusers import DiffusionPipeline, AutoencoderTiny, AutoencoderKL
157
+ import random
158
+ import uuid
159
+ from typing import Tuple, Union, List, Optional, Any, Dict
160
+ import numpy as np
161
+ import time
162
+ import zipfile
163
+ from transformers import CLIPTextModel, CLIPTokenizer, T5EncoderModel, T5TokenizerFast
164
+
165
+ # Description for the app
166
+ DESCRIPTION = """## flux comparator hpc/."""
167
+
168
+ # Helper functions
169
+ def save_image(img):
170
+ unique_name = str(uuid.uuid4()) + ".png"
171
+ img.save(unique_name)
172
+ return unique_name
173
+
174
+ def randomize_seed_fn(seed: int, randomize_seed: bool) -> int:
175
+ if randomize_seed:
176
+ seed = random.randint(0, MAX_SEED)
177
+ return seed
178
+
179
+ MAX_SEED = np.iinfo(np.int32).max
180
+ MAX_IMAGE_SIZE = 2048
181
+
182
+ # Load pipelines for both models
183
+ # Flux.1-dev-realism
184
+ base_model_dev = "black-forest-labs/FLUX.1-dev"
185
+ pipe_dev = DiffusionPipeline.from_pretrained(base_model_dev, torch_dtype=torch.bfloat16)
186
+ lora_repo = "strangerzonehf/Flux-Super-Realism-LoRA"
187
+ trigger_word = "Super Realism"
188
+ pipe_dev.load_lora_weights(lora_repo)
189
+ pipe_dev.to("cuda")
190
+
191
+ # Flux.1-krea
192
+ dtype = torch.bfloat16
193
+ device = "cuda" if torch.cuda.is_available() else "cpu"
194
+ taef1 = AutoencoderTiny.from_pretrained("madebyollin/taef1", torch_dtype=dtype).to(device)
195
+ good_vae = AutoencoderKL.from_pretrained("prithivMLmods/Flux.1-Krea-Merged-Dev", subfolder="vae", torch_dtype=dtype).to(device)
196
+ pipe_krea = DiffusionPipeline.from_pretrained("prithivMLmods/Flux.1-Krea-Merged-Dev", torch_dtype=dtype, vae=taef1).to(device)
197
+
198
+ # Define the flux_pipe_call_that_returns_an_iterable_of_images for flux.1-krea
199
+ @torch.inference_mode()
200
+ def flux_pipe_call_that_returns_an_iterable_of_images(
201
+ self,
202
+ prompt: Union[str, List[str]] = None,
203
+ prompt_2: Optional[Union[str, List[str]]] = None,
204
+ height: Optional[int] = None,
205
+ width: Optional[int] = None,
206
+ num_inference_steps: int = 28,
207
+ timesteps: List[int] = None,
208
+ guidance_scale: float = 3.5,
209
+ num_images_per_prompt: Optional[int] = 1,
210
+ generator: Optional[Union[torch.Generator, List[torch.Generator]]] = None,
211
+ latents: Optional[torch.FloatTensor] = None,
212
+ prompt_embeds: Optional[torch.FloatTensor] = None,
213
+ pooled_prompt_embeds: Optional[torch.FloatTensor] = None,
214
+ output_type: Optional[str] = "pil",
215
+ return_dict: bool = True,
216
+ joint_attention_kwargs: Optional[Dict[str, Any]] = None,
217
+ max_sequence_length: int = 512,
218
+ good_vae: Optional[Any] = None,
219
+ ):
220
+ height = height or self.default_sample_size * self.vae_scale_factor
221
+ width = width or self.default_sample_size * self.vae_scale_factor
222
+
223
+ self.check_inputs(
224
+ prompt,
225
+ prompt_2,
226
+ height,
227
+ width,
228
+ prompt_embeds=prompt_embeds,
229
+ pooled_prompt_embeds=pooled_prompt_embeds,
230
+ max_sequence_length=max_sequence_length,
231
+ )
232
+
233
+ self._guidance_scale = guidance_scale
234
+ self._joint_attention_kwargs = joint_attention_kwargs
235
+ self._interrupt = False
236
+
237
+ batch_size = 1 if isinstance(prompt, str) else len(prompt)
238
+ device = self._execution_device
239
+
240
+ lora_scale = joint_attention_kwargs.get("scale", None) if joint_attention_kwargs is not None else None
241
+ prompt_embeds, pooled_prompt_embeds, text_ids = self.encode_prompt(
242
+ prompt=prompt,
243
+ prompt_2=prompt_2,
244
+ prompt_embeds=prompt_embeds,
245
+ pooled_prompt_embeds=pooled_prompt_embeds,
246
+ device=device,
247
+ num_images_per_prompt=num_images_per_prompt,
248
+ max_sequence_length=max_sequence_length,
249
+ lora_scale=lora_scale,
250
+ )
251
+
252
+ num_channels_latents = self.transformer.config.in_channels // 4
253
+ latents, latent_image_ids = self.prepare_latents(
254
+ batch_size * num_images_per_prompt,
255
+ num_channels_latents,
256
+ height,
257
+ width,
258
+ prompt_embeds.dtype,
259
+ device,
260
+ generator,
261
+ latents,
262
+ )
263
+
264
+ sigmas = np.linspace(1.0, 1 / num_inference_steps, num_inference_steps)
265
+ image_seq_len = latents.shape[1]
266
+ mu = calculate_shift(
267
+ image_seq_len,
268
+ self.scheduler.config.base_image_seq_len,
269
+ self.scheduler.config.max_image_seq_len,
270
+ self.scheduler.config.base_shift,
271
+ self.scheduler.config.max_shift,
272
+ )
273
+ timesteps, num_inference_steps = retrieve_timesteps(
274
+ self.scheduler,
275
+ num_inference_steps,
276
+ device,
277
+ timesteps,
278
+ sigmas,
279
+ mu=mu,
280
+ )
281
+ self._num_timesteps = len(timesteps)
282
+
283
+ guidance = torch.full([1], guidance_scale, device=device, dtype=torch.float32).expand(latents.shape[0]) if self.transformer.config.guidance_embeds else None
284
+
285
+ for i, t in enumerate(timesteps):
286
+ if self.interrupt:
287
+ continue
288
+
289
+ timestep = t.expand(latents.shape[0]).to(latents.dtype)
290
+
291
+ noise_pred = self.transformer(
292
+ hidden_states=latents,
293
+ timestep=timestep / 1000,
294
+ guidance=guidance,
295
+ pooled_projections=pooled_prompt_embeds,
296
+ encoder_hidden_states=prompt_embeds,
297
+ txt_ids=text_ids,
298
+ img_ids=latent_image_ids,
299
+ joint_attention_kwargs=self.joint_attention_kwargs,
300
+ return_dict=False,
301
+ )[0]
302
+
303
+ latents_for_image = self._unpack_latents(latents, height, width, self.vae_scale_factor)
304
+ latents_for_image = (latents_for_image / self.vae.config.scaling_factor) + self.vae.config.shift_factor
305
+ image = self.vae.decode(latents_for_image, return_dict=False)[0]
306
+ yield self.image_processor.postprocess(image, output_type=output_type)[0]
307
+
308
+ latents = self.scheduler.step(noise_pred, t, latents, return_dict=False)[0]
309
+ torch.cuda.empty_cache()
310
+
311
+ latents = self._unpack_latents(latents, height, width, self.vae_scale_factor)
312
+ latents = (latents / good_vae.config.scaling_factor) + good_vae.config.shift_factor
313
+ image = good_vae.decode(latents, return_dict=False)[0]
314
+ self.maybe_free_model_hooks()
315
+ torch.cuda.empty_cache()
316
+ yield self.image_processor.postprocess(image, output_type=output_type)[0]
317
+
318
+ pipe_krea.flux_pipe_call_that_returns_an_iterable_of_images = flux_pipe_call_that_returns_an_iterable_of_images.__get__(pipe_krea)
319
+
320
+ # Helper functions for flux.1-krea
321
+ def calculate_shift(
322
+ image_seq_len,
323
+ base_seq_len: int = 256,
324
+ max_seq_len: int = 4096,
325
+ base_shift: float = 0.5,
326
+ max_shift: float = 1.16,
327
+ ):
328
+ m = (max_shift - base_shift) / (max_seq_len - base_seq_len)
329
+ b = base_shift - m * base_seq_len
330
+ mu = image_seq_len * m + b
331
+ return mu
332
+
333
+ def retrieve_timesteps(
334
+ scheduler,
335
+ num_inference_steps: Optional[int] = None,
336
+ device: Optional[Union[str, torch.device]] = None,
337
+ timesteps: Optional[List[int]] = None,
338
+ sigmas: Optional[List[float]] = None,
339
+ **kwargs,
340
+ ):
341
+ if timesteps is not None and sigmas is not None:
342
+ raise ValueError("Only one of `timesteps` or `sigmas` can be passed.")
343
+ if timesteps is not None:
344
+ scheduler.set_timesteps(timesteps=timesteps, device=device, **kwargs)
345
+ timesteps = scheduler.timesteps
346
+ num_inference_steps = len(timesteps)
347
+ elif sigmas is not None:
348
+ scheduler.set_timesteps(sigmas=sigmas, device=device, **kwargs)
349
+ timesteps = scheduler.timesteps
350
+ num_inference_steps = len(timesteps)
351
+ else:
352
+ scheduler.set_timesteps(num_inference_steps, device=device, **kwargs)
353
+ timesteps = scheduler.timesteps
354
+ return timesteps, num_inference_steps
355
+
356
+ # Styles for flux.1-dev-realism
357
+ style_list = [
358
+ {"name": "3840 x 2160", "prompt": "hyper-realistic 8K image of {prompt}. ultra-detailed, lifelike, high-resolution, sharp, vibrant colors, photorealistic", "negative_prompt": ""},
359
+ {"name": "2560 x 1440", "prompt": "hyper-realistic 4K image of {prompt}. ultra-detailed, lifelike, high-resolution, sharp, vibrant colors, photorealistic", "negative_prompt": ""},
360
+ {"name": "HD+", "prompt": "hyper-realistic 2K image of {prompt}. ultra-detailed, lifelike, high-resolution, sharp, vibrant colors, photorealistic", "negative_prompt": ""},
361
+ {"name": "Style Zero", "prompt": "{prompt}", "negative_prompt": ""},
362
+ ]
363
+
364
+ styles = {k["name"]: (k["prompt"], k["negative_prompt"]) for k in style_list}
365
+ DEFAULT_STYLE_NAME = "3840 x 2160"
366
+ STYLE_NAMES = list(styles.keys())
367
+
368
+ def apply_style(style_name: str, positive: str) -> Tuple[str, str]:
369
+ p, n = styles.get(style_name, styles[DEFAULT_STYLE_NAME])
370
+ return p.replace("{prompt}", positive), n
371
+
372
+ # Generation function for flux.1-dev-realism
373
+ @spaces.GPU
374
+ def generate_dev(
375
+ prompt: str,
376
+ negative_prompt: str = "",
377
+ use_negative_prompt: bool = False,
378
+ seed: int = 0,
379
+ width: int = 1024,
380
+ height: int = 1024,
381
+ guidance_scale: float = 3,
382
+ randomize_seed: bool = False,
383
+ style_name: str = DEFAULT_STYLE_NAME,
384
+ num_inference_steps: int = 30,
385
+ num_images: int = 1,
386
+ zip_images: bool = False,
387
+ progress=gr.Progress(track_tqdm=True),
388
+ ):
389
+ positive_prompt, style_negative_prompt = apply_style(style_name, prompt)
390
+
391
+ if use_negative_prompt:
392
+ final_negative_prompt = style_negative_prompt + " " + negative_prompt
393
+ else:
394
+ final_negative_prompt = style_negative_prompt
395
+
396
+ final_negative_prompt = final_negative_prompt.strip()
397
+
398
+ if trigger_word:
399
+ positive_prompt = f"{trigger_word} {positive_prompt}"
400
+
401
+ seed = int(randomize_seed_fn(seed, randomize_seed))
402
+ generator = torch.Generator(device="cuda").manual_seed(seed)
403
+
404
+ start_time = time.time()
405
+
406
+ images = pipe_dev(
407
+ prompt=positive_prompt,
408
+ negative_prompt=final_negative_prompt if final_negative_prompt else None,
409
+ width=width,
410
+ height=height,
411
+ guidance_scale=guidance_scale,
412
+ num_inference_steps=num_inference_steps,
413
+ num_images_per_prompt=num_images,
414
+ generator=generator,
415
+ output_type="pil",
416
+ ).images
417
+
418
+ end_time = time.time()
419
+ duration = end_time - start_time
420
+
421
+ image_paths = [save_image(img) for img in images]
422
+
423
+ zip_path = None
424
+ if zip_images:
425
+ zip_name = str(uuid.uuid4()) + ".zip"
426
+ with zipfile.ZipFile(zip_name, 'w') as zipf:
427
+ for i, img_path in enumerate(image_paths):
428
+ zipf.write(img_path, arcname=f"Img_{i}.png")
429
+ zip_path = zip_name
430
+
431
+ return image_paths, seed, f"{duration:.2f}", zip_path
432
+
433
+ # Generation function for flux.1-krea
434
+ @spaces.GPU
435
+ def generate_krea(
436
+ prompt: str,
437
+ seed: int = 0,
438
+ width: int = 1024,
439
+ height: int = 1024,
440
+ guidance_scale: float = 4.5,
441
+ randomize_seed: bool = False,
442
+ num_inference_steps: int = 28,
443
+ num_images: int = 1,
444
+ zip_images: bool = False,
445
+ progress=gr.Progress(track_tqdm=True),
446
+ ):
447
+ if randomize_seed:
448
+ seed = random.randint(0, MAX_SEED)
449
+ generator = torch.Generator().manual_seed(seed)
450
+
451
+ start_time = time.time()
452
+
453
+ images = []
454
+ for _ in range(num_images):
455
+ final_img = list(pipe_krea.flux_pipe_call_that_returns_an_iterable_of_images(
456
+ prompt=prompt,
457
+ guidance_scale=guidance_scale,
458
+ num_inference_steps=num_inference_steps,
459
+ width=width,
460
+ height=height,
461
+ generator=generator,
462
+ output_type="pil",
463
+ good_vae=good_vae,
464
+ ))[-1] # Take the final image only
465
+ images.append(final_img)
466
+
467
+ end_time = time.time()
468
+ duration = end_time - start_time
469
+
470
+ image_paths = [save_image(img) for img in images]
471
+
472
+ zip_path = None
473
+ if zip_images:
474
+ zip_name = str(uuid.uuid4()) + ".zip"
475
+ with zipfile.ZipFile(zip_name, 'w') as zipf:
476
+ for i, img_path in enumerate(image_paths):
477
+ zipf.write(img_path, arcname=f"Img_{i}.png")
478
+ zip_path = zip_name
479
+
480
+ return image_paths, seed, f"{duration:.2f}", zip_path
481
+
482
+ # Main generation function to handle model choice
483
+ @spaces.GPU
484
+ def generate(
485
+ model_choice: str,
486
+ prompt: str,
487
+ negative_prompt: str = "",
488
+ use_negative_prompt: bool = False,
489
+ seed: int = 0,
490
+ width: int = 1024,
491
+ height: int = 1024,
492
+ guidance_scale: float = 3,
493
+ randomize_seed: bool = False,
494
+ style_name: str = DEFAULT_STYLE_NAME,
495
+ num_inference_steps: int = 30,
496
+ num_images: int = 1,
497
+ zip_images: bool = False,
498
+ progress=gr.Progress(track_tqdm=True),
499
+ ):
500
+ if model_choice == "flux.1-dev-realism":
501
+ return generate_dev(
502
+ prompt=prompt,
503
+ negative_prompt=negative_prompt,
504
+ use_negative_prompt=use_negative_prompt,
505
+ seed=seed,
506
+ width=width,
507
+ height=height,
508
+ guidance_scale=guidance_scale,
509
+ randomize_seed=randomize_seed,
510
+ style_name=style_name,
511
+ num_inference_steps=num_inference_steps,
512
+ num_images=num_images,
513
+ zip_images=zip_images,
514
+ progress=progress,
515
+ )
516
+ elif model_choice == "flux.1-krea-merged-dev":
517
+ return generate_krea(
518
+ prompt=prompt,
519
+ seed=seed,
520
+ width=width,
521
+ height=height,
522
+ guidance_scale=guidance_scale,
523
+ randomize_seed=randomize_seed,
524
+ num_inference_steps=num_inference_steps,
525
+ num_images=num_images,
526
+ zip_images=zip_images,
527
+ progress=progress,
528
+ )
529
+ else:
530
+ raise ValueError("Invalid model choice")
531
+
532
+ # Examples (tailored for flux.1-dev-realism)
533
+ examples = [
534
+ "An attractive young woman with blue eyes lying face down on the bed, in the style of animated gifs, light white and light amber, jagged edges, the snapshot aesthetic, timeless beauty, goosepunk, sunrays shine upon it --no freckles --chaos 65 --ar 1:2 --profile yruxpc2 --stylize 750 --v 6.1",
535
+ "Headshot of handsome young man, wearing dark gray sweater with buttons and big shawl collar, brown hair and short beard, serious look on his face, black background, soft studio lighting, portrait photography --ar 85:128 --v 6.0 --style",
536
+ "Purple Dreamy, a medium-angle shot of a young woman with long brown hair, wearing a pair of eye-level glasses, stands in front of a backdrop of purple and white lights.",
537
+ "High-resolution photograph, woman, UHD, photorealistic, shot on a Sony A7III --chaos 20 --ar 1:2 --style raw --stylize 250"
538
+ ]
539
+
540
+ css = '''
541
+ .gradio-container {
542
+ max-width: 590px !important;
543
+ margin: 0 auto !important;
544
+ }
545
+ h1 {
546
+ text-align: center;
547
+ }
548
+ footer {
549
+ visibility: hidden;
550
+ }
551
+ '''
552
+
553
+ # Gradio interface
554
+ with gr.Blocks(css=css, theme="bethecloud/storj_theme") as demo:
555
+ gr.Markdown(DESCRIPTION)
556
+ with gr.Row():
557
+ prompt = gr.Text(
558
+ label="Prompt",
559
+ show_label=False,
560
+ max_lines=1,
561
+ placeholder="Enter your prompt",
562
+ container=False,
563
+ )
564
+ run_button = gr.Button("Run", scale=0, variant="primary")
565
+ result = gr.Gallery(label="Result", columns=1, show_label=False, preview=True)
566
+
567
+ with gr.Row():
568
+ # Model choice radio button above additional options
569
+ model_choice = gr.Radio(
570
+ choices=["flux.1-krea-merged-dev", "flux.1-dev-realism"],
571
+ label="Select Model",
572
+ value="flux.1-krea-merged-dev"
573
+ )
574
+
575
+ with gr.Accordion("Additional Options", open=False):
576
+ style_selection = gr.Dropdown(
577
+ label="Quality Style (for flux.1-dev-realism only)",
578
+ choices=STYLE_NAMES,
579
+ value=DEFAULT_STYLE_NAME,
580
+ interactive=True,
581
+ )
582
+ use_negative_prompt = gr.Checkbox(label="Use negative prompt (for flux.1-dev-realism only)", value=False)
583
+ negative_prompt = gr.Text(
584
+ label="Negative prompt",
585
+ max_lines=1,
586
+ placeholder="Enter a negative prompt",
587
+ visible=False,
588
+ )
589
+ seed = gr.Slider(
590
+ label="Seed",
591
+ minimum=0,
592
+ maximum=MAX_SEED,
593
+ step=1,
594
+ value=0,
595
+ )
596
+ randomize_seed = gr.Checkbox(label="Randomize seed", value=True)
597
+ with gr.Row():
598
+ width = gr.Slider(
599
+ label="Width",
600
+ minimum=512,
601
+ maximum=2048,
602
+ step=64,
603
+ value=1024,
604
+ )
605
+ height = gr.Slider(
606
+ label="Height",
607
+ minimum=512,
608
+ maximum=2048,
609
+ step=64,
610
+ value=1024,
611
+ )
612
+ guidance_scale = gr.Slider(
613
+ label="Guidance Scale",
614
+ minimum=0.1,
615
+ maximum=20.0,
616
+ step=0.1,
617
+ value=3.5,
618
+ )
619
+ num_inference_steps = gr.Slider(
620
+ label="Number of inference steps",
621
+ minimum=1,
622
+ maximum=40,
623
+ step=1,
624
+ value=28,
625
+ )
626
+ num_images = gr.Slider(
627
+ label="Number of images",
628
+ minimum=1,
629
+ maximum=5,
630
+ step=1,
631
+ value=1,
632
+ )
633
+ zip_images = gr.Checkbox(label="Zip generated images", value=False)
634
+
635
+ gr.Markdown("### Output Information")
636
+ seed_display = gr.Textbox(label="Seed used", interactive=False)
637
+ generation_time = gr.Textbox(label="Generation time (seconds)", interactive=False)
638
+ zip_file = gr.File(label="Download ZIP")
639
+
640
+ gr.Examples(
641
+ examples=examples,
642
+ inputs=prompt,
643
+ outputs=[result, seed_display, generation_time, zip_file],
644
+ fn=generate,
645
+ cache_examples=False,
646
+ )
647
+
648
+ use_negative_prompt.change(
649
+ fn=lambda x: gr.update(visible=x),
650
+ inputs=use_negative_prompt,
651
+ outputs=negative_prompt,
652
+ api_name=False,
653
+ )
654
+
655
+ gr.on(
656
+ triggers=[
657
+ prompt.submit,
658
+ run_button.click,
659
+ ],
660
+ fn=generate,
661
+ inputs=[
662
+ model_choice,
663
+ prompt,
664
+ negative_prompt,
665
+ use_negative_prompt,
666
+ seed,
667
+ width,
668
+ height,
669
+ guidance_scale,
670
+ randomize_seed,
671
+ style_selection,
672
+ num_inference_steps,
673
+ num_images,
674
+ zip_images,
675
+ ],
676
+ outputs=[result, seed_display, generation_time, zip_file],
677
+ api_name="run",
678
+ )
679
+
680
+ if __name__ == "__main__":
681
+ demo.queue(max_size=30).launch(mcp_server=True, ssr_mode=False, show_error=True)
682
+ ```
+
+ </details>
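Because the Blocks app registers its main event with `api_name="run"`, a deployed copy of this demo can also be driven programmatically. A minimal sketch using `gradio_client`; the Space id `prithivMLmods/flux-comparator` is an assumption (substitute the actual Space id or a local URL such as `http://127.0.0.1:7860`):

```py
from gradio_client import Client

# Assumption: the demo above is running as a Hugging Face Space or locally.
client = Client("prithivMLmods/flux-comparator")

# Positional inputs follow the order wired into gr.on(...): model_choice, prompt, negative_prompt,
# use_negative_prompt, seed, width, height, guidance_scale, randomize_seed, style_name,
# num_inference_steps, num_images, zip_images.
result = client.predict(
    "flux.1-krea-merged-dev",        # model_choice
    "a sunlit alpine lake at dawn",  # prompt
    "",                              # negative_prompt
    False,                           # use_negative_prompt
    0,                               # seed
    1024,                            # width
    1024,                            # height
    3.5,                             # guidance_scale
    True,                            # randomize_seed
    "Style Zero",                    # style_name (ignored by the Krea branch)
    28,                              # num_inference_steps
    1,                               # num_images
    False,                           # zip_images
    api_name="/run",
)
print(result)  # (gallery, seed_used, generation_time, zip_file)
```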
+
+ ---
+
  ## For more information, visit the documentation.
 
  > Flux is a suite of state-of-the-art text-to-image generation models based on diffusion transformers, developed by Black Forest Labs. The models are designed for high-quality generative image tasks, including text-to-image, inpainting, outpainting, and advanced structure or depth-controlled workflows. Flux is available through the Hugging Face diffusers library.
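
As a quick sanity check of the merged checkpoint itself, outside the Gradio comparator, the published repository can be loaded directly with Diffusers. A minimal sketch, assuming a CUDA device with enough memory for bfloat16 inference; the prompt and filename are illustrative:

```py
import torch
from diffusers import FluxPipeline

# Load the merged checkpoint published by the upload step above.
pipe = FluxPipeline.from_pretrained(
    "prithivMLmods/Flux.1-Krea-Merged-Dev",
    torch_dtype=torch.bfloat16,
).to("cuda")

image = pipe(
    "a photorealistic portrait, soft window light",
    num_inference_steps=28,
    guidance_scale=3.5,
).images[0]
image.save("flux_krea_merged_sample.png")
```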