Skywork/SkyReels-V2-I2V-1.3B-540P-Diffusers


tolgacangoz committed on Aug 11

Commit fd853c3 (verified) · 1 Parent(s): 2ebddae

Update README.md (#3)


- Update README.md (9d26f88f5e293f98f8aabd78475088f62a67e60f)

Co-authored-by: Tolga Cangöz <[email protected]>

Files changed (1)

README.md +66 -1


@@ -3,6 +3,12 @@ license: other
 license_name: skywork-license
 license_link: LICENSE
 pipeline_tag: image-to-video
 ---
 <p align="center">
   <img src="assets/logo2.png" alt="SkyReels Logo" width="50%">
@@ -49,7 +55,7 @@ The demos above showcase 30-second videos generated using our SkyReels-V2 Diffus
 - [x] Single-GPU & Multi-GPU Inference Code
 - [x] <a href="https://huggingface.co/Skywork/SkyCaptioner-V1">SkyCaptioner-V1</a>: A Video Captioning Model
 - [x] Prompt Enhancer
-- [ ] Diffusers integration
 - [ ] Checkpoints of the 5B Models Series
 - [ ] Checkpoints of the Camera Director Models
 - [ ] Checkpoints of the Step & Guidance Distill Model
@@ -57,6 +63,65 @@ The demos above showcase 30-second videos generated using our SkyReels-V2 Diffus
 ## 🚀 Quickstart
 #### Installation
 ```shell
 # clone the repository.

 license_name: skywork-license
 license_link: LICENSE
 pipeline_tag: image-to-video
+library_name: diffusers
+tags:
+- video
+- video generation
+language:
+- en
 ---
 <p align="center">
   <img src="assets/logo2.png" alt="SkyReels Logo" width="50%">
 - [x] Single-GPU & Multi-GPU Inference Code
 - [x] <a href="https://huggingface.co/Skywork/SkyCaptioner-V1">SkyCaptioner-V1</a>: A Video Captioning Model
 - [x] Prompt Enhancer
+- [x] Diffusers integration
 - [ ] Checkpoints of the 5B Models Series
 - [ ] Checkpoints of the Camera Director Models
 - [ ] Checkpoints of the Step & Guidance Distill Model
 ## 🚀 Quickstart
+SkyReels-V2 can be run directly with 🤗 Diffusers!
+```py
+# pip install ftfy
+import numpy as np
+import torch
+from diffusers import AutoModel, SkyReelsV2ImageToVideoPipeline, UniPCMultistepScheduler
+from diffusers.utils import export_to_video, load_image
+model_id = "Skywork/SkyReels-V2-I2V-1.3B-540P-Diffusers"
+vae = AutoModel.from_pretrained(model_id, subfolder="vae", torch_dtype=torch.float32)
+pipeline = SkyReelsV2ImageToVideoPipeline.from_pretrained(
+    model_id,
+    vae=vae,
+    torch_dtype=torch.bfloat16
+)
+flow_shift = 5.0  # 8.0 for T2V, 5.0 for I2V
+pipeline.scheduler = UniPCMultistepScheduler.from_config(pipeline.scheduler.config, flow_shift=flow_shift)
+pipeline = pipeline.to("cuda")
+first_frame = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/flf2v_input_first_frame.png")
+def aspect_ratio_resize(image, pipeline, max_area=720 * 1280):
+    aspect_ratio = image.height / image.width
+    mod_value = pipeline.vae_scale_factor_spatial * pipeline.transformer.config.patch_size[1]
+    height = round(np.sqrt(max_area * aspect_ratio)) // mod_value * mod_value
+    width = round(np.sqrt(max_area / aspect_ratio)) // mod_value * mod_value
+    image = image.resize((width, height))
+    return image, height, width
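+def center_crop_resize(image, height, width):
+    # Not called in this example
+    # Scale the image so it fully covers the target (width, height)
+    resize_ratio = max(width / image.width, height / image.height)
+    resized_width = round(image.width * resize_ratio)
+    resized_height = round(image.height * resize_ratio)
+    image = image.resize((resized_width, resized_height))
+    # Center-crop to exactly the target size using PIL's (left, upper, right, lower) box
+    left = (resized_width - width) // 2
+    top = (resized_height - height) // 2
+    image = image.crop((left, top, left + width, top + height))
+    return image, height, width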
+first_frame, height, width = aspect_ratio_resize(first_frame, pipeline, max_area=544 * 960)
+prompt = "CG animation style, a small blue bird takes off from the ground, flapping its wings. The bird's feathers are delicate, with a unique pattern on its chest. The background shows a blue sky with white clouds under bright sunshine. The camera follows the bird upward, capturing its flight and the vastness of the sky from a close-up, low-angle perspective."
+output = pipeline(
+    image=first_frame,
+    guidance_scale=5.0,
+    prompt=prompt,
+    num_inference_steps=50,
+    height=544,  # 720 for 720P
+    width=960,   # 1280 for 720P
+    num_frames=97,
+).frames[0]
+export_to_video(output, "video.mp4", fps=24, quality=8)
+```
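The size arithmetic in `aspect_ratio_resize` above can be checked without downloading the model. The following is a minimal sketch, not part of the committed README: `snap_to_multiple` is a hypothetical helper name, and `mod_value=16` is an assumption (a VAE spatial scale factor of 8 times a patch size of 2) rather than a value stated on this page. It shows how a source resolution is mapped to a height and width that keep roughly the same aspect ratio, fit a pixel budget, and are both multiples of `mod_value`.

```python
import math


def snap_to_multiple(height, width, max_area=544 * 960, mod_value=16):
    """Hypothetical helper mirroring the rounding in aspect_ratio_resize.

    mod_value=16 is an assumed value (VAE scale 8 * patch size 2);
    read it from the loaded pipeline in real use.
    """
    aspect_ratio = height / width
    # Solve h * w = max_area with h / w = aspect_ratio, then round each
    # side down to the nearest multiple of mod_value.
    new_height = round(math.sqrt(max_area * aspect_ratio)) // mod_value * mod_value
    new_width = round(math.sqrt(max_area / aspect_ratio)) // mod_value * mod_value
    return new_height, new_width


print(snap_to_multiple(1080, 1920))  # (528, 960)
print(snap_to_multiple(512, 512))    # (720, 720)
```

Note that a 16:9 input comes out as 528x960 rather than 544x960 under these assumptions, because 544x960 is not exactly 16:9 and each side is rounded down independently.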
 #### Installation
 ```shell
 # clone the repository.