---
license: apache-2.0
---

# FastVideo FastWan2.1-T2V-14B-480P-Diffusers

FastVideo Team
Paper | Github
## Model Overview

- This model is jointly finetuned with [DMD](https://arxiv.org/pdf/2405.14867) and [VSA](https://arxiv.org/pdf/2505.13389), based on [Wan-AI/Wan2.1-T2V-14B-Diffusers](https://huggingface.co/Wan-AI/Wan2.1-T2V-14B-Diffusers).
- It supports 3-step inference and achieves up to a 50x speedup.
- Both [finetuning](https://github.com/hao-ai-lab/FastVideo/blob/main/scripts/distill/v1_distill_dmd_wan_VSA.sh) and [inference](https://github.com/hao-ai-lab/FastVideo/blob/main/scripts/inference/v1_inference_wan_dmd.sh) scripts are available in the [FastVideo](https://github.com/hao-ai-lab/FastVideo) repository.
- Try it out on **FastVideo**: we support a wide range of GPUs, from **H100** to **4090**, and also support **Mac** users!
- We use the [FastVideo 480P Synthetic Wan dataset](https://huggingface.co/datasets/FastVideo/Wan-Syn_77x448x832_600k) for training.

If you use the FastWan2.1-T2V-14B-480P-Diffusers model for your research, please cite our papers:

```
@article{zhang2025vsa,
  title={VSA: Faster Video Diffusion with Trainable Sparse Attention},
  author={Zhang, Peiyuan and Huang, Haofeng and Chen, Yongqi and Lin, Will and Liu, Zhengzhong and Stoica, Ion and Xing, Eric and Zhang, Hao},
  journal={arXiv preprint arXiv:2505.13389},
  year={2025}
}

@article{zhang2025fast,
  title={Fast video generation with sliding tile attention},
  author={Zhang, Peiyuan and Chen, Yongqi and Su, Runlong and Ding, Hangliang and Stoica, Ion and Liu, Zhengzhong and Zhang, Hao},
  journal={arXiv preprint arXiv:2502.04507},
  year={2025}
}
```
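For readers who prefer Python to the shell scripts linked above, the following is a minimal sketch of text-to-video generation with the FastVideo package. It rests on several assumptions not stated in this card: that the model is published on the Hub under the id `FastVideo/FastWan2.1-T2V-14B-480P-Diffusers`, that `fastvideo` is installed, and that `VideoGenerator.from_pretrained` and `generate_video` behave as in the FastVideo README. The `generate` helper is a hypothetical name introduced here for illustration. The released inference script remains the reference for the exact 3-step and attention-backend settings.

```python
# Assumed Hub id, inferred from the title of this card; adjust if it differs.
MODEL_ID = "FastVideo/FastWan2.1-T2V-14B-480P-Diffusers"


def generate(prompt: str, num_gpus: int = 1):
    """Hypothetical helper: load the model and generate one video for `prompt`.

    fastvideo is imported lazily so that this file can be imported on a
    machine without the package or a GPU.
    """
    from fastvideo import VideoGenerator  # assumes `pip install fastvideo`

    # Load the pipeline; weights are fetched on first use.
    generator = VideoGenerator.from_pretrained(MODEL_ID, num_gpus=num_gpus)

    # Sampling settings are left at the library defaults here (an assumption);
    # see the linked inference script for the settings used by the authors.
    return generator.generate_video(prompt)
```

Calling `generate("A corgi running along a beach at sunset")` from a script's main entry point would then run generation with these defaults.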