---
pipeline_tag: image-text-to-text
library_name: transformers
---
# LLaDA-V
We introduce LLaDA-V, a competitive diffusion-based vision-language model that outperforms other diffusion-based MLLMs.

It was presented in the paper [LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning](https://huggingface.co/papers/2505.16933).
Project Page: https://ml-gsai.github.io/LLaDA-V-demo/

Code: https://github.com/ML-GSAI/LLaDA-V
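
Below is a minimal usage sketch, assuming the checkpoint follows the standard `transformers` remote-code loading convention for image-text-to-text models. The repo id, processor call, and generation interface shown here are assumptions, not confirmed API; the authoritative inference script lives in the code repository above.

```python
# Hypothetical usage sketch; consult https://github.com/ML-GSAI/LLaDA-V for
# the actual inference entry point. Identifiers below (repo id, generate
# call) are assumptions, not confirmed API.
import requests
from PIL import Image
from transformers import AutoModel, AutoProcessor

model_id = "GSAI-ML/LLaDA-V"  # assumed Hub repo id; check the model page

# trust_remote_code is typically required for custom diffusion-MLLM code
# that ships with the checkpoint rather than with the transformers library.
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModel.from_pretrained(model_id, trust_remote_code=True)

# Load an example image and build multimodal inputs.
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
inputs = processor(images=image, text="Describe this image.", return_tensors="pt")

# Diffusion language models decode by iterative denoising rather than
# left-to-right autoregression; the sampling loop in the repository may
# replace this generic generate() call.
outputs = model.generate(**inputs)
print(processor.batch_decode(outputs, skip_special_tokens=True)[0])
```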