johannhartmann's Collections
Computer Use Models
Updated Nov 25
ByteDance-Seed/UI-TARS-72B-DPO • Image-Text-to-Text • 73B • Updated Jan 25 • 2.11k • 148
ByteDance-Seed/UI-TARS-7B-DPO • Image-Text-to-Text • 8B • Updated Jan 25 • 1.44k • 222
microsoft/OmniParser • Image-Text-to-Text • Updated Dec 2, 2024 • 410 • 1.7k
jadechoghari/Ferret-UI-Llama8b • Image-Text-to-Text • 8B • Updated Jan 8 • 269 • 68
microsoft/GUI-Actor-7B-Qwen2.5-VL • Image-Text-to-Text • 8B • Updated Aug 9 • 541 • 24
showlab/ShowUI-2B • Updated Mar 11 • 2.61k • 269
Zery/CUA_World_State_Model • Image-Text-to-Text • Updated Aug 7 • 11 • 4
microsoft/Fara-7B • Image-Text-to-Text • 8B • Updated 16 days ago • 273k • 451
Qwen/Qwen2.5-Omni-7B • Any-to-Any • 11B • Updated Apr 30 • 164k • 1.83k
Hcompany/Holo2-30B-A3B • Image-Text-to-Text • 31B • Updated Nov 21 • 466 • 36
Hcompany/Holo2-4B • Image-Text-to-Text • 4B • Updated Nov 14 • 1.58k • 16
Hcompany/Holo2-8B • Image-Text-to-Text • 9B • Updated Nov 14 • 4.61k • 16
AskUI/PTA-1 • Image-Text-to-Text • 0.3B • Updated Nov 28, 2024 • 703 • 98
OS-Copilot/OS-Atlas-Base-7B • Image-Text-to-Text • 8B • Updated Nov 19, 2024 • 465 • 42
Qwen/Qwen2-VL-7B-Instruct • Image-Text-to-Text • 8B • Updated Feb 6 • 1.23M • 1.25k
xlangai/OpenCUA-72B • Image-Text-to-Text • 73B • Updated Nov 11 • 668 • 5
xlangai/OpenCUA-32B • Image-Text-to-Text • 33B • Updated Aug 18 • 637 • 25
xlangai/OpenCUA-7B • Image-Text-to-Text • 8B • Updated Nov 13 • 53.6k • 21
xlangai/Jedi-7B-1080p • Image-Text-to-Text • 8B • Updated Jun 18 • 66 • 29
xlangai/Jedi-3B-1080p • Image-Text-to-Text • 4B • Updated Jun 18 • 106 • 17
Qwen/Qwen3-VL-8B-Instruct • Image-Text-to-Text • 9B • Updated Oct 15 • 2.73M • 604
Qwen/Qwen3-VL-8B-Thinking • Image-Text-to-Text • 9B • Updated Nov 26 • 158k • 163