Giuliano's Collections

Agents GUI

updated Feb 16

  • ShowUI: One Vision-Language-Action Model for GUI Visual Agent

    Paper • 2411.17465 • Published Nov 26, 2024 • 88

  • OmniParser for Pure Vision Based GUI Agent

    Paper • 2408.00203 • Published Aug 1, 2024 • 25

  • Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction

    Paper • 2412.04454 • Published Dec 5, 2024 • 70

  • zai-org/cogagent-9b-20241220

    Image-Text-to-Text • 14B • Updated Dec 25, 2024 • 381 • 53

  • CogAgent: A Visual Language Model for GUI Agents

    Paper • 2312.08914 • Published Dec 14, 2023 • 31

  • CogAgent Demo

    Space • Runtime error • 7 • CogAgent-GUI-Demo

  • A3: Android Agent Arena for Mobile GUI Agents

    Paper • 2501.01149 • Published Jan 2 • 22

  • xlangai/Aguvis-7B-720P

    8B • Updated Jan 7 • 30 • 9

  • OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis

    Paper • 2412.19723 • Published Dec 27, 2024 • 87

  • UI-TARS

    Space • Running • 33 • Find click coordinates on images based on instructions

  • microsoft/OmniParser-v2.0

    Updated Mar 28 • 34.9k • 1.3k