This is the AWQ model of UI-TARS-1.5-7B, built using AutoAWQ on A100(80G), works with vllm and lmdeploy.

Model Description

UI-TARS-1.5-7B is an open-source multimodal agent model released by ByteDance. It achieves state-of-the-art results across a variety of standard benchmarks, demonstrating strong reasoning capabilities and notable improvements over prior models.

Code: https://github.com/bytedance/UI-TARS

Application: https://github.com/bytedance/UI-TARS-desktop

Grounding Capability Evaluation

Benchmark	UI-TARS-1.5	OpenAI CUA	Claude 3.7	Previous SOTA
ScreensSpot-V2	94.2	87.9	87.6	91.6
ScreenSpotPro	61.6	23.4	27.7	43.6

Model Scale Comparison

This table compares performance across different model scales of UI-TARS on the OSworld benchmark.

Benchmark Type	Benchmark	UI-TARS-72B-DPO	UI-TARS-1.5-7B	UI-TARS-1.5
Computer Use	OSWorld	24.6	27.5	42.5
GUI Grounding	ScreenSpotPro	38.1	49.6	61.6

The released UI-TARS-1.5-7B focuses primarily on enhancing general computer use capabilities and is not specifically optimized for game-based scenarios, where the UI-TARS-1.5 still holds a significant advantage.

Downloads last month: 178

Safetensors

Model size

3B params

Tensor type

I32

BF16

F16

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for flin775/UI-TARS-1.5-7B-AWQ

Base model

ByteDance-Seed/UI-TARS-1.5-7B

Quantized

(12)

this model