Vlaser: Vision-Language-Action Model with Synergistic Embodied Reasoning • Paper • arXiv:2510.11027 • Published Oct 2025 • 20 upvotes
InternSVG: Towards Unified SVG Tasks with Multimodal Large Language Models • Paper • arXiv:2510.11341 • Published Oct 2025 • 33 upvotes
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency • Paper • arXiv:2508.18265 • Published Aug 25, 2025 • 202 upvotes
Visual Embodied Brain: Let Multimodal Large Language Models See, Think, and Control in Spaces • Paper • arXiv:2506.00123 • Published May 30, 2025 • 35 upvotes
ZeroGUI: Automating Online GUI Learning at Zero Human Cost • Paper • arXiv:2505.23762 • Published May 29, 2025 • 45 upvotes
VisuLogic: A Benchmark for Evaluating Visual Reasoning in Multi-modal Large Language Models • Paper • arXiv:2504.15279 • Published Apr 21, 2025 • 77 upvotes
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models • Paper • arXiv:2504.10479 • Published Apr 14, 2025 • 300 upvotes
Dita: Scaling Diffusion Transformer for Generalist Vision-Language-Action Policy • Paper • arXiv:2503.19757 • Published Mar 25, 2025 • 51 upvotes