Shengyi Qian

shengyi-qian

https://jasonqsy.github.io/

AI & ML interests

AI Agents, Vision Language Model

authored a paper 2 months ago

DigiData: Training and Evaluating General-Purpose Mobile Control Agents

Paper • 2511.07413 • Published Nov 10, 2025 • 5

authored 2 papers over 1 year ago

Multi-Object Hallucination in Vision-Language Models

Paper • 2407.06192 • Published Jul 8, 2024 • 12

3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination

Paper • 2406.05132 • Published Jun 7, 2024 • 30

authored 4 papers over 2 years ago

LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language Model as an Agent

Paper • 2309.12311 • Published Sep 21, 2023 • 17

Understanding 3D Object Articulation in Internet Videos

Paper • 2203.16531 • Published Mar 30, 2022

Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation

Paper • 2303.11329 • Published Mar 20, 2023 • 1

Understanding 3D Object Interaction from a Single Image

Paper • 2305.09664 • Published May 16, 2023 • 2