Shengyi Qian

shengyi-qian
https://jasonqsy.github.io/
  • JasonQSY
  • JasonQSY
  • jasonqsy

AI & ML interests

AI Agents, Vision Language Model

Organizations

  • University of Michigan
  • Situated Language and Embodied Dialogue Lab
  • MM-graph
  • Meta Llama
  • AI at Meta
  • Project of MoE reward model

authored a paper 2 months ago

DigiData: Training and Evaluating General-Purpose Mobile Control Agents

Paper • 2511.07413 • Published Nov 10, 2025 • 5
authored 2 papers over 1 year ago

Multi-Object Hallucination in Vision-Language Models

Paper • 2407.06192 • Published Jul 8, 2024 • 12

3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination

Paper • 2406.05132 • Published Jun 7, 2024 • 30
authored 4 papers over 2 years ago

LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language Model as an Agent

Paper • 2309.12311 • Published Sep 21, 2023 • 17

Understanding 3D Object Articulation in Internet Videos

Paper • 2203.16531 • Published Mar 30, 2022

Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation

Paper • 2303.11329 • Published Mar 20, 2023 • 1

Understanding 3D Object Interaction from a Single Image

Paper • 2305.09664 • Published May 16, 2023 • 2