zhangtao

zhangtao-whu

https://github.com/zhang-tao-whu

zhang-tao-whu

AI & ML interests

segmentation

Recent Activity

upvoted a paper 11 days ago

Open-o3 Video: Grounded Video Reasoning with Explicit Spatio-Temporal Evidence

published a dataset 12 days ago

zhangtao-whu/coconut

upvoted a paper 13 days ago

Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs

liked a dataset about 2 months ago

LucasFang/FLUX-Reason-6M

Viewer • Updated Sep 12 • 5.89M • 10.3k • 81

liked a dataset 3 months ago

cyberalchemist/PixelWeb

Updated May 21 • 56 • 5

liked a model 10 months ago

microsoft/phi-4

Text Generation • 15B • Updated Feb 24 • 555k • 2.18k

liked a Space 10 months ago

Sa2VA Simple Demo

Dense Grounded Understanding of Images and Videos