
ChenyangSi

http://chenyangsi.top/

AI & ML interests

None yet

Recent Activity

authored a paper 10 days ago

CoS: Chain-of-Shot Prompting for Long Video Understanding

authored a paper 10 days ago

FreeMorph: Tuning-Free Generalized Image Morphing with Diffusion Model

authored a paper 10 days ago

LongVie: Multimodal-Guided Controllable Ultra-Long Video Generation


Organizations

authored 7 papers 10 days ago

SpineBench: A Clinically Salient, Level-Aware Benchmark Powered by the SpineMed-450k Corpus

Paper • 2510.03160 • Published Oct 3 • 4

RealDPO: Real or Not Real, that is the Preference

Paper • 2510.14955 • Published Oct 16 • 6

DiverseAR: Boosting Diversity in Bitwise Autoregressive Image Generation

Paper • 2512.02931 • Published 23 days ago

LongVie 2: Multimodal Controllable Ultra-Long Video World Model

Paper • 2512.13604 • Published 10 days ago • 70

authored a paper 21 days ago

PosterCopilot: Toward Layout Reasoning and Controllable Editing for Professional Graphic Design

Paper • 2512.04082 • Published 22 days ago • 13

authored 12 papers 7 months ago

VBench: Comprehensive Benchmark Suite for Video Generative Models

Paper • 2311.17982 • Published Nov 29, 2023 • 9

FSAR: Federated Skeleton-based Action Recognition with Adaptive Topology Structure and Knowledge Distillation

Paper • 2306.11046 • Published Jun 19, 2023

Towards Language-Driven Video Inpainting via Multimodal Large Language Models

Paper • 2401.10226 • Published Jan 18, 2024 • 1

MetaFormer Is Actually What You Need for Vision

Paper • 2111.11418 • Published Nov 22, 2021 • 1

Mugs: A Multi-Granular Self-Supervised Learning Framework

Paper • 2203.14415 • Published Mar 27, 2022

Scaling Supervised Local Learning with Augmented Auxiliary Networks

Paper • 2402.17318 • Published Feb 27, 2024

MetaFormer Baselines for Vision

Paper • 2210.13452 • Published Oct 24, 2022

VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models

Paper • 2411.13503 • Published Nov 20, 2024 • 34

Inception Transformer

Paper • 2205.12956 • Published May 25, 2022

Lumina-Video: Efficient and Flexible Video Generation with Multi-scale Next-DiT

Paper • 2502.06782 • Published Feb 10 • 14

Vchitect-2.0: Parallel Transformer for Scaling Up Video Diffusion Models

Paper • 2501.08453 • Published Jan 14 • 1

IVY-FAKE: A Unified Explainable Framework and Benchmark for Image and Video AIGC Detection

Paper • 2506.00979 • Published Jun 1 • 13
