
Lewei Lu

luotto

ottolu

AI & ML interests

None yet

Recent Activity

upvoted a paper 1 day ago

Vlaser: Vision-Language-Action Model with Synergistic Embodied Reasoning

Paper • 2510.11027 • Published 13 days ago • 19

upvoted 2 papers 7 days ago

VR-Thinker: Boosting Video Reward Models through Thinking-with-Image Reasoning

Paper • 2510.10518 • Published 14 days ago • 17

Agent Learning via Early Experience

Paper • 2510.08558 • Published 16 days ago • 240

upvoted a paper 8 days ago

Diffusion Transformers with Representation Autoencoders

Paper • 2510.11690 • Published 12 days ago • 157

upvoted a paper 9 days ago

From Pixels to Words -- Towards Native Vision-Language Primitives at Scale

Paper • 2510.14979 • Published 9 days ago • 64

upvoted 2 papers 10 days ago

CVD-STORM: Cross-View Video Diffusion with Spatial-Temporal Reconstruction Model for Autonomous Driving

Paper • 2510.07944 • Published 17 days ago • 24

InteractiveOmni: A Unified Omni-modal Model for Audio-Visual Multi-turn Dialogue

Paper • 2510.13747 • Published 10 days ago • 28

upvoted a paper 16 days ago

NaViL: Rethinking Scaling Properties of Native Multimodal Large Language Models under Data Constraints

Paper • 2510.08565 • Published 16 days ago • 19

upvoted a paper 17 days ago

Paper2Video: Automatic Video Generation from Scientific Papers

Paper • 2510.05096 • Published 19 days ago • 106

upvoted 2 papers 28 days ago

BaseReward: A Strong Baseline for Multimodal Reward Model

Paper • 2509.16127 • Published Sep 19 • 21

LIMI: Less is More for Agency

Paper • 2509.17567 • Published Sep 22 • 100

upvoted an article 30 days ago

Gaia2 and ARE: Empowering the community to study agents

Article • Sep 22 • 115

upvoted 3 papers about 1 month ago

upvoted 3 papers about 2 months ago

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

Paper • 2508.18265 • Published Aug 25 • 201

Why Language Models Hallucinate

Paper • 2509.04664 • Published Sep 4 • 189

ELV-Halluc: Benchmarking Semantic Aggregation Hallucinations in Long Video Understanding

Paper • 2508.21496 • Published Aug 29 • 54

upvoted 2 papers 2 months ago

Intern-S1: A Scientific Multimodal Foundation Model

Paper • 2508.15763 • Published Aug 21 • 254

Has GPT-5 Achieved Spatial Intelligence? An Empirical Study

Paper • 2508.13142 • Published Aug 18 • 34
