Weiyun Wang

Weiyun1025

Recent Activity

upvoted a paper 17 days ago

ViCO: A Training Strategy towards Semantic Aware Dynamic High-Resolution

Paper • 2510.12793 • Published 19 days ago • 2

upvoted 2 papers 20 days ago

InternSVG: Towards Unified SVG Tasks with Multimodal Large Language Models

Paper • 2510.11341 • Published 20 days ago • 33

Vlaser: Vision-Language-Action Model with Synergistic Embodied Reasoning

Paper • 2510.11027 • Published 21 days ago • 20

upvoted a paper 24 days ago

NaViL: Rethinking Scaling Properties of Native Multimodal Large Language Models under Data Constraints

Paper • 2510.08565 • Published 24 days ago • 19

upvoted a paper about 1 month ago

Sequential Diffusion Language Models

Paper • 2509.24007 • Published Sep 28 • 42

upvoted a collection about 1 month ago

InternVL3.5-Flash

InternVL3.5-Flash is a fast variant of InternVL3.5 using semantic aware dynamic high-resolution strategy. • 9 items • Updated 19 days ago • 6

upvoted 3 papers about 1 month ago

Reinforcement Learning on Pre-Training Data

Paper • 2509.19249 • Published Sep 23 • 67

MiniCPM-V 4.5: Cooking Efficient MLLMs via Architecture, Data, and Training Recipe

Paper • 2509.18154 • Published Sep 16 • 49

ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data

Paper • 2509.15221 • Published Sep 18 • 109

upvoted a paper about 2 months ago

GenExam: A Multidisciplinary Text-to-Image Exam

Paper • 2509.14232 • Published Sep 17 • 21

upvoted a paper 2 months ago

Kwai Keye-VL 1.5 Technical Report

Paper • 2509.01563 • Published Sep 1 • 36

upvoted a collection 2 months ago

InternVL3.5-Core

This collection includes only the InternVL3.5 checkpoints that have completed the full training pipeline (i.e., Pretraining, SFT, MPO, Cascade RL). • 30 items • Updated Sep 28 • 12

upvoted a paper 2 months ago

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

Paper • 2508.18265 • Published Aug 25 • 202

upvoted 2 collections 2 months ago

InternVL3.5

This collection includes all released checkpoints of InternVL3.5, covering different training stages (e.g., Pretraining, SFT, MPO, Cascade RL). • 54 items • Updated Sep 28 • 101

InternVL3.5

33 items • Updated Aug 29 • 5

upvoted a paper 2 months ago

Intern-S1: A Scientific Multimodal Foundation Model

Paper • 2508.15763 • Published Aug 21 • 255

upvoted 2 papers 3 months ago

Qwen-Image Technical Report

Paper • 2508.02324 • Published Aug 4 • 259

Mono-InternVL-1.5: Towards Cheaper and Faster Monolithic Multimodal Large Language Models

Paper • 2507.12566 • Published Jul 16 • 14

upvoted 2 papers 4 months ago

AnyCap Project: A Unified Framework, Dataset, and Benchmark for Controllable Omni-modal Captioning

Paper • 2507.12841 • Published Jul 17 • 41

Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers

Paper • 2506.23918 • Published Jun 30 • 88