TUNA: Taming Unified Visual Representations for Native Unified Multimodal Models Paper • 2512.02014 • Published 7 days ago • 57
V-ReasonBench: Toward Unified Reasoning Benchmark Suite for Video Generation Models Paper • 2511.16668 • Published 18 days ago • 53
Depth Anything 3: Recovering the Visual Space from Any Views Paper • 2511.10647 • Published 25 days ago • 93
Every Attention Matters: An Efficient Hybrid Architecture for Long-Context Reasoning Paper • 2510.19338 • Published Oct 22 • 114
Efficient Long-context Language Model Training by Core Attention Disaggregation Paper • 2510.18121 • Published Oct 20 • 121
INT v.s. FP: A Comprehensive Study of Fine-Grained Low-bit Quantization Formats Paper • 2510.25602 • Published Oct 29 • 77
Parallel Loop Transformer for Efficient Test-Time Computation Scaling Paper • 2510.24824 • Published Oct 28 • 16
OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM Paper • 2510.15870 • Published Oct 17 • 89
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs Paper • 2510.11696 • Published Oct 13 • 176
Seedream 4.0: Toward Next-generation Multimodal Image Generation Paper • 2509.20427 • Published Sep 24 • 80
Efficient LLM Pretraining: Packed Sequences and Masked Attention Article • Published Oct 7, 2024 • 61
LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning Paper • 2505.16933 • Published May 22 • 34