Zili Wang

MarkWang

MarkXCloud

AI & ML interests

Multi-modality learning and inference acceleration

Recent Activity

upvoted a paper 8 days ago

Taming Modality Entanglement in Continual Audio-Visual Segmentation

Paper • 2510.17234 • Published 15 days ago • 3

upvoted a paper 14 days ago

Knowledge-based Visual Question Answer with Multimodal Processing, Retrieval and Filtering

Paper • 2510.14605 • Published 19 days ago • 3

upvoted a paper 2 months ago

R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning

Paper • 2508.21113 • Published Aug 28 • 109

upvoted a paper 3 months ago

NextStep-1: Toward Autoregressive Image Generation with Continuous Tokens at Scale

Paper • 2508.10711 • Published Aug 14 • 142

liked a model 3 months ago

YannQi/R-4B

Image-Text-to-Text • 5B • Updated Sep 4 • 50.4k • 169

upvoted a paper 5 months ago

Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding

Paper • 2505.22618 • Published May 28 • 43

authored a paper 5 months ago

Faster and Better LLMs via Latency-Aware Test-Time Scaling

Paper • 2505.19634 • Published May 26

upvoted an article 5 months ago

Vision Language Models (Better, Faster, Stronger)

Article • May 12 • 558

authored a paper 12 months ago

Continuous Speculative Decoding for Autoregressive Image Generation

Paper • 2411.11925 • Published Nov 18, 2024 • 16

upvoted a paper 12 months ago

Continuous Speculative Decoding for Autoregressive Image Generation

Paper • 2411.11925 • Published Nov 18, 2024 • 16

commented a paper 12 months ago

Continuous Speculative Decoding for Autoregressive Image Generation

Paper • 2411.11925 • Published Nov 18, 2024 • 16

authored 3 papers about 1 year ago

Draw an Audio: Leveraging Multi-Instruction for Video-to-Audio Synthesis

Paper • 2409.06135 • Published Sep 10, 2024 • 16

Layerwise Recurrent Router for Mixture-of-Experts

Paper • 2408.06793 • Published Aug 13, 2024 • 32

AVESFormer: Efficient Transformer Design for Real-Time Audio-Visual Segmentation

Paper • 2408.01708 • Published Aug 3, 2024 • 4

commented a paper about 1 year ago

AVESFormer: Efficient Transformer Design for Real-Time Audio-Visual Segmentation

Paper • 2408.01708 • Published Aug 3, 2024 • 4

authored 3 papers over 1 year ago

liked 2 Spaces about 2 years ago

Detection Metrics

📈

Compute object detection metrics using COCO style

170

Open Object Detection Leaderboard

🏆

Request evaluation for a new model
