
Emanuele Vivoli

emanuelevivoli

https://www.emanuelevivoli.me

AI & ML interests

I work on Comics/Manga :)

Recent Activity

upvoted an article 5 days ago

Vision Language Models (Better, Faster, Stronger)

authored a paper 8 days ago

CoSMo: A Multimodal Transformer for Page Stream Segmentation in Comic Books

authored a paper 8 days ago

Multimodal Transformer for Comics Text-Cloze


Organizations

upvoted an article 5 days ago

Vision Language Models (Better, Faster, Stronger)

Article • May 12 • 551

upvoted an article 8 days ago

Preference Optimization for Vision Language Models

Article • Jul 10, 2024 • 86

upvoted a collection 8 days ago

Qwen3-VL

Collection • 25 items • Updated 2 days ago • 313

upvoted a paper about 2 months ago

Intern-S1: A Scientific Multimodal Foundation Model

Paper • 2508.15763 • Published Aug 21 • 254

upvoted an article 3 months ago

Welcome GPT OSS, the new open-source model family from OpenAI!

Article • Aug 5 • 501

upvoted 3 papers 3 months ago

Agentic Reinforced Policy Optimization

Paper • 2507.19849 • Published Jul 26 • 155

Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24 • 306

Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations

Paper • 2506.18898 • Published Jun 23 • 33

upvoted 2 collections 3 months ago

Tar

Collection • [NeurIPS 2025] Unifying Visual Understanding and Generation via Text-Aligned Representations • 5 items • Updated Sep 20 • 16

Open LLM Leaderboard best models ❤️‍🔥

Collection • A daily uploaded list of models with best evaluations on the LLM leaderboard • 65 items • Updated Mar 20 • 645

upvoted an article 4 months ago

Efficient MultiModal Data Pipeline

Article • Jul 8 • 57

upvoted a paper 4 months ago

MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention

Paper • 2506.13585 • Published Jun 16 • 267

upvoted 4 papers 5 months ago

Don't Look Only Once: Towards Multimodal Interactive Reasoning with Selective Visual Revisitation

Paper • 2505.18842 • Published May 24 • 36

Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration

Paper • 2505.20256 • Published May 26 • 18

QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

Paper • 2505.17667 • Published May 23 • 88

BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset

Paper • 2505.09568 • Published May 14 • 97

upvoted 4 papers 6 months ago

Phi-4-reasoning Technical Report

Paper • 2504.21318 • Published Apr 30 • 52

Phi-4-Mini-Reasoning: Exploring the Limits of Small Reasoning Language Models in Math

Paper • 2504.21233 • Published Apr 30 • 48

AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction

Paper • 2504.01014 • Published Apr 1 • 70

GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation

Paper • 2504.08736 • Published Apr 11 • 46
