Mono-InternVL

updated 28 days ago

A Pioneering Monolithic MLLM

  • Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training

    Paper • 2410.08202 • Published Oct 10, 2024 • 4

    Note: CVPR 2025


  • Mono-InternVL-1.5: Towards Cheaper and Faster Monolithic Multimodal Large Language Models

    Paper • 2507.12566 • Published Jul 16 • 14

  • OpenGVLab/Mono-InternVL-2B

    Image-Text-to-Text • 3B • Updated Jul 22 • 8.35k • 36

  • OpenGVLab/Mono-InternVL-2B-S1-1

    Image-Text-to-Text • 3B • Updated Jul 22 • 56

  • OpenGVLab/Mono-InternVL-2B-S1-2

    Image-Text-to-Text • 3B • Updated Jul 22 • 51 • 1

  • OpenGVLab/Mono-InternVL-2B-S1-3

    Image-Text-to-Text • 3B • Updated Jul 22 • 65 • 1

  • OpenGVLab/Mono-InternVL-2B-Synthetic-Data

    Viewer • Updated Jul 22 • 3.05k • 39 • 2