Vlaser: Vision-Language-Action Model with Synergistic Embodied Reasoning
AI & ML interests
Computer Vision
Papers

ViCO: A Training Strategy towards Semantic Aware Dynamic High-Resolution

ExpVid: A Benchmark for Experiment Video Understanding & Reasoning
Organization Card
OpenGVLab
Welcome to OpenGVLab! We are a research group from Shanghai AI Lab focused on vision-centric AI research. The "GV" in our name stands for general vision: a general understanding of vision that requires little effort to adapt to new vision-based tasks.
Models
- InternVL: a pioneering open-source alternative to GPT-4V.
- InternImage: a large-scale vision foundation model with deformable convolutions.
- InternVideo: large-scale video foundation models for multimodal understanding.
- VideoChat: an end-to-end chat assistant for video comprehension.
- All-Seeing-Project: towards panoptic visual recognition and understanding of the open world.
Datasets
- ShareGPT4o: a groundbreaking large-scale resource that we plan to open-source, comprising 200K meticulously annotated images, 10K videos with highly descriptive captions, and 10K audio files with detailed descriptions.
- InternVid: a large-scale video-text dataset for multimodal understanding and generation.
- MMPR: a high-quality, large-scale multimodal preference dataset.
Benchmarks
- MVBench: a comprehensive benchmark for multimodal video understanding.
- CRPE: a benchmark covering all elements of the relation triplets (subject, predicate, object), providing a systematic platform for the evaluation of relation comprehension ability.
- MM-NIAH: a comprehensive benchmark for long multimodal document comprehension.
- GMAI-MMBench: a comprehensive multimodal evaluation benchmark towards general medical AI.
Spaces (13)
- ScaleCUA Demo (Running): display web content in a Streamlit app.
- InternVideo2.5 (Runtime error): hierarchical compression for long-context video modeling.
- InternVL (Running): interact with a multimodal chatbot that analyzes images and text.
- MVBench Leaderboard (Running): submit and view model evaluations.
- InternVideo2 Chat 8B HD (Runtime error): upload a video to chat about its contents.
Models (286)
- OpenGVLab/Vlaser-2B-VLA
- OpenGVLab/Vlaser-8B (8B)
- OpenGVLab/Vlaser-2B (2B)
- OpenGVLab/VeBrain (8B)
- OpenGVLab/NaViL-9B (16B)
- OpenGVLab/NaViL-2B (4B)
- OpenGVLab/SDLM-32B-D4 (Text Generation, 33B)
- OpenGVLab/SDLM-3B-D8 (Text Generation, 3B)
- OpenGVLab/SDLM-3B-D4 (Text Generation, 3B)
- OpenGVLab/VideoChat-R1_5-7B (Video-Text-to-Text, 8B)
Datasets (49)
- OpenGVLab/ExpVid
- OpenGVLab/GenExam
- OpenGVLab/ScaleCUA-Data
- OpenGVLab/VRBench
- OpenGVLab/MMPR-v1.2
- OpenGVLab/MMPR-Tiny
- OpenGVLab/MMPR-v1.2-prompts
- OpenGVLab/MMBench-GUI
- OpenGVLab/GUI-Odyssey
- OpenGVLab/LORIS