Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity
Zhang Xingjian
Zhang199
AI & ML interests: Large Multimodal Models
TinyLLaVA-Video-R1
Towards Smaller LMMs for Video Reasoning.
- Zhang199/TinyLLaVA-Video-R1
  Video-Text-to-Text • 4B • Updated • 68 • 4
- Zhang199/TinyLLaVA-Video-Coldstart_NextQA_16
  Video-Text-to-Text • 4B • Updated • 3 • 1
- Zhang199/TinyLLaVA-Video-R1-training-data
  Updated • 41 • 1
- TinyLLaVA-Video-R1: Towards Smaller LMMs for Video Reasoning
  Paper • 2504.09641 • Published • 16
EDGE-GRPO
Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity
models
11
Zhang199/TinyLLaVA-Qwen2-0.5B-SigLIP
Image-Text-to-Text • 1B • Updated • 488 • 5
Zhang199/EDGE-GRPO-Qwen-1.5B
Text Generation • 2B • Updated • 5
Zhang199/EDGE-GRPO-Qwen-7B
Text Generation • 8B • Updated • 3
Zhang199/TinyLLaVA-Video-Qwen2.5-3B-Group-16-512
Video-Text-to-Text • 4B • Updated • 353 • 1
Zhang199/TinyLLaVA-Video-Qwen2.5-3B-Naive-16-512
Video-Text-to-Text • 4B • Updated • 41
Zhang199/TinyLLaVA-Video-Phi2-Naive-16-512
Video-Text-to-Text • 3B • Updated • 7
Zhang199/TinyLLaVA-Qwen2.5-3B-SigLIP
Image-Text-to-Text • 4B • Updated • 3
Zhang199/TinyLLaVA-Video-R1
Video-Text-to-Text • 4B • Updated • 68 • 4
Zhang199/TinyLLaVA-Video-Coldstart_NextQA_16
Video-Text-to-Text • 4B • Updated • 3 • 1
Zhang199/TinyLLaVA-Video-Qwen2.5-3B-Group-1fps-512
Video-Text-to-Text • 4B • Updated • 62