Collections
Collections including paper arxiv:2510.12403
- EmbodiedOneVision: Interleaved Vision-Text-Action Pretraining for General Robot Control
  Paper • 2508.21112 • Published • 75
- VLA-Adapter: An Effective Paradigm for Tiny-Scale Vision-Language-Action Model
  Paper • 2509.09372 • Published • 231
- Robot Learning: A Tutorial
  Paper • 2510.12403 • Published • 85

- World Modeling Makes a Better Planner: Dual Preference Optimization for Embodied Task Planning
  Paper • 2503.10480 • Published • 55
- A Survey on Vision-Language-Action Models: An Action Tokenization Perspective
  Paper • 2507.01925 • Published • 38
- Robot Learning: A Tutorial
  Paper • 2510.12403 • Published • 85

- SINQ: Sinkhorn-Normalized Quantization for Calibration-Free Low-Precision LLM Weights
  Paper • 2509.22944 • Published • 76
- Robot Learning: A Tutorial
  Paper • 2510.12403 • Published • 85
- UniMoE-Audio: Unified Speech and Music Generation with Dynamic-Capacity MoE
  Paper • 2510.13344 • Published • 60
- Lumina-DiMOO: An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding
  Paper • 2510.06308 • Published • 51

- EgoTwin: Dreaming Body and View in First Person
  Paper • 2508.13013 • Published • 20
- Robot Learning: A Tutorial
  Paper • 2510.12403 • Published • 85
- GigaBrain-0: A World Model-Powered Vision-Language-Action Model
  Paper • 2510.19430 • Published • 36
- VLA-0: Building State-of-the-Art VLAs with Zero Modification
  Paper • 2510.13054 • Published • 8

- RuCCoD: Towards Automated ICD Coding in Russian
  Paper • 2502.21263 • Published • 132
- Unified Reward Model for Multimodal Understanding and Generation
  Paper • 2503.05236 • Published • 123
- Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching
  Paper • 2503.05179 • Published • 46
- R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
  Paper • 2503.05592 • Published • 27