Collections

Discover the best community collections!

Collections including paper arxiv:2512.22615

about 21 hours ago

lusxvr/nanoVLM-222M

Image-Text-to-Text • 0.2B • Updated May 8, 2025 • 311 • 98
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Paper • 2503.09516 • Published Mar 12, 2025 • 36
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time

Paper • 2505.24863 • Published May 30, 2025 • 97
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

Paper • 2505.17667 • Published May 23, 2025 • 88

Dream-VL & Dream-VLA: Open Vision-Language and Vision-Language-Action Models with Diffusion Language Model Backbone

Paper • 2512.22615 • Published 7 days ago • 39
Learning to Reason in 4D: Dynamic Spatial Understanding for Vision Language Models

Paper • 2512.20557 • Published 11 days ago • 48
TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times

Paper • 2512.16093 • Published 17 days ago • 90

Diffusion model based

Dream-VL & Dream-VLA: Open Vision-Language and Vision-Language-Action Models with Diffusion Language Model Backbone

Paper • 2512.22615 • Published 7 days ago • 39

segmentation plus report

ReXGroundingCT: A 3D Chest CT Dataset for Segmentation of Findings from Free-Text Reports

Paper • 2507.22030 • Published Jul 29, 2025
Unlocking the Potential of MLLMs in Referring Expression Segmentation via a Light-weight Mask Decode

Paper • 2508.04107 • Published Aug 6, 2025 • 4
Phrase-grounded Fact-checking for Automatically Generated Chest X-ray Reports

Paper • 2509.21356 • Published Sep 20, 2025
Learning Segmentation from Radiology Reports

Paper • 2507.05582 • Published Jul 8, 2025 • 1

Multimodal Agent

Gemini Robotics: Bringing AI into the Physical World

Paper • 2503.20020 • Published Mar 25, 2025 • 29
Magma: A Foundation Model for Multimodal AI Agents

Paper • 2502.13130 • Published Feb 18, 2025 • 58
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents

Paper • 2311.05437 • Published Nov 9, 2023 • 51
OS-ATLAS: A Foundation Action Model for Generalist GUI Agents

Paper • 2410.23218 • Published Oct 30, 2024 • 49

Dream-VL & Dream-VLA: Open Vision-Language and Vision-Language-Action Models with Diffusion Language Model Backbone

Paper • 2512.22615 • Published 7 days ago • 39
PhysBrain: Human Egocentric Data as a Bridge from Vision Language Models to Physical Intelligence

Paper • 2512.16793 • Published 16 days ago • 72

Diffusion models

Dream-VL & Dream-VLA: Open Vision-Language and Vision-Language-Action Models with Diffusion Language Model Backbone

Paper • 2512.22615 • Published 7 days ago • 39
LLaDA2.0: Scaling Up Diffusion Language Models to 100B

Paper • 2512.15745 • Published 24 days ago • 78

Dream-VL & Dream-VLA: Open Vision-Language and Vision-Language-Action Models with Diffusion Language Model Backbone

Paper • 2512.22615 • Published 7 days ago • 39

about 2 hours ago

openai/gpt-oss-120b

Text Generation • 120B • Updated Aug 26, 2025 • 3.57M • 4.31k
Emergent temporal abstractions in autoregressive models enable hierarchical reinforcement learning

Paper • 2512.20605 • Published 11 days ago • 59
Nested Browser-Use Learning for Agentic Information Seeking

Paper • 2512.23647 • Published 5 days ago • 17
TimeBill: Time-Budgeted Inference for Large Language Models

Paper • 2512.21859 • Published 8 days ago • 19

Tracking Any Object Amodally

Paper • 2312.12433 • Published Dec 19, 2023 • 12
Flow-GRPO: Training Flow Matching Models via Online RL

Paper • 2505.05470 • Published May 8, 2025 • 86
CFG-Zero*: Improved Classifier-Free Guidance for Flow Matching Models

Paper • 2503.18886 • Published Mar 24, 2025 • 24
Diffusion Models without Classifier-free Guidance

Paper • 2502.12154 • Published Feb 17, 2025 • 8
