- EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters
  Paper • 2402.04252 • Published • 28
- Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models
  Paper • 2402.03749 • Published • 14
- ScreenAI: A Vision-Language Model for UI and Infographics Understanding
  Paper • 2402.04615 • Published • 44
- EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss
  Paper • 2402.05008 • Published • 23
Collections
Discover the best community collections!
Collections including paper arxiv:2505.19147
- Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems
  Paper • 2504.01990 • Published • 300
- InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
  Paper • 2504.10479 • Published • 300
- What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models
  Paper • 2503.24235 • Published • 54
- Seedream 3.0 Technical Report
  Paper • 2504.11346 • Published • 70
- CoRAG: Collaborative Retrieval-Augmented Generation
  Paper • 2504.01883 • Published • 9
- VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning
  Paper • 2504.08837 • Published • 43
- Mavors: Multi-granularity Video Representation for Multimodal Large Language Model
  Paper • 2504.10068 • Published • 30
- xVerify: Efficient Answer Verifier for Reasoning Model Evaluations
  Paper • 2504.10481 • Published • 84
- Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning
  Paper • 2505.24726 • Published • 274
- Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning
  Paper • 2506.01939 • Published • 185
- ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models
  Paper • 2505.24864 • Published • 138
- AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
  Paper • 2505.24863 • Published • 97
- TabSTAR: A Foundation Tabular Model With Semantically Target-Aware Representations
  Paper • 2505.18125 • Published • 112
- On-Policy RL with Optimal Reward Baseline
  Paper • 2505.23585 • Published • 14
- Satori-SWE: Evolutionary Test-Time Scaling for Sample-Efficient Software Engineering
  Paper • 2505.23604 • Published • 23
- Are Reasoning Models More Prone to Hallucination?
  Paper • 2505.23646 • Published • 24
- One-Minute Video Generation with Test-Time Training
  Paper • 2504.05298 • Published • 110
- MoCha: Towards Movie-Grade Talking Character Synthesis
  Paper • 2503.23307 • Published • 138
- Towards Understanding Camera Motions in Any Video
  Paper • 2504.15376 • Published • 158
- Antidistillation Sampling
  Paper • 2504.13146 • Published • 59
- M3DocRAG: Multi-modal Retrieval is What You Need for Multi-page Multi-document Understanding
  Paper • 2411.04952 • Published • 30
- Diff-2-in-1: Bridging Generation and Dense Perception with Diffusion Models
  Paper • 2411.05005 • Published • 13
- M3SciQA: A Multi-Modal Multi-Document Scientific QA Benchmark for Evaluating Foundation Models
  Paper • 2411.04075 • Published • 17
- Self-Consistency Preference Optimization
  Paper • 2411.04109 • Published • 19