Gibran Iqbal PRO

Jibbscript

AI & ML interests

None yet

Recent Activity

liked a model about 6 hours ago

ibm-granite/granite-timeseries-ttm-r2

upvoted an article about 6 hours ago

Building the Open Agent Ecosystem Together: Introducing OpenEnv

upvoted a paper 1 day ago

Unified Reinforcement and Imitation Learning for Vision-Language Models

upvoted an article about 6 hours ago

Article

Building the Open Agent Ecosystem Together: Introducing OpenEnv

2 days ago

• 71

upvoted 5 papers 1 day ago

Unified Reinforcement and Imitation Learning for Vision-Language Models

Paper • 2510.19307 • Published 3 days ago • 21

A^2Search: Ambiguity-Aware Question Answering with Reinforcement Learning

Paper • 2510.07958 • Published 16 days ago • 4

LoongRL: Reinforcement Learning for Advanced Reasoning over Long Contexts

Paper • 2510.19363 • Published 3 days ago • 54

BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping

Paper • 2510.18927 • Published 4 days ago • 77

Every Attention Matters: An Efficient Hybrid Architecture for Long-Context Reasoning

Paper • 2510.19338 • Published 3 days ago • 90

upvoted a paper 2 days ago

LightMem: Lightweight and Efficient Memory-Augmented Generation

Paper • 2510.18866 • Published 4 days ago • 99

upvoted a paper 4 days ago

StreamingVLM: Real-Time Understanding for Infinite Video Streams

Paper • 2510.09608 • Published 15 days ago • 47

upvoted 2 papers 6 days ago

Demystifying Reinforcement Learning in Agentic Reasoning

Paper • 2510.11701 • Published 12 days ago • 30

Agentic Entropy-Balanced Policy Optimization

Paper • 2510.14545 • Published 9 days ago • 95

upvoted 5 papers 8 days ago

Bee: A High-Quality Corpus and Full-Stack Suite to Unlock Advanced Fully Open MLLMs

Paper • 2510.13795 • Published 10 days ago • 49

rStar2-Agent: Agentic Reasoning Technical Report

Paper • 2508.20722 • Published Aug 28 • 113

Stronger Together: On-Policy Reinforcement Learning for Collaborative LLMs

Paper • 2510.11062 • Published 12 days ago • 25

Generative Universal Verifier as Multimodal Meta-Reasoner

Paper • 2510.13804 • Published 10 days ago • 24

Attention Illuminates LLM Reasoning: The Preplan-and-Anchor Rhythm Enables Fine-Grained Policy Optimization

Paper • 2510.13554 • Published 10 days ago • 54

upvoted 2 papers 9 days ago

Dr.LLM: Dynamic Layer Routing in LLMs

Paper • 2510.12773 • Published 11 days ago • 30

A Survey of Vibe Coding with Large Language Models

Paper • 2510.12399 • Published 11 days ago • 45

upvoted 3 papers 11 days ago

Contextual Document Embeddings

Paper • 2410.02525 • Published Oct 3, 2024 • 24

R-Horizon: How Far Can Your Large Reasoning Model Really Go in Breadth and Depth?

Paper • 2510.08189 • Published 16 days ago • 25

AutoPR: Let's Automate Your Academic Promotion!

Paper • 2510.09558 • Published 15 days ago • 49