RL - a JM-Brun Collection

updated 25 days ago

A Survey of Reinforcement Learning for Large Reasoning Models

Paper • 2509.08827 • Published Sep 10 • 183

Language Models Can Learn from Verbal Feedback Without Scalar Rewards

Paper • 2509.22638 • Published 28 days ago • 67