Join the conversation

Join the community of Machine Learners and AI enthusiasts.

Sign Up
sergiopaniego 
posted an update 16 days ago
Post
2371
Online training methods (e.g., GRPO) require real-time generation, a compute- and memory-heavy bottleneck.

TRL has built-in vLLM support and in this new recipe, we show how to leverage it for efficient online training. Run on Colab ⚡, scale to multi-GPU/multi-node!

🧑‍🍳 recipe: https://huggingface.co/learn/cookbook/grpo_vllm_online_training

Thanks for sharing!