@sergiopaniego on Hugging Face: "Online training methods (e.g., GRPO) require real-time generation, a compute-…"

sergiopaniego

posted an update 16 days ago

Online training methods (e.g., GRPO) require real-time generation, a compute- and memory-heavy bottleneck.

TRL has built-in vLLM support, and in this new recipe we show how to leverage it for efficient online training. Run it on Colab ⚡, then scale to multi-GPU/multi-node!
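As a rough idea of what turning on vLLM generation in TRL can look like, here is a minimal sketch. It is not the recipe's code: the model ID, dataset ID, toy reward function, and hyperparameters are all assumptions chosen for illustration, and `vllm_mode="colocate"` assumes a recent TRL release that supports running vLLM in the same process as the trainer.

```python
def reward_brevity(completions, **kwargs):
    """Toy reward (illustrative assumption): favor completions near 50 characters."""
    return [-abs(50 - len(c)) / 50.0 for c in completions]


def build_trainer(model_id="Qwen/Qwen2.5-0.5B-Instruct", dataset_id="trl-lib/tldr"):
    """Build a GRPO trainer that generates with vLLM (IDs are placeholder assumptions)."""
    # Imported lazily so the reward function can be used without TRL installed.
    from datasets import load_dataset
    from trl import GRPOConfig, GRPOTrainer

    args = GRPOConfig(
        output_dir="grpo-vllm-sketch",
        use_vllm=True,            # generate rollouts with vLLM instead of model.generate
        vllm_mode="colocate",     # assumption: TRL version with colocated vLLM support
        num_generations=4,        # completions sampled per prompt (the GRPO "group")
        per_device_train_batch_size=4,
    )
    return GRPOTrainer(
        model=model_id,
        reward_funcs=reward_brevity,
        args=args,
        train_dataset=load_dataset(dataset_id, split="train"),
    )


# To train (needs a GPU with TRL and vLLM installed):
# build_trainer().train()
```

The reward function is deliberately trivial; in practice it would be swapped for a task-specific one, and the exact vLLM options depend on the installed TRL version.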

🧑‍🍳 recipe: https://huggingface.co/learn/cookbook/grpo_vllm_online_training

SelmaNajih001

16 days ago

Thanks for sharing!
