fix: update `generation_config.json` to default to stochastic sampling (temp 0.15)

#18
by casinca - opened

Hello,

This PR adds the required sampling hyperparameters to `generation_config.json` to enable stochastic sampling (temperature 0.15) rather than greedy decoding.
This way, when users load the Devstral-2-123B-Instruct-2512 model, they automatically get the default sampling settings intended by Mistral.
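For reference, a minimal sketch of what the merged `generation_config.json` could contain (only the temperature is stated in this PR; `do_sample: true` is the flag transformers needs for stochastic sampling, and any other fields already in the file are assumed to stay unchanged):

```json
{
  "do_sample": true,
  "temperature": 0.15
}
```

With these defaults in place, `generate()` should pick them up automatically whenever the user does not pass explicit overrides.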

Motivation: not all users know about these sampling hyperparameters or what they do. Defaulting to what Mistral recommends could reduce complaints about perceived poor generations / model performance.

Since the file had `"_from_model_config": true` (i.e. the generation config was derived from the model config), I checked `config.json` first, but no sampling hyperparameters were declared there, hence my changes here.

Opened as a separate PR in case you want to keep greedy decoding by default; this is linked to: https://huggingface.co/mistralai/Devstral-2-123B-Instruct-2512/discussions/9
(A temperature this low is close to greedy, but the decoding is still stochastic nonetheless.)

Mistral AI_ org

Thanks !

juliendenize changed pull request status to merged
