FYI, there is a new tool available to you - you can now search the hparam space of run_eval.py.
It’s called run_eval_search.py.
It uses the same arguments as run_eval.py, but allows you to parametrize the hparams, so in addition to the normal args you can pass:
--search="num_beams=8:11:15 length_penalty=0.9:1.0:1.1 early_stopping=true:false"
and it’ll search all the possible combinations and at the end print a table of results sorted by the task’s score, e.g.:
bleu  | num_beams | length_penalty | early_stopping
----- | --------- | -------------- | --------------
41.35 |        11 |            1.1 |              0
41.33 |        11 |            1.0 |              0
41.33 |        11 |            1.1 |              1
41.32 |        15 |            1.1 |              0
41.29 |        15 |            1.1 |              1
41.28 |        15 |            1.0 |              0
41.25 |         8 |            1.1 |              0
41.24 |        11 |            1.0 |              1
41.23 |        11 |            0.9 |              0
41.20 |        15 |            1.0 |              1
41.18 |         8 |            1.0 |              0
You can search over one or more params.
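Under the hood, each param’s values get combined with every other param’s values, i.e. the full Cartesian product is evaluated. Here is a minimal sketch of that expansion, with a hypothetical expand_search helper - just an illustration, not the script’s actual implementation:

# rough illustration of how a --search string could expand into hparam combinations
# (hypothetical helper, not the actual run_eval_search.py code)
from itertools import product

def expand_search(search: str):
    """Turn 'num_beams=8:11 length_penalty=0.9:1.1' into a list of hparam dicts."""
    names, value_lists = [], []
    for group in search.split():
        name, values = group.split("=")
        names.append(name)
        value_lists.append(values.split(":"))
    # the Cartesian product of all value lists yields every combination to evaluate
    return [dict(zip(names, combo)) for combo in product(*value_lists)]

for hparams in expand_search("num_beams=8:11 length_penalty=0.9:1.1 early_stopping=true:false"):
    print(hparams)  # each combination would correspond to one run_eval.py run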
Here is an example of a full command:
PYTHONPATH="src:examples/seq2seq" python examples/seq2seq/run_eval_search.py \
facebook/wmt19-$PAIR $DATA_DIR/val.source $SAVE_DIR/test_translations.txt \
--reference_path $DATA_DIR/val.target --score_path $SAVE_DIR/test_bleu.json \
--bs $BS --task translation \
--search="num_beams=1:5 length_penalty=0.9:1.1 early_stopping=true:false"
If you encounter any issues, please let me know.
It’s documented here: https://github.com/huggingface/transformers/blob/master/examples/seq2seq/README.md#run_eval-tips-and-tricks. @sshleifer and I added some more goodies to run_eval.py - you will find them all documented at that URL.
Enjoy.
P.S. Edited to remove things that are going to change based on Sam’s comment below.