AutoProcessor.from_pretrained(model_id, language="pa", task="transcribe") - Error - Transformers does not recognize this architecture - model type `stt`

#10
by jssaluja - opened
     92 # Load processor
     93 model_id = "kyutai/stt-1b-en_fr"
---> 94 processor = AutoProcessor.from_pretrained(model_id, language="pa", task="transcribe")
     95
     96 # Compute dynamic max_length from training dataset only

/usr/local/lib/python3.11/dist-packages/transformers/models/auto/processing_auto.py in from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
    345         # Next, let's check whether the processor class is saved in a tokenizer
    346         tokenizer_config_file = cached_file(
--> 347             pretrained_model_name_or_path, TOKENIZER_CONFIG_FILE, **cached_file_kwargs
    348         )
    349         if tokenizer_config_file is not None:

/usr/local/lib/python3.11/dist-packages/transformers/models/auto/configuration_auto.py in from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
   1170                 f"The function {fn} should have an empty 'List options' in its docstring as placeholder, current"
   1171                 f" docstring is:\n{docstrings}"
-> 1172             )
   1173         fn.__doc__ = docstrings
   1174         return fn

ValueError: The checkpoint you are trying to load has model type stt but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.
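The `stt` model type in the error comes straight from the checkpoint's config.json. A quick sanity check along these lines confirms that and the installed transformers version (a small diagnostic sketch, assuming huggingface_hub is available in the environment):

```python
# Sketch: inspect the checkpoint's declared model_type and the local transformers version.
import json

import transformers
from huggingface_hub import hf_hub_download

config_path = hf_hub_download(repo_id="kyutai/stt-1b-en_fr", filename="config.json")
with open(config_path) as f:
    config = json.load(f)

print("model_type:", config.get("model_type"))  # expected to print "stt", per the error above
print("transformers:", transformers.__version__)  # "stt" is not a registered transformers architecture (see the reply below)
```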

Kyutai org

This repo is not meant to be used with transformers but rather with our production codebase; you can find more details in the main repo. There is also a transformers implementation, kyutai/stt-1b-en_fr-trfs, but it is less efficient and less well supported than our own.
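For reference, a minimal usage sketch of that transformers port, assuming a transformers release recent enough to include the Kyutai STT integration (check the kyutai/stt-1b-en_fr-trfs model card for the exact classes and minimum version). Note the model covers English and French only, so Whisper-style arguments such as language="pa" and task="transcribe" don't apply here:

```python
# Sketch, not an official example: transcribe with the transformers port.
# Class names assume a transformers release that ships the Kyutai STT integration;
# check the kyutai/stt-1b-en_fr-trfs model card if they differ in your version.
from datasets import Audio, load_dataset
from transformers import (
    KyutaiSpeechToTextForConditionalGeneration,
    KyutaiSpeechToTextProcessor,
)

model_id = "kyutai/stt-1b-en_fr-trfs"  # note the -trfs suffix
processor = KyutaiSpeechToTextProcessor.from_pretrained(model_id)
model = KyutaiSpeechToTextForConditionalGeneration.from_pretrained(model_id)

# Any audio works; here a small public ASR sample, resampled to 24 kHz (the Mimi codec rate).
ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
ds = ds.cast_column("audio", Audio(sampling_rate=24000))

inputs = processor(ds[0]["audio"]["array"]).to(model.device)
tokens = model.generate(**inputs)
print(processor.batch_decode(tokens, skip_special_tokens=True))
```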

@lmz Thank you for the details.
Can you point me to a Python script / notebook for fine-tuning Kyutai STT on a HF dataset?

Kyutai org

I don't know of such a script at the moment. If you get one to work, that would be welcome.
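For anyone attempting to build one: a rough, untested starting point for the data-preparation half with the transformers port might look like the sketch below. The dataset name and columns are placeholders, and the actual training loop (batching, and aligning labels with the model's delayed audio/text streams) is the open part.

```python
# Untested sketch of dataset preparation for the transformers port; the training
# loop itself is intentionally omitted. Dataset name and columns are placeholders.
from datasets import Audio, load_dataset
from transformers import KyutaiSpeechToTextProcessor

model_id = "kyutai/stt-1b-en_fr-trfs"
processor = KyutaiSpeechToTextProcessor.from_pretrained(model_id)

# Placeholder: any HF dataset with "audio" and "text" columns.
ds = load_dataset("your-username/your-asr-dataset", split="train")
ds = ds.cast_column("audio", Audio(sampling_rate=24000))  # Mimi codec expects 24 kHz

def prepare(example):
    # Keep whatever feature keys the processor returns (names vary between speech models).
    features = processor(example["audio"]["array"])
    out = {k: v[0] for k, v in features.items()}
    # Target token ids from the transcript, assuming the processor exposes its
    # tokenizer as .tokenizer (the usual transformers convention).
    out["labels"] = processor.tokenizer(example["text"])["input_ids"]
    return out

ds = ds.map(prepare, remove_columns=ds.column_names)
```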
