AutoProcessor.from_pretrained(model_id, language="pa", task="transcribe") - Error - Transformers does not recognize this architecture - model type `stt`

#10
by jssaluja - opened
     92 # Load processor
     93 model_id = "kyutai/stt-1b-en_fr"
---> 94 processor = AutoProcessor.from_pretrained(model_id, language="pa", task="transcribe")
     95
     96 # Compute dynamic max_length from training dataset only

/usr/local/lib/python3.11/dist-packages/transformers/models/auto/processing_auto.py in from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
    345         # Next, let's check whether the processor class is saved in a tokenizer
    346         tokenizer_config_file = cached_file(
--> 347             pretrained_model_name_or_path, TOKENIZER_CONFIG_FILE, **cached_file_kwargs
    348         )
    349         if tokenizer_config_file is not None:

/usr/local/lib/python3.11/dist-packages/transformers/models/auto/configuration_auto.py in from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
   1170                 f"The function {fn} should have an empty 'List options' in its docstring as placeholder, current"
   1171                 f" docstring is:\n{docstrings}"
-> 1172             )
   1173         fn.__doc__ = docstrings
   1174         return fn

ValueError: The checkpoint you are trying to load has model type stt but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.
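The `stt` model type in the error comes straight from the checkpoint's config.json. A quick sanity check along these lines confirms that and the installed transformers version (a small diagnostic sketch, assuming huggingface_hub is available in the environment):

```python
# Sketch: inspect the checkpoint's declared model_type and the local transformers version.
import json

import transformers
from huggingface_hub import hf_hub_download

config_path = hf_hub_download(repo_id="kyutai/stt-1b-en_fr", filename="config.json")
with open(config_path) as f:
    config = json.load(f)

print("model_type:", config.get("model_type"))  # expected to print "stt", per the error above
print("transformers:", transformers.__version__)  # "stt" is not a registered transformers architecture (see the reply below)
```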

Kyutai org

This repo is not meant to be used with transformers but rather with our production codebase; you can find more details in the main repo. There is also a transformers implementation, kyutai/stt-1b-en_fr-trfs, but it is less efficient and less well supported than our own.
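For reference, a minimal usage sketch of that transformers port, assuming a transformers release recent enough to include the Kyutai STT integration (check the kyutai/stt-1b-en_fr-trfs model card for the exact classes and minimum version). Note the model covers English and French only, so Whisper-style arguments such as language="pa" and task="transcribe" don't apply here:

```python
# Sketch, not an official example: transcribe with the transformers port.
# Class names assume a transformers release that ships the Kyutai STT integration;
# check the kyutai/stt-1b-en_fr-trfs model card if they differ in your version.
from datasets import Audio, load_dataset
from transformers import (
    KyutaiSpeechToTextForConditionalGeneration,
    KyutaiSpeechToTextProcessor,
)

model_id = "kyutai/stt-1b-en_fr-trfs"  # note the -trfs suffix
processor = KyutaiSpeechToTextProcessor.from_pretrained(model_id)
model = KyutaiSpeechToTextForConditionalGeneration.from_pretrained(model_id)

# Any audio works; here a small public ASR sample, resampled to 24 kHz (the Mimi codec rate).
ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
ds = ds.cast_column("audio", Audio(sampling_rate=24000))

inputs = processor(ds[0]["audio"]["array"]).to(model.device)
tokens = model.generate(**inputs)
print(processor.batch_decode(tokens, skip_special_tokens=True))
```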

@lmz Thank you for the details.
Can you point me to a Python script / notebook for fine-tuning Kyutai STT on a HF dataset?

Kyutai org

I don't know of such a script at the moment. If you get one to work, that would be welcome.
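For anyone attempting to build one: a rough, untested starting point for the data-preparation half with the transformers port might look like the sketch below. The dataset name and columns are placeholders, and the actual training loop (batching, and aligning labels with the model's delayed audio/text streams) is the open part.

```python
# Untested sketch of dataset preparation for the transformers port; the training
# loop itself is intentionally omitted. Dataset name and columns are placeholders.
from datasets import Audio, load_dataset
from transformers import KyutaiSpeechToTextProcessor

model_id = "kyutai/stt-1b-en_fr-trfs"
processor = KyutaiSpeechToTextProcessor.from_pretrained(model_id)

# Placeholder: any HF dataset with "audio" and "text" columns.
ds = load_dataset("your-username/your-asr-dataset", split="train")
ds = ds.cast_column("audio", Audio(sampling_rate=24000))  # Mimi codec expects 24 kHz

def prepare(example):
    # Keep whatever feature keys the processor returns (names vary between speech models).
    features = processor(example["audio"]["array"])
    out = {k: v[0] for k, v in features.items()}
    # Target token ids from the transcript, assuming the processor exposes its
    # tokenizer as .tokenizer (the usual transformers convention).
    out["labels"] = processor.tokenizer(example["text"])["input_ids"]
    return out

ds = ds.map(prepare, remove_columns=ds.column_names)
```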
