FileNotFoundError when loading pretrained whisper-large-v3 via from_hparams

#1
by Herofis - opened

Hello,

I get a FileNotFoundError when loading a pretrained model (Elyadata/ADI-whisper-ADI20) that depends on openai/whisper-large-v3. The SpeechBrain Hugging Face integration does not seem to find the checkpoint files for whisper-large-v3.

import torch
from speechbrain.lobes.models.huggingface_whisper import WhisperDialectClassifier

# Use the GPU when available, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

dialect_id = WhisperDialectClassifier.from_hparams(
    source="Elyadata/ADI-whisper-ADI20",
    hparams_file="hyperparams.yaml",
    savedir="pretrained_DID/tmp"
).to(device)

dialect_id.device = device

INFO:speechbrain.utils.fetching:Fetch hyperparams.yaml: Using symlink found at '/content/ADI-20/pretrained_DID/tmp/hyperparams.yaml'
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
/tmp/ipython-input-2547517891.py in <cell line: 0>()
      6 # Note: You might need the actual hparams.yaml and pretrained model files
      7 # to be present in the specified directories.
----> 8 dialect_id = WhisperDialectClassifier.from_hparams(
      9     source="Elyadata/ADI-whisper-ADI20",
     10     hparams_file="hyperparams.yaml", # This path is relative to the model/repo structure

<...omitted frames...>

/usr/local/lib/python3.12/dist-packages/speechbrain/integrations/huggingface/huggingface.py in _check_model_source(self, path, save_path)
    300 
    301         err_msg = f"{path} does not contain a .bin, .safetensors or .ckpt checkpoint !"
--> 302         raise FileNotFoundError(err_msg)
    303 
    304     def _modify_state_dict(self, path, **kwargs):

FileNotFoundError: openai/whisper-large-v3 does not contain a .bin, .safetensors or .ckpt checkpoint !
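
To narrow this down, here is a minimal sketch of the kind of extension check the error message describes. It is not SpeechBrain's actual _check_model_source code. The function name find_checkpoints and the two sample file lists are hypothetical, made up only to illustrate the idea.

```python
from pathlib import PurePosixPath

# Extensions named in the error message above.
CHECKPOINT_SUFFIXES = (".bin", ".safetensors", ".ckpt")


def find_checkpoints(filenames):
    """Return the file names whose final extension is a checkpoint suffix."""
    return [
        name
        for name in filenames
        if PurePosixPath(name).suffix in CHECKPOINT_SUFFIXES
    ]


# Hypothetical listings, for illustration only (not the real repo contents).
with_weights = ["config.json", "model.safetensors", "tokenizer.json"]
without_weights = ["config.json", "tokenizer.json"]

print(find_checkpoints(with_weights))
print(find_checkpoints(without_weights))
```

Running the same function on the real file list, for example the names returned by huggingface_hub.list_repo_files("openai/whisper-large-v3") or the contents of the local save directory, should show whether any checkpoint file is visible from where the loader is looking.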
