The following error occurs after the last change, commit d99b7e342d11f177f80f7feb121a14f4cc0728bd

#40 · opened by zaidma

RuntimeError: Error(s) in loading state_dict for Embedding: size mismatch for weight: copying a param with shape torch.Size([1026, 768]) from checkpoint, the shape in current model is torch.Size([8192, 768]).
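The failure mode can be reproduced in isolation: PyTorch's strict `load_state_dict` rejects any parameter whose shape differs between checkpoint and model. A minimal sketch (a bare `nn.Embedding` stands in for the reranker's position-embedding layer; the shapes are the ones from the traceback):

```python
import torch

# Checkpoint weights were saved with 1026 positions,
# but the updated config instantiates the model with 8192.
checkpoint = torch.nn.Embedding(1026, 768)  # shape stored in the checkpoint
model = torch.nn.Embedding(8192, 768)       # shape the new config produces

try:
    model.load_state_dict(checkpoint.state_dict())  # strict=True by default
except RuntimeError as err:
    print(err)  # size mismatch for weight: ...
```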

zaidma changed the discussion title from the German "Der folgende Fehler tritt nach der letzten Änderung (Commit d99b7e342d11f177f80f7feb121a14f4cc0728bd) auf." to its English translation, "The following error occurs after the last change, commit d99b7e342d11f177f80f7feb121a14f4cc0728bd"

I have the same error after the last update (commit d99b7e342d11f177f80f7feb121a14f4cc0728bd).

I changed "max_position_embeddings" back to 1026, so it works now.
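The workaround above is a one-line edit in the model's `config.json`, restoring the value the checkpoint weights were actually saved with:

```json
{
  "max_position_embeddings": 1026
}
```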

Yes, it's only been changed in the config, while the weights only exist for a sequence length of 1026. When we ignore mismatched sizes, we get a bit more info:

Some weights of XLMRobertaForSequenceClassification were not initialized from the model checkpoint at jinaai/jina-reranker-v2-base-multilingual and are newly initialized because the shapes did not match:
- roberta.embeddings.position_embeddings.weight: found shape torch.Size([1026, 768]) in the checkpoint and torch.Size([8192, 768]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
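The warning above can be reproduced offline without the actual checkpoint. A sketch under the assumption that a toy BERT behaves like the reranker's XLM-RoBERTa backbone here: save a small model, enlarge `max_position_embeddings` in the config only, then reload with `ignore_mismatched_sizes=True`:

```python
import tempfile
from transformers import BertConfig, BertModel

# Tiny stand-in model (assumption: not the real reranker checkpoint).
config = BertConfig(vocab_size=100, hidden_size=32, num_hidden_layers=1,
                    num_attention_heads=2, intermediate_size=64,
                    max_position_embeddings=64)
model = BertModel(config)

with tempfile.TemporaryDirectory() as ckpt_dir:
    model.save_pretrained(ckpt_dir)       # weights saved for 64 positions
    config.max_position_embeddings = 128  # config now claims 128 positions
    config.save_pretrained(ckpt_dir)      # overwrite config.json only

    # Strict loading would raise the size-mismatch RuntimeError;
    # ignore_mismatched_sizes=True freshly initializes the offending
    # weight instead, which is why the TRAIN warning is emitted.
    reloaded = BertModel.from_pretrained(ckpt_dir, ignore_mismatched_sizes=True)
    print(reloaded.embeddings.position_embeddings.weight.shape)
    # torch.Size([128, 32])
```

The newly initialized rows are random, so the longer context is unusable until the model is fine-tuned, which is exactly what the warning says.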

It seems interesting that this occurs on the position embedding layer, not the word embeddings...

@numb3r3 I have reverted your last three commits from 17 hours ago. We need to look at what's wrong with those.

Jina AI org

Got it. I will raise a new PR later to fix the issue with the empty passage.

numb3r3 changed discussion status to closed
