Issues with config file

by Authentic1957 - opened May 18, 2024

Authentic1957

May 18, 2024

In config.json, rms_norm is set to True, but the pre-trained checkpoint contains bias parameters such as backbone.layers.1.norm.bias. RMSNorm layers have no bias term, so rms_norm needs to be set to False. Also, MambaConfig in config_mamba.py has no pad_id field, so the "pad_id": 0 entry needs to be removed from config.json.

Note: All changes were made with mamba-ssm==1.2.0. After that, you can load the pre-trained model and run it without any problem.
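For anyone who wants to script the two config changes described above, here is a minimal sketch. It assumes config.json is a flat JSON object with rms_norm and pad_id as top-level keys; the helper names patch_config and patch_config_file are made up for illustration and are not part of mamba-ssm.

```python
import json
from pathlib import Path


def patch_config(cfg: dict) -> dict:
    """Return a copy of cfg with the two changes from this post applied."""
    patched = dict(cfg)  # shallow copy so the caller's dict is left untouched
    # The checkpoint has norm.bias tensors, so RMSNorm must be disabled.
    patched["rms_norm"] = False
    # MambaConfig (mamba-ssm==1.2.0) has no pad_id field, so drop the key.
    patched.pop("pad_id", None)
    return patched


def patch_config_file(path: str) -> None:
    """Hypothetical helper: rewrite a config.json file in place."""
    p = Path(path)
    cfg = json.loads(p.read_text())
    p.write_text(json.dumps(patch_config(cfg), indent=2))
```

Calling patch_config_file on the downloaded config.json before loading the model should have the same effect as editing the file by hand.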

Also, the input_ids line in the example code might be better changed to:

input_ids = torch.from_numpy(text_byte[None, :].copy()).long().cuda()
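A short sketch of why the .copy() helps, assuming text_byte comes from np.frombuffer over the encoded string (that part is my assumption about the example code). np.frombuffer returns a read-only view of the bytes object, and torch.from_numpy warns when given a non-writable array; copying produces a writable array. The helper name encode_bytes is made up for illustration.

```python
import numpy as np


def encode_bytes(text: str) -> np.ndarray:
    """Return a writable (1, n) uint8 array holding the UTF-8 bytes of text."""
    # np.frombuffer gives a read-only view over the immutable bytes object.
    text_byte = np.frombuffer(text.encode("utf-8"), dtype=np.uint8)
    # Adding a batch axis keeps it a read-only view; .copy() makes it writable.
    return text_byte[None, :].copy()


batch = encode_bytes("hello")

try:
    import torch

    # Same conversion as the line above, with the GPU move made optional.
    input_ids = torch.from_numpy(batch).long()
    if torch.cuda.is_available():
        input_ids = input_ids.cuda()
except ImportError:
    input_ids = None  # torch not installed; the NumPy part still runs
```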

It is a great paper! Thanks to all the authors for this wonderful research.

JunxiongWang

Owner May 18, 2024

Thanks a lot! That really helps.
