Issues with config file
#1, opened by Authentic1957
In config.json, rms_norm is set to true, but the pretrained checkpoint contains bias parameters such as backbone.layers.1.norm.bias, so rms_norm needs to be set to false. Also, MambaConfig in config_mamba.py has no pad_id field, so the "pad_id": 0 entry needs to be removed from config.json.
Note: these changes were made against mamba-ssm==1.2.0. With them applied, the pretrained model loads and runs without any problem.
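For anyone who prefers to script the two config edits, here is a minimal sketch. The helper name patch_config and the d_model value in the example dict are hypothetical, made up for illustration; only the rms_norm and pad_id keys come from the description above.

```python
import json


def patch_config(cfg: dict) -> dict:
    """Return a copy of cfg with the two edits described above applied."""
    patched = dict(cfg)
    # The checkpoint has norm.bias tensors, so the norm layers are not RMSNorm.
    patched["rms_norm"] = False
    # MambaConfig (mamba-ssm==1.2.0) has no pad_id field, so drop the key.
    patched.pop("pad_id", None)
    return patched


# Hypothetical stand-in for the contents of config.json.
example = {"d_model": 1024, "rms_norm": True, "pad_id": 0}
patched_example = patch_config(example)
print(json.dumps(patched_example, indent=2))
```

To apply it to the real file, load config.json with json.load, pass the dict through patch_config, and write the result back with json.dump.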
Also, the input_ids line in the example code would be better written as:
input_ids = torch.from_numpy(text_byte[None, :].copy()).long().cuda()
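One likely reason the .copy() helps, assuming text_byte is built with np.frombuffer from an encoded string (an assumption, since the example code is not quoted here): such an array is read-only, and torch.from_numpy warns when given a non-writable array. The sketch below uses NumPy only, with a hypothetical helper name, to show that the copy yields a writable batch.

```python
import numpy as np


def to_writable_batch(text: str) -> np.ndarray:
    """Encode text to bytes and return a writable array of shape (1, length)."""
    # Assumption: this mirrors how text_byte is created in the example code.
    text_byte = np.frombuffer(text.encode("utf-8"), dtype=np.uint8)
    # Arrays backed by an immutable bytes object are read-only.
    assert not text_byte.flags.writeable
    # Add a batch dimension, then copy so the result owns writable memory.
    return text_byte[None, :].copy()


batch = to_writable_batch("hello")
print(batch.shape, batch.flags.writeable)
```

The resulting array can then be passed to torch.from_numpy(...).long().cuda() as in the line above.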
It is a great paper! Thanks to all the authors for this wonderful research.
Thanks a lot! That really helps.