Fix tokenizer
#16
by
stephantulkens
- opened
Hello! The tokenizer used contains an incorrect, redundant pretokenizer. This can lead downstream tools to believe that pretokenization (e.g., splitting) is happening when it is not. Would you accept PRs for this?
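For illustration, here is a minimal sketch of the problem, assuming a `tokenizer.json`-style config (the fragment below is hypothetical, not the actual Gemma config): downstream tools often only check whether a `pre_tokenizer` entry is present, so a redundant entry makes them believe splitting occurs even when it does not.

```python
# Hypothetical minimal tokenizer-config fragment illustrating the issue:
# a "pre_tokenizer" entry is present even though the tokenizer does not
# actually rely on it to split text before encoding.
config = {
    "model": {"type": "BPE", "vocab": {}, "merges": []},
    "pre_tokenizer": {"type": "Whitespace"},  # redundant entry
}

def declares_pretokenization(cfg):
    # Downstream tools often just check for this key.
    return cfg.get("pre_tokenizer") is not None

print(declares_pretokenization(config))  # True, despite no real splitting

# The fix is simply to drop (or null out) the redundant entry:
config["pre_tokenizer"] = None
print(declares_pretokenization(config))  # False
```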
Hi @stephantulkens ,
Welcome to Google's Gemma family of open source models, and thanks for bringing this to our attention. Yes, Gemma models are open source and we accept community contributions. Please raise a PR for the changes with the necessary details; once it's reviewed, the PR will be merged.
Thanks.
OK, thanks! I've made a PR separately.
stephantulkens
changed discussion status to
closed