Fix tokenization: add simple_tok parameter (default=True) to match the original errant script
a2eb45d (verified), marksverdhei committed 22 days ago