Article: There is no such thing as a tokenizer-free lunch. By catherinearnett • 30 days ago • 83
I never understood how UTF-8 byte-level "tokenizer-free" modeling can be considered natural when it spends 1 byte per character on English but typically 2 bytes per character on Greek, Cyrillic, etc.
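To make the point concrete, here is a minimal sketch that measures the average UTF-8 bytes per character for a few short strings. The helper name `bytes_per_char` and the sample words are my own illustrations, not something from the article; they are only meant to show the imbalance, not to be a benchmark.

```python
def bytes_per_char(text: str) -> float:
    """Average number of UTF-8 bytes per Unicode code point in `text`."""
    if not text:
        raise ValueError("text must be non-empty")
    return len(text.encode("utf-8")) / len(text)


# Illustrative sample words (assumed examples, chosen by hand).
samples = {
    "English": "hello",
    "Greek": "γεια",
    "Russian": "привет",
}

for language, word in samples.items():
    n_bytes = len(word.encode("utf-8"))
    print(f"{language}: {len(word)} chars, {n_bytes} bytes, "
          f"{bytes_per_char(word):.1f} bytes/char")
```

So a byte-level model sees roughly twice as many input positions for the Greek or Russian word as it would for an ASCII word with the same number of letters, which is the asymmetry I am complaining about.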