shredword
Public
Fast & efficient BPE tokenizer written in C & Python for LLM training
natural-language-processing, subword-segmentation, subword-tokenization, tiktoken, tokenization, tokenizer, word-segmentation
Created:2024-08-29T13:34:00
Updated:2025-05-27T05:47:55
Stars:0
Stars Increase:0