megabyte
Public
A PyTorch implementation of MEGABYTE, a multi-scale transformer architecture that is tokenization-free and uses sub-quadratic attention. Paper: https://arxiv.org/abs/2305.07185
Created: 2023-06-01T22:08:51
Updated: 2025-03-20T09:25:42
Stars: 7
Stars Increase: 0
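
To make "tokenization-free" and "sub-quadratic attention" in the description above concrete, here is a minimal sketch in plain Python. It is not taken from this repository: the function names `bytes_to_patches` and `attention_cost`, the padding value, and the simplified cost model (counting pairwise attention interactions only) are assumptions made for illustration. The idea it illustrates is that raw bytes serve directly as input IDs, the sequence is split into fixed-size patches, and attention runs across patches and within each patch rather than over the whole byte sequence at once.

```python
def bytes_to_patches(text: str, patch_size: int, pad_id: int = 0) -> list[list[int]]:
    """Encode text as raw UTF-8 bytes (no tokenizer) and split into fixed-size patches.

    pad_id is a hypothetical padding value used to fill the last patch.
    """
    data = list(text.encode("utf-8"))
    remainder = len(data) % patch_size
    if remainder:
        data += [pad_id] * (patch_size - remainder)
    return [data[i:i + patch_size] for i in range(0, len(data), patch_size)]


def attention_cost(seq_len: int, patch_size: int) -> tuple[int, int]:
    """Compare pairwise-interaction counts (a simplified cost model).

    Returns (full, patched), where:
      full    = seq_len^2, attention over every byte pair
      patched = attention across patches + attention inside each patch
    Assumes seq_len is a multiple of patch_size.
    """
    full = seq_len ** 2
    num_patches = seq_len // patch_size
    across_patches = num_patches ** 2
    within_patches = num_patches * patch_size ** 2
    return full, across_patches + within_patches


if __name__ == "__main__":
    print(bytes_to_patches("hello", patch_size=4))
    print(attention_cost(seq_len=1024, patch_size=32))
```

Under this cost model, choosing a patch size near the square root of the sequence length keeps both terms small, which is why the patched variant grows more slowly than the quadratic baseline.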