Mega-pytorch
Implementation of Mega, the Single-head Attention with Multi-headed EMA architecture that currently holds SOTA on Long Range Arena
Created: 2022-09-24T04:40:57
Updated: 2024-12-26T11:09:43
Stars: 203
Stars Increase: 0