mixture-of-experts
A PyTorch implementation of Sparsely-Gated Mixture of Experts, for massively increasing the parameter count of language models
Created: 2020-07-14T02:23:35
Updated: 2025-03-27T11:46:42
Stars: 789 (+1)
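The core idea behind sparsely-gated Mixture of Experts (Shazeer et al., 2017) is that a gating network scores all experts per input, but only the top-k experts are evaluated, with their outputs combined by renormalized softmax weights. The sketch below is a minimal dependency-free illustration of that top-k routing logic, not code from this repository; the function names and toy experts are hypothetical.

```python
import math

def top_k_gating(logits, k):
    # Keep only the k highest-scoring experts; softmax over just those,
    # so all other experts get exactly zero weight (sparse routing).
    topk = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    exps = {i: math.exp(logits[i]) for i in topk}
    z = sum(exps.values())
    return [exps.get(i, 0.0) / z for i in range(len(logits))]

def moe_forward(x, experts, gate_logits, k=2):
    # Only experts with nonzero gate weight are actually evaluated,
    # which is what makes the parameter count cheap to scale up.
    gates = top_k_gating(gate_logits, k)
    return sum(g * expert(x) for g, expert in zip(gates, experts) if g > 0.0)

# Toy demo: four scalar "experts", of which only k=2 are active per input.
experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: x ** 2, lambda x: -x]
out = moe_forward(3.0, experts, gate_logits=[2.0, 1.0, -1.0, 0.0], k=2)
```

In the real PyTorch implementation the gate logits come from a learned linear layer over the token representation, and an auxiliary load-balancing loss keeps tokens from collapsing onto a few experts.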