
mamba-slm-hybrid-optimizer

Public

A small language model implementation based on the Mamba (SSM) architecture. It applies the Muon optimizer to the 2D weight matrices and uses the stable AdamW optimizer for all other parameters.
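The split described above can be sketched as a simple rule on tensor rank. This is a minimal illustration only, not the repository's actual code: the function name `split_params_by_rank` and the parameter names and shapes in the example are hypothetical, and the sketch assumes that "2D weight matrices" means every parameter whose shape has exactly two dimensions.

```python
from typing import Dict, List, Tuple


def split_params_by_rank(
    shapes: Dict[str, Tuple[int, ...]],
) -> Tuple[List[str], List[str]]:
    """Partition parameter names into (muon_names, adamw_names).

    Assumption: a parameter goes to the Muon group if and only if its
    shape has exactly two dimensions; everything else (biases, norm
    scales, conv kernels, ...) goes to the AdamW group.
    """
    muon_names: List[str] = []
    adamw_names: List[str] = []
    for name, shape in shapes.items():
        if len(shape) == 2:
            muon_names.append(name)
        else:
            adamw_names.append(name)
    return muon_names, adamw_names


# Hypothetical parameter names and shapes, for illustration only.
example_shapes = {
    "mixer.in_proj.weight": (512, 256),   # 2D -> Muon group
    "mixer.conv1d.weight": (512, 1, 4),   # 3D -> AdamW group
    "mixer.conv1d.bias": (512,),          # 1D -> AdamW group
    "mixer.out_proj.weight": (256, 512),  # 2D -> Muon group
    "norm.weight": (256,),                # 1D -> AdamW group
}

muon_group, adamw_group = split_params_by_rank(example_shapes)
```

In a real training script the two name lists would be used to build two separate optimizers (or two parameter groups), one per rule. Whether some 2D tensors, such as embedding tables, should be kept in the AdamW group is a design choice the description does not settle.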

Created: 2025-10-26T10:12:26
Updated: 2025-10-30T09:58:17
Stars: 0
Stars Increase: 0

Related projects