native-sparse-attention-pytorch
Implementation of the sparse attention pattern proposed by the DeepSeek team in their "Native Sparse Attention" paper
Created: 2025-02-19T11:37:52
Updated: 2025-03-27T05:09:21
Stars: 715
Stars Increase: 1