native-sparse-attention-pytorch
Implementation of the sparse attention pattern proposed by the DeepSeek team in their "Native Sparse Attention" paper