spatten
Public
[HPCA'21] SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning
Created: 2023-10-29T03:37:41
Updated: 2025-03-11T09:47:47
https://hanlab.mit.edu/projects/spatten
Stars: 100
Stars Increase: 0