BLAST
Public
[NeurIPS 2024] BLAST: Block Level Adaptive Structured Matrix for Efficient Deep Neural Network Inference
efficient-inference, large-language-models, llama, matrix-factorization, matrix-multiplication, model-compression
Created: 2024-09-28T01:24:51
Updated: 2025-03-05T03:21:20
Stars: 541
Stars Increase: 1