KVzip
Public
Query-agnostic KV cache eviction: 3–4× reduction in memory and 2× decrease in latency (Qwen3/2.5, Gemma3, LLaMA3)
Created: 2025-05-28T15:52:38
Updated: 2025-06-16T17:37:17
Stars: 96
Stars Increase: 0