KVzip
Public
Query-agnostic KV cache eviction: 3–4× reduction in memory and 2× decrease in latency (Qwen3/2.5, Gemma3, LLaMA3)
Created: 2025-05-28T15:52:38
Updated: 2025-07-29T13:36:21
Stars: 93
Stars Increase: 0