llm-runtime
A unified cross-instance scheduling runtime for heterogeneous inference engines (vLLM, SGLang, etc.), supporting multi-level KV cache storage, global prefix-affinity awareness, prefill/decode (PD) disaggregation, and other advanced scheduling features.
Created: 2025-07-08T23:14:57
Updated: 2025-07-09T01:45:13
Stars: 0