llm-runtime
A Unified Cross-Instance Scheduling Runtime for Heterogeneous Inference Engines (vLLM, SGLang, etc.), Supporting Multi-Level KV Cache Storage, Global Prefix Affinity Awareness, PD (Prefill/Decode) Disaggregation, and Other Advanced Intelligent Scheduling Features.