dynamic-batching
Public
The official repo for the paper "Optimizing LLM Inference Throughput via Memory-aware and SLA-constrained Dynamic Batching"
Created: 2025-03-06T15:23:39
Updated: 2025-03-17T16:16:07
Stars: 10
Stars Increase: 0