AIbase / dynamic-batching (Public)

The official repository for the paper "Optimizing LLM Inference Throughput via Memory-aware and SLA-constrained Dynamic Batching".

Created: 2025-03-06T15:23:39
Updated: 2025-03-17T16:16:07
Stars: 10
Stars increase: 0