FineInfer
Public
Deferred Continuous Batching in Resource-Efficient Large Language Model Serving (EuroMLSys 2024)
Created: 2024-02-27T19:31:37
Updated: 2025-03-16T02:05:13
https://dl.acm.org/doi/10.1145/3642970.3655835
Stars: 17
Stars Increase: 0