AIbase

Efficiently-Serving-LLMs

Public

Learn the ins and outs of efficiently serving Large Language Models (LLMs). Dive into optimization techniques, including KV caching and Low-Rank Adapters (LoRA), and gain hands-on experience with Predibase's LoRAX inference server framework.
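As a taste of one technique the course covers, here is a minimal KV-caching sketch in NumPy. It is an illustrative toy, not code from the course: the `KVCache` class and `attention` helper are hypothetical names, and real servers cache per-layer, per-head tensors. The point it shows is that at each decode step only the newest token's key/value pair is computed and appended, while attention reuses all previously cached rows instead of recomputing them.

```python
import numpy as np

def attention(q, K, V):
    # Scaled dot-product attention for a single query vector q
    # against all cached keys K and values V.
    scores = K @ q / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ V

class KVCache:
    """Toy per-sequence key/value cache (hypothetical, for illustration)."""
    def __init__(self, d):
        self.K = np.empty((0, d))
        self.V = np.empty((0, d))

    def append(self, k, v):
        # Store the new token's key/value so later steps reuse them
        # instead of recomputing keys/values for the whole prefix.
        self.K = np.vstack([self.K, k])
        self.V = np.vstack([self.V, v])
        return self.K, self.V

d = 4
rng = np.random.default_rng(0)
cache = KVCache(d)
outputs = []
for step in range(3):
    # Each decode step produces K/V/Q only for the newest token.
    k, v, q = rng.normal(size=(3, d))
    K, V = cache.append(k, v)
    outputs.append(attention(q, K, V))

# The cache now holds one key row and one value row per generated token.
assert cache.K.shape == (3, d)
```

Without the cache, step *t* would recompute keys and values for all *t* prior tokens, making generation quadratic in sequence length; with it, each step does only O(1) new K/V work.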

Created: 2024-03-28T00:36:49
Updated: 2024-06-20T01:15:50
https://www.deeplearning.ai/short-courses/efficiently-serving-llms/
Stars: 17
Stars Increase: 0