Efficiently-Serving-LLMs
Learn the ins and outs of efficiently serving Large Language Models (LLMs). Dive into optimization techniques, including KV caching and Low Rank Adapters (LoRA), and gain hands-on experience with Predibase’s LoRAX framework inference server.
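For a quick flavor of the first technique: KV caching stores the key and value projections of tokens that have already been generated, so each decoding step only computes attention for the newest token instead of re-encoding the whole sequence. Below is a minimal NumPy sketch of the idea; it is illustrative only, not course code, and all names (`attention_step`, `kv_cache`) are hypothetical.

```python
import numpy as np

def attention_step(q, k_new, v_new, kv_cache):
    """One decoding step with a KV cache: append the new token's
    key/value and attend over all cached positions, so past tokens
    are never re-projected."""
    kv_cache["k"].append(k_new)   # cache key for the newest token
    kv_cache["v"].append(v_new)   # cache value for the newest token
    K = np.stack(kv_cache["k"])   # (seq_len, d)
    V = np.stack(kv_cache["v"])   # (seq_len, d)
    scores = K @ q / np.sqrt(q.shape[-1])   # (seq_len,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                 # softmax over positions
    return weights @ V                       # attention output, shape (d,)

# Usage: generate tokens one at a time, reusing the cache each step.
d = 64
cache = {"k": [], "v": []}
for _ in range(5):
    q, k, v = (np.random.randn(d) for _ in range(3))
    out = attention_step(q, k, v, cache)
```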
batch-processing, deep-learning-techniques, inference-optimization, large-scale-deployment, machine-learning-operations, model-acceleration, model-inference-service, model-serving, optimization-techniques, performance-enhancement
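Likewise, Low Rank Adapters (LoRA) add a small trainable update B·A on top of a frozen weight matrix; keeping adapters tiny is what lets a server like LoRAX multiplex many fine-tuned variants over one base model. A minimal sketch of the math, assuming NumPy and a hypothetical `LoRALinear` class:

```python
import numpy as np

class LoRALinear:
    """Minimal LoRA sketch: frozen base weight W plus a trainable
    low-rank update B @ A, scaled by alpha / r."""
    def __init__(self, W, r=8, alpha=16):
        d_out, d_in = W.shape
        self.W = W                                # frozen base weight
        self.A = np.random.randn(r, d_in) * 0.01  # low-rank factor A
        self.B = np.zeros((d_out, r))             # B starts at zero, so the
        self.scale = alpha / r                    # adapter is initially a no-op

    def __call__(self, x):
        # y = Wx + (alpha/r) * B(Ax); only A and B would be trained.
        return self.W @ x + self.scale * (self.B @ (self.A @ x))

W = np.random.randn(32, 64)
layer = LoRALinear(W)
y = layer(np.random.randn(64))   # output shape (32,)
```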
Created: 2024-03-28T00:36:49
Updated: 2024-06-20T01:15:50
https://www.deeplearning.ai/short-courses/efficiently-serving-llms/
Stars: 17
Stars Increase: 0