dynamic-batching

Public

The official repo for the paper "Optimizing LLM Inference Throughput via Memory-aware and SLA-constrained Dynamic Batching"

inference-serving llm vllm

Created: 2025-03-06T15:23:39

Updated: 2025-03-17T16:16:07

Related projects

Dify

Hot

agent

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.

108182

4 months ago

+156 today

GPT4All

ai-chat

GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.

73873

3 months ago

+6 today

Browser Use

Hot

ai-agents

Make websites accessible for AI agents

66161

5 months ago

+72 today

LLMs From Scratch

Hot

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

59648

10 months ago

+75 today

MetaGPT

agent

The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming

57431

3 months ago

+29 today

LLaMA Factory

Hot

agent

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

54821

3 months ago

+62 today

vLLM

Hot

amd

A high-throughput and memory-efficient inference and serving engine for LLMs

53050

1 year ago

+131 today

AutoGen

Hot

agentic

A programming framework for agentic AI. PyPI: autogen-agentchat, Discord: https://aka.ms/autogen-discord, Office Hour: https://aka.ms/autogen-officehour

47772

1 year ago

+51 today