KVQuant

[NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization

compression efficient-inference efficient-model large-language-models llama llm localllama localllm mistral model-compression

Created: 2024-02-01T01:30:10

Updated: 2025-03-25T10:33:15

https://arxiv.org/abs/2401.18079

394 Stars

Related projects

Transformers

Hot

bert

Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

153615

2 years ago

+136 today

Generative Ai For Beginners

Hot

21 Lessons, Get Started Building with Generative AI https://microsoft.github.io/generative-ai-for-beginners/

102828

11 months ago

+172 today

LLMs From Scratch

Hot

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

80697

1 year ago

+221 today

Gpt4all

ai-chat

GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.

76955

10 months ago

+7 today

Vllm

Hot

amd

A high-throughput and memory-efficient inference and serving engine for LLMs

64909

1 year ago

+236 today

LLaMA Factory

Hot

agent

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

63700

10 months ago

+147 today

Whisper.Cpp

Hot

inference

Port of OpenAI's Whisper model in C/C++

44989

10 months ago

+81 today

TTS

Hot

deep-learning

A deep learning toolkit for Text-to-Speech, battle-tested in research and production

43779

12 months ago

+69 today

ColossalAI

Making large AI models cheaper, faster and more accessible

41288

10 months ago

+6 today

DeepSpeed

billion-parameters

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

40940

10 months ago

+29 today
