Overview of popular AI open-source projects on GitHub
This is the official repo of "QuickLLaMA: Query-aware Inference Acceleration for Large Language Models"
Quantized Attention that achieves speedups of 2.1-3.1x and 2.7-5.1x compared to FlashAttention2 and xformers, respectively, without degrading end-to-end metrics across various models.
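This entry appears to describe SageAttention-style quantized attention. As a rough sketch of the underlying idea only (not the project's fused CUDA kernels, and with function names of my own choosing): quantize Q and K to INT8 with per-tensor scales, compute the score matrix cheaply, and rescale before the softmax.

```python
import torch

def quantize_int8(x: torch.Tensor):
    """Symmetric per-tensor INT8 quantization; returns values and scale."""
    scale = x.abs().max() / 127.0
    return torch.clamp((x / scale).round(), -127, 127).to(torch.int8), scale

def quantized_attention(q, k, v):
    """Attention with Q and K quantized to INT8, scores rescaled before
    the softmax. The INT8 values are cast back to float for the matmul
    here; real kernels run INT8 tensor-core matmuls with INT32 accumulation."""
    d = q.shape[-1]
    q_i8, q_s = quantize_int8(q)
    k_i8, k_s = quantize_int8(k)
    scores = (q_i8.float() @ k_i8.float().transpose(-1, -2)) * (q_s * k_s)
    attn = torch.softmax(scores / d ** 0.5, dim=-1)
    return attn @ v  # V stays in floating point in this sketch

q, k, v = (torch.randn(1, 8, 128, 64) for _ in range(3))  # (batch, heads, seq, head_dim)
print(quantized_attention(q, k, v).shape)  # torch.Size([1, 8, 128, 64])
```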
Unified KV Cache Compression Methods for Auto-Regressive Models
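KV-cache compression spans several strategies; one simple, widely used family evicts cached tokens that have received little attention mass (a heavy-hitter heuristic in the style of H2O). The sketch below illustrates that general idea with hypothetical names; it is not this repository's unified API.

```python
import torch

def evict_kv(k_cache, v_cache, attn_weights, keep: int):
    """Keep only the `keep` cached tokens that received the most
    attention mass, a heavy-hitter style eviction heuristic.

    k_cache, v_cache: (seq, head_dim)
    attn_weights:     (queries, seq) softmax weights from recent steps
    """
    token_scores = attn_weights.sum(dim=0)                 # (seq,)
    keep_idx = token_scores.topk(keep).indices.sort().values
    return k_cache[keep_idx], v_cache[keep_idx]

seq, dim = 1024, 64
k_cache, v_cache = torch.randn(seq, dim), torch.randn(seq, dim)
attn = torch.softmax(torch.randn(16, seq), dim=-1)         # stand-in weights
k_small, v_small = evict_kv(k_cache, v_cache, attn, keep=256)
print(k_small.shape)  # torch.Size([256, 64])
```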
SpargeAttention: A training-free sparse attention method that can accelerate inference for any model.
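A training-free sparse attention method typically derives its sparsity pattern from the inputs at inference time rather than from learned masks. Below is a minimal block-sparse illustration of that general approach; all names are mine, and the mask is applied densely for readability, whereas the real speedup requires a kernel that skips the masked blocks entirely.

```python
import torch

def block_sparse_attention(q, k, v, block=64, keep_ratio=0.25):
    """Predict a block-level mask from mean-pooled Q/K similarity
    (no training involved) and mask out low-similarity blocks."""
    n, d = q.shape
    nb = n // block
    # Coarse block similarity from mean-pooled queries/keys.
    q_blk = q.reshape(nb, block, d).mean(dim=1)            # (nb, d)
    k_blk = k.reshape(nb, block, d).mean(dim=1)
    blk_scores = q_blk @ k_blk.T                           # (nb, nb)
    k_keep = max(1, int(keep_ratio * nb))
    keep = blk_scores.topk(k_keep, dim=-1).indices
    mask = torch.zeros(nb, nb, dtype=torch.bool)
    mask.scatter_(1, keep, True)
    # Expand the block mask to token resolution, then run masked attention.
    token_mask = mask.repeat_interleave(block, 0).repeat_interleave(block, 1)
    scores = (q @ k.T) / d ** 0.5
    scores = scores.masked_fill(~token_mask, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v

q, k, v = (torch.randn(512, 64) for _ in range(3))
print(block_sparse_attention(q, k, v).shape)  # torch.Size([512, 64])
```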
Flux diffusion model implementation using quantized fp8 matmul; the remaining layers use faster half-precision accumulation, making it ~2x faster on consumer devices.
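For context on the fp8 trick, the snippet below simulates it: weights are stored in float8_e4m3 with a per-tensor scale and dequantized to bfloat16 for the matmul. This assumes PyTorch >= 2.1 for the float8 dtypes; true fused fp8 matmuls need hardware support on recent GPUs, which this sketch deliberately avoids.

```python
import torch

def to_fp8(w: torch.Tensor):
    """Store a weight in float8_e4m3fn plus a per-tensor scale."""
    scale = w.abs().max() / 448.0            # 448 = max normal value of e4m3fn
    return (w / scale).to(torch.float8_e4m3fn), scale

def fp8_linear(x, w_fp8, scale, bias=None):
    """Dequantize to bfloat16 and matmul. This simulates the memory-saving
    storage scheme; the repo's fused fp8 kernels and half-precision
    accumulation require GPU support."""
    w = w_fp8.to(torch.bfloat16) * scale.to(torch.bfloat16)
    y = x.to(torch.bfloat16) @ w.T
    return y + bias if bias is not None else y

w = torch.randn(4096, 4096)
w_fp8, s = to_fp8(w)                          # half the bytes of fp16 storage
x = torch.randn(8, 4096)
print(fp8_linear(x, w_fp8, s).shape)          # torch.Size([8, 4096])
```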
[NeurIPS 2024] AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising
Block Transformer: Global-to-Local Language Modeling for Fast Inference (NeurIPS 2024)
A fast and flexible PyTorch inference server that runs locally, on any cloud, or on AI hardware.
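This description matches a LitServe-style serving library, but since the entry names no API, the sketch below uses FastAPI as a deliberate generic stand-in to show what a minimal local PyTorch inference endpoint looks like; the model and route names are placeholders.

```python
# Minimal inference endpoint using FastAPI as a generic stand-in.
# Run with: uvicorn server:app --port 8000   (assuming this file is server.py)
import torch
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
model = torch.nn.Linear(4, 2)        # placeholder model
model.eval()

class Request(BaseModel):
    inputs: list[float]              # expects 4 features

@app.post("/predict")
def predict(req: Request):
    with torch.no_grad():
        x = torch.tensor(req.inputs).unsqueeze(0)
        y = model(x).squeeze(0)
    return {"outputs": y.tolist()}
```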
[ICLR 2025] Palu: Compressing KV-Cache with Low-Rank Projection
xKV: Cross-Layer SVD for KV-Cache Compression
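Both entries above compress the KV cache with low-rank factorizations (Palu within a layer, xKV across layers). A minimal single-matrix sketch of the shared idea: factor a cache with a truncated SVD and keep two skinny factors instead of the full matrix. Names and the rank choice are illustrative.

```python
import torch

def compress_kv_lowrank(cache: torch.Tensor, rank: int):
    """Truncated SVD of a (seq, head_dim) cache: store two skinny
    factors instead of the full matrix."""
    u, s, vh = torch.linalg.svd(cache, full_matrices=False)
    a = u[:, :rank] * s[:rank]          # (seq, rank)
    b = vh[:rank]                       # (rank, head_dim)
    return a, b

seq, dim, rank = 2048, 128, 32
k_cache = torch.randn(seq, dim)
a, b = compress_kv_lowrank(k_cache, rank)
approx = a @ b                           # reconstruct on demand when attending
stored = a.numel() + b.numel()
# Random data compresses poorly; real KV caches have much stronger
# low-rank structure, which is what these methods exploit.
print(f"stored {stored} vs {k_cache.numel()} values, "
      f"relative err {torch.dist(approx, k_cache) / k_cache.norm():.3f}")
```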
Demo code for the CVPR 2023 paper "Sparsifiner: Learning Sparse Instance-Dependent Attention for Efficient Vision Transformers"