MiniGPT-and-DeepSeek-MLA-Multi-Head-Latent-Attention
An efficient and scalable attention module designed to reduce memory usage and improve inference speed in large language models. The Multi-Head Latent Attention (MLA) module is implemented as a drop-in replacement for traditional multi-head attention (MHA).
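The memory savings come from caching a compact latent vector instead of full per-head keys and values. Below is a minimal sketch of that idea (not the repository's actual code); the dimension names (d_model, d_latent, n_heads) are illustrative assumptions, and causal masking and rotary embeddings are omitted for brevity.

```python
# Minimal MLA-style attention sketch (illustrative, not the repo's implementation).
# Keys and values are reconstructed from a shared low-rank latent, so only the
# small latent tensor needs to be cached during autoregressive decoding.
import math
import torch
import torch.nn as nn


class MultiHeadLatentAttention(nn.Module):
    def __init__(self, d_model=512, n_heads=8, d_latent=128):
        super().__init__()
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        self.q_proj = nn.Linear(d_model, d_model, bias=False)
        # Down-project hidden states into a compact latent (this is what gets cached).
        self.kv_down = nn.Linear(d_model, d_latent, bias=False)
        # Up-project the latent back into per-head keys and values.
        self.k_up = nn.Linear(d_latent, d_model, bias=False)
        self.v_up = nn.Linear(d_latent, d_model, bias=False)
        self.out_proj = nn.Linear(d_model, d_model, bias=False)

    def forward(self, x, latent_cache=None):
        B, T, _ = x.shape
        latent = self.kv_down(x)                     # (B, T, d_latent)
        if latent_cache is not None:                 # append new latents at decode time
            latent = torch.cat([latent_cache, latent], dim=1)
        S = latent.size(1)

        q = self.q_proj(x).view(B, T, self.n_heads, self.d_head).transpose(1, 2)
        k = self.k_up(latent).view(B, S, self.n_heads, self.d_head).transpose(1, 2)
        v = self.v_up(latent).view(B, S, self.n_heads, self.d_head).transpose(1, 2)

        attn = (q @ k.transpose(-2, -1)) / math.sqrt(self.d_head)
        attn = attn.softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(B, T, -1)
        return self.out_proj(out), latent            # latent doubles as the new cache
```

Under these assumptions, the per-token cache shrinks from 2 * d_model values (separate K and V) to d_latent values, which is where the inference-time memory reduction comes from.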
Created: 2025-04-08T23:10:50
Updated: 2025-04-09T06:41:35
Stars: 10
Stars Increase: 0