DeepSeek Launches Engram Module: Adding a Conditional Memory Axis to Sparse Large Models for Significant Efficiency Gains
The DeepSeek team has launched the Engram module, which introduces a "conditional memory axis" into sparse large language models to address the computational waste that occurs when traditional Transformers repeatedly recompute the same memorized knowledge. Designed as a complement to mixture-of-experts architectures, the module integrates N-gram embedding lookups into the model, improving efficiency when handling repetitive patterns.
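To make the idea concrete, the following is a minimal, hypothetical sketch of what an N-gram memory lookup layer of this kind could look like: frequent token patterns are hashed into a bucketed embedding table and the retrieved vectors are gated into the hidden states, so repetitive content is served by a cheap lookup rather than by attention and feed-forward compute. All names here (NGramMemory, num_buckets, the rolling hash, the gating) are illustrative assumptions for exposition, not DeepSeek's actual Engram implementation.

```python
# Hypothetical sketch of an N-gram "conditional memory" lookup layer.
# Assumption: frequent n-grams are mapped to hashed embedding buckets and
# injected into the transformer's hidden states via a learned gate.
import torch
import torch.nn as nn


class NGramMemory(nn.Module):
    def __init__(self, d_model: int, num_buckets: int = 1 << 20, n: int = 2):
        super().__init__()
        self.n = n
        self.num_buckets = num_buckets
        # Hashed embedding table: one row per bucket, shared by all n-grams
        # that collide into the same bucket.
        self.table = nn.Embedding(num_buckets, d_model)
        # Learned gate deciding how much memorized content to inject per token.
        self.gate = nn.Linear(d_model, 1)

    def _hash_ngrams(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq_len). Hash the n-gram ending at each position
        # into a bucket id with a simple polynomial rolling hash.
        ids = token_ids.long()
        h = torch.zeros_like(ids)
        for k in range(self.n):
            shifted = torch.roll(ids, shifts=k, dims=1)
            shifted[:, :k] = 0  # positions without a full n-gram fall back to padding
            h = (h * 1_000_003 + shifted) % self.num_buckets
        return h

    def forward(self, token_ids: torch.Tensor, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq_len, d_model) activations from the transformer trunk.
        buckets = self._hash_ngrams(token_ids)      # (batch, seq_len)
        memory = self.table(buckets)                # (batch, seq_len, d_model)
        gate = torch.sigmoid(self.gate(hidden))     # (batch, seq_len, 1)
        # Repetitive patterns are retrieved from the table instead of being
        # recomputed by attention / feed-forward layers.
        return hidden + gate * memory


if __name__ == "__main__":
    layer = NGramMemory(d_model=64, num_buckets=1 << 16, n=2)
    tokens = torch.randint(0, 32000, (2, 16))
    hidden = torch.randn(2, 16, 64)
    print(layer(tokens, hidden).shape)  # torch.Size([2, 16, 64])
```

In this reading, the lookup plays a role analogous to an MoE expert specialized in memorization: deterministic retrieval handles repeated surface patterns, leaving the dense compute budget for genuinely novel context.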