According to an exclusive report by BaiMing Lab, the highly anticipated DeepSeek V4 and Yao Shunyu's new MixFormer model are both scheduled for official release in April 2026. DeepSeek V4 is a multimodal large model developed under the leadership of Liang Wenfeng; after an extended period of refinement, it is expected to deliver significant gains in coding ability and long-term memory. The release aligns with the DeepSeek team's research direction in recent years, particularly in visual content processing and AI-powered search.

Liang Wenfeng's recent research centers on "conditional memory" mechanisms. In January 2026 he published a paper titled "Conditional Memory via Scalable Lookup," laying out the theoretical groundwork, and in December 2025 he published "mHC: Manifold-Constrained Hyper-Connections," which further optimizes the underlying architecture. Both studies target known weaknesses of Transformer models in memory and training stability. Beyond its multimodal processing capabilities, DeepSeek V4 will also be deeply adapted to domestic chips, aiming to become the first flagship model to run entirely on domestic computing power.

Yao Shunyu's new MixFormer model is likewise slated for an April release. Since December 2025, Yao has served as Chief AI Scientist on Tencent's Executive Committee, while also overseeing the AI Infra department and the large language model department. In February 2026 he published a paper introducing CL-bench, a new evaluation benchmark for "contextual learning" that emphasizes long context and Agent usability. According to related reports, the new model will have approximately 3 billion parameters, and Yao's team has prioritized practical applications from the outset rather than competing on parameter count alone.

The two releases have drawn significant market attention and underscore the rapid pace of China's progress in artificial intelligence. Whether through DeepSeek V4's long-term memory capabilities or Tencent's MixFormer's performance on real-task evaluations, both models seek to answer the same question: how can future large models better adapt to production environments?