On April 23, Tencent released and open-sourced the Hunyuan Hy3 preview language model. It is a Mixture-of-Experts (MoE) model that integrates fast and slow thinking, with 295B total parameters and 21B activated parameters, and it supports a maximum context length of 256K. It is the first model trained since Hunyuan was rebuilt and the most capable model in Hunyuan's history, with significant improvements in complex reasoning, instruction following, context learning, coding, and agent capabilities, as well as in inference performance.
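To put the two parameter counts in perspective, the short sketch below computes the share of parameters that are activated, using only the 295B total and 21B activated figures quoted above. The function name is illustrative, and reading "activated" as "used per token" is the usual Mixture-of-Experts convention rather than something the announcement spells out.

```python
# Parameter counts quoted in the announcement, in billions.
TOTAL_PARAMS_B = 295
ACTIVE_PARAMS_B = 21


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the share of parameters that are activated (hypothetical helper)."""
    if total_b <= 0 or not 0 <= active_b <= total_b:
        raise ValueError("expected 0 <= active_b <= total_b and total_b > 0")
    return active_b / total_b


fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"Activated share: {fraction:.1%}")  # prints "Activated share: 7.1%"
```

In other words, roughly one parameter in fourteen is activated, which is the kind of sparsity that lets a large model keep its compute cost closer to that of a much smaller dense one.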

In February 2026, Tencent Hunyuan rebuilt its pre-training and reinforcement learning infrastructure and set out three principles for pursuing practicality:

1. Systematic capabilities: The team does not emphasize "specialization", because even a single application such as a code agent requires deep coordination among many abilities, including reasoning, long-text handling, instruction following, dialogue, coding, and tool use.

2. Authentic evaluation: The team deliberately steps away from public leaderboards on which scores are easy to inflate, and instead evaluates and improves the model's real-world performance through self-built questions, the latest exams, human evaluation, product crowd testing, and other methods.

3. Pursuit of cost-effectiveness: Practicality cannot be separated from commercial viability. Deep co-design of the model architecture and the inference framework significantly reduces task costs, making intelligence both affordable and useful.

Hy3 preview can be seen as the starting point of Hunyuan's rapid push toward practical large models that solve real-world problems.

Yao Shunyu, Chief AI Scientist at Tencent, said that Hy3 preview is the first step in rebuilding the Hunyuan large model. The team hopes that this release and open-sourcing will bring genuine feedback from the open-source community and users, helping improve the practicality of the official version of Hy3. At the same time, the team is continuing to scale up pre-training and reinforcement learning to raise the model's intelligence ceiling, working through deep co-design with many Tencent products to keep improving the model's overall performance in real scenarios, and beginning to explore special model capabilities.

Currently, Hy3 preview is live on Tencent Cloud, Yuanbao, ima, CodeBuddy, WorkBuddy, QQ, QQ Browser, Tencent Docs, and Tencent LeXiang. Several flagship products, including WeChat Official Accounts, Peace Elite, Tencent News, Tencent Stock Selection, Tencent Customer Service, and WeChat Reading, are rolling it out gradually. In addition, Hy3 preview can be integrated with popular open-source agent products such as OpenClaw, OpenCode, and KiloCode, and is listed on TokenHub, Tencent Cloud's large model service platform.

Hy3 preview focuses on comprehensive practicality, with significantly improved agent capabilities

Multiple evaluations show that Hy3 preview's capabilities have improved across the board.

1. Excellent context learning and instruction following capabilities

In real-world production and everyday scenarios, understanding messy, lengthy contexts and following complex, changing rules is the primary challenge for the model. Drawing on Tencent's business scenarios, Tencent Hunyuan proposed CL-bench and CL-bench-Life to evaluate the model's context learning ability, and significantly improved context learning and instruction following in Hy3 preview.

2. Outstanding complex reasoning, with the highest score in China on the Tsinghua University mathematics doctoral qualifying exam

Complex reasoning is the foundation on which the model solves all kinds of problems. Hy3 preview performed outstandingly on difficult science and engineering reasoning tasks such as FrontierScience-Olympiad and IMOAnswerBench, and achieved excellent results on the latest Tsinghua University Qiuzhen College Mathematics Doctoral Qualifying Exam (Spring 2026) and the National High School Biology Competition (CHSBO 2025), demonstrating strong generalizable reasoning.

3. Significant improvements in code and agent capabilities, showing high cost-effectiveness

Coding and agent capabilities are the areas where Hy3 preview improved the most. Thanks to the rebuilt pre-training and reinforcement learning framework and the larger scale of reinforcement learning tasks, Tencent Hunyuan quickly achieved competitive results on mainstream code agent benchmarks such as SWE-Bench Verified and Terminal-Bench 2.0, as well as on mainstream search agent benchmarks such as BrowseComp and WideSearch.

In the digital world, coding tests the model's ability to execute in development environments, while search tests its ability to retrieve, filter, and integrate information in open information spaces. Together, they determine whether the model is truly usable in complex agent scenarios such as OpenClaw. Hy3 preview performed outstandingly on evaluations such as ClawEval and WildClawBench, indicating that its agent capabilities are steadily becoming more comprehensive and practical.

Beyond public leaderboards, Tencent Hunyuan built several internal evaluation sets to assess the model's performance in real development scenarios. The results show that Hy3 preview is strongly competitive on the backend engineering task set Hy-Backend, on Hy-Vibe Bench, which closely mirrors real user development interactions, and on the high-difficulty software engineering task set Hy-SWE Max.

Compared with other open-source models in terms of size and overall agent performance, Hy3 preview is highly cost-effective.

Fully integrated into Tencent's core businesses, with clear gains in multiple flagship AI products

Before its official launch, Hy3 preview was tested in Tencent's major AI businesses and delivered noticeable gains.

In Yuanbao, Hunyuan and Yuanbao were deeply co-designed. On the one hand, the model improved on key metrics such as intent understanding accuracy, text creation quality, and deep search; on the other hand, it was carefully tuned for writing style, expression, emotional intelligence, content organization, and content professionalism. This close collaboration between model and product gives users a more intelligent and more "human-like" interaction experience.

In ima's knowledge base QA and general QA scenarios, tests showed that Hy3 preview excels at processing long texts, especially in retrieval tasks, where its answers score well on accuracy, coverage, and comprehensiveness.

In CodeBuddy and WorkBuddy, Hy3 preview reduced time to first token by 54% and end-to-end duration by 47%, and raised the success rate to over 99.99%. In real user environments, Hy3 preview has stably driven complex agent workflows of up to 495 steps, covering diverse office scenarios such as document processing, data analysis, knowledge retrieval, and MCP toolchain orchestration.

In scenario-specific evaluations of the WeChat Official Accounts AI avatar and AI customer service, Hy3 preview showed a more comprehensive capability upgrade over Hy2. The new model was more mature in understanding user intent, continuing complex contexts, and organizing knowledge. When facing ambiguous questions, short sentences, and multi-turn dialogues, it grasped user needs better and produced clearer, more stable responses. When generating responses from knowledge bases, user memory, and context, it stayed closer to the role of the AI avatar or AI customer service agent and significantly reduced excessive imagination, subjective assumptions, and emotional expressions, bringing the overall interaction experience closer to the goal of "trustworthy, natural, and efficient" responses.

In Peace Elite's AI NPC scenario, the Peace Elite team completed integration and evaluation immediately after Hy3 preview launched, and the overall results were impressive. In out-of-game character role-play, Hy3 preview not only understood character settings accurately but also gave highly relevant, value-adding answers to open-ended questions, creating a more realistic, natural, and immersive conversation experience. In complex in-game battle scenarios, the model's response rhythm was close to that of real players, demonstrating excellent stability and strong human-like role-playing ability.

In Tencent Docs' AI PPT scenario, Hy3 preview improved significantly over the previous version (Hy2): the generation success rate rose by 20%, the evaluation score rose by 10%, and generation time fell by 20%. Overall, the new model performed excellently across the template selection, color matching, outline generation, and content supplementation stages, producing output that was free of hallucinations, on theme, and visually appealing.

In the evaluation of Xiao Q, the QQ AI assistant, Hy3 preview significantly improved on the previous version in first-byte latency for long texts, overall response speed, and streaming output efficiency. In core capabilities, mathematical reasoning improved significantly, and instruction following and generalization across scenarios were further strengthened. In tool-calling reasoning and multi-turn reference resolution, it was more stable and efficient. It also achieved outstanding results in the QQ agent scenario test of OpenClaw's official PinchBench, and the overall experience improved significantly.

Inference efficiency improved by 40%, with the best intelligence density at the same cost

Thanks to deep co-design between the model and the inference framework, together with comprehensive optimization of the inference framework, operator performance, quantization algorithms, and more, overall inference efficiency has improved by 40%, and the cost of Hy3 preview has dropped significantly compared with the previous-generation model.

On TokenHub, Tencent Cloud's large model service platform, Hy3 preview is priced as low as 1.2 yuan per million input tokens and as low as 4 yuan per million output tokens, with cached input priced at 0.4 yuan per million tokens. Tencent Cloud and Hunyuan have also jointly launched a customized Hy3 preview Token Plan package, with the personal edition priced as low as 28 yuan per month, offering a more cost-effective option for agent development and for building "Lobster" applications.
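As a rough illustration of what these prices mean for a single request, the sketch below estimates cost from token counts. It uses only the three per-million-token prices quoted above; the function name, the example token counts, and the assumption that cached tokens are a subset of the input tokens billed at the cache price instead of the input price are illustrative, not taken from TokenHub's billing rules. Because the quoted prices are "as low as" figures, the result is best read as a lower bound.

```python
# Prices quoted in the announcement, in yuan per million tokens.
INPUT_PRICE = 1.2
CACHED_INPUT_PRICE = 0.4
OUTPUT_PRICE = 4.0


def estimate_cost_yuan(input_tokens: int, output_tokens: int,
                       cached_input_tokens: int = 0) -> float:
    """Estimate the cost of one request in yuan (hypothetical helper).

    Assumption: cached_input_tokens is the part of input_tokens that hits
    the cache and is billed at the cache price instead of the input price.
    """
    if min(input_tokens, output_tokens, cached_input_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_input_tokens > input_tokens:
        raise ValueError("cached tokens cannot exceed input tokens")
    fresh_input_tokens = input_tokens - cached_input_tokens
    total = (fresh_input_tokens * INPUT_PRICE
             + cached_input_tokens * CACHED_INPUT_PRICE
             + output_tokens * OUTPUT_PRICE)
    return total / 1_000_000


# Example: an agent step with 200K input tokens (150K of them cached)
# and 10K output tokens costs about 0.16 yuan under these assumptions.
cost = estimate_cost_yuan(200_000, 10_000, cached_input_tokens=150_000)
print(f"Estimated cost: {cost:.2f} yuan")
```

For long agent sessions that resend a mostly unchanged context on every step, the gap between the cached and uncached input prices tends to dominate the total, which is why the cache price is worth tracking separately.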
