Tencent has officially released and open-sourced a new member of the Hunyuan large model family, the Hunyuan-A13B model. The model adopts a mixture-of-experts (MoE) architecture with 80 billion total parameters, of which 13 billion are activated per inference step. According to Tencent, it performs on par with top open-source models while significantly reducing inference latency and compute cost, giving individual developers and small and medium-sized enterprises a more cost-effective AI option.
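To illustrate why an MoE model can have far fewer activated parameters than total parameters, here is a minimal sketch of top-k expert routing. It is a generic toy, not Hunyuan-A13B's actual router: the expert count (8), the value of k (2), the scalar "experts", and the helper names `route_top_k` and `moe_forward` are all assumptions made for illustration. Only the 80 billion and 13 billion figures come from the article.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are renormalized so that the selected experts sum to 1.
    """
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, experts, router_logits, k):
    """Run only the selected experts and combine their outputs."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_logits, k))

# Toy setup (illustrative numbers): 8 scalar "experts", 2 active per token.
experts = [lambda x, s=s: s * x for s in range(1, 9)]
logits = [0.1, 2.0, -1.0, 0.5, 3.0, 0.0, -0.5, 1.0]
selected = route_top_k(logits, k=2)

# Parameter counts quoted in the article.
TOTAL_PARAMS = 80e9
ACTIVE_PARAMS = 13e9
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS  # share of weights used per step
print(f"Active share of parameters: {active_fraction:.2%}")
```

With the article's figures, roughly one sixth of the weights take part in each step, which is where the latency and compute savings come from.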
According to Tencent, in the most constrained setup the Hunyuan-A13B model can be deployed on a single mid-range GPU. Users can download the model from technical communities such as GitHub and Hugging Face, and the model API is already available on the Tencent Cloud official website. This low deployment threshold lets more developers access cutting-edge AI technology at low cost and helps bring innovative applications into production.
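A rough back-of-the-envelope estimate shows how weight precision affects the memory needed to hold the model. The article does not say what setup the single-GPU claim refers to; the precisions below are assumptions, and the estimate covers weights only, ignoring the KV cache, activations, and runtime overhead. It also assumes all 80 billion parameters stay resident in GPU memory, which is the usual case for MoE inference without offloading.

```python
# Total parameter count quoted in the article.
TOTAL_PARAMS = 80e9

# Assumed storage formats (not stated in the article).
BYTES_PER_PARAM = {"fp16/bf16": 2.0, "int8": 1.0, "int4": 0.5}

def weight_memory_gb(num_params, bytes_per_param):
    """Memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9

estimates = {
    name: weight_memory_gb(TOTAL_PARAMS, nbytes)
    for name, nbytes in BYTES_PER_PARAM.items()
}

for name, gb in estimates.items():
    print(f"{name}: ~{gb:.0f} GB of weights")
```

Under these assumptions the weights need about 160 GB at 16-bit precision, 80 GB at 8-bit, and 40 GB at 4-bit, so a single-card deployment would plausibly rely on aggressive quantization.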
In terms of performance, the Hunyuan-A13B model delivers leading results on mathematical, scientific, and logical reasoning tasks. In mathematical reasoning tests, for example, it compares decimal numbers correctly and shows its analysis step by step. The model can also call tools to answer complex instructions, such as producing travel guides or analyzing data files, which gives it a strong foundation for intelligent agent application development.
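The decimal-comparison task can be made concrete with a short sketch. The article does not give the actual test prompt, so the numbers 9.11 and 9.9 and the helper names `align` and `compare_decimals` are assumptions chosen for illustration. The code mirrors the step-by-step reasoning a correct answer needs: pad the fractional parts to the same length, then compare.

```python
from decimal import Decimal

def align(a: str, b: str):
    """Pad fractional parts with zeros so both numbers have the same width.

    Assumes both inputs contain a decimal point.
    """
    a_int, _, a_frac = a.partition(".")
    b_int, _, b_frac = b.partition(".")
    width = max(len(a_frac), len(b_frac))
    return (
        f"{a_int}.{a_frac.ljust(width, '0')}",
        f"{b_int}.{b_frac.ljust(width, '0')}",
    )

def compare_decimals(a: str, b: str) -> str:
    """Compare two decimal strings exactly, without float rounding."""
    x, y = Decimal(a), Decimal(b)
    if x > y:
        return f"{a} > {b}"
    if x < y:
        return f"{a} < {b}"
    return f"{a} = {b}"

print(align("9.11", "9.9"))             # step 1: same number of decimal places
print(compare_decimals("9.11", "9.9"))  # step 2: compare the aligned values
```

Once 9.9 is written as 9.90, the comparison against 9.11 becomes unambiguous, which is the intermediate step that weaker models tend to skip.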
On the technical side, the Hunyuan-A13B model was pre-trained on a corpus of 20 trillion high-quality web tokens, which raises the ceiling of its reasoning ability. Tencent also says it has refined the scaling-law framework for MoE architectures, providing quantifiable engineering guidance for model design. In addition, users can choose a thinking mode as needed: the fast thinking mode gives concise, efficient output, while the slow thinking mode works through deeper reasoning steps, letting users trade off efficiency against accuracy.
To further support the AI open-source ecosystem, Tencent has also open-sourced two new benchmark datasets. ArtifactsBench targets code evaluation and comprises 1,825 tasks. C3-Bench targets model evaluation in agent scenarios and contains 1,024 test cases designed to expose weaknesses in model capabilities.
The open-sourcing of the Hunyuan-A13B model is the latest result of Tencent's continued investment in AI. Tencent says the Hunyuan large model family will add more models of various sizes and characteristics, sharing practical technology with the community and helping the open-source ecosystem grow.
Try it: https://hunyuan.tencent.com/
Source code: https://github.com/Tencent-Hunyuan