A week after releasing the Kimi 2.7 Code large model, Moonshot AI tonight officially launched a high-speed version of it. The new version greatly increases output speed while retaining the core capabilities of the original model, with the aim of significantly improving developers' coding efficiency.

Extreme Speed Dramatically Reduces Waiting Time

According to official data, the high-speed version of Kimi 2.7 Code outputs 5 to 6 times faster than the regular version. In typical programming scenarios its output speed is about 180 tokens/s, and in certain short-context scenarios it can reach up to 260 tokens/s.

On pricing, the high-speed API costs only twice as much as the regular version, so a severalfold speedup comes at a comparatively small premium. Kimi Code Plan users can currently join the early access program to try it first in Kimi Code.
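
To make the trade-off concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted above (about 180 tokens/s, a 5x to 6x speedup, and a 2x price). The regular version's speed of roughly 30 to 36 tokens/s is not an official number; it is inferred by dividing 180 by the quoted speedup. The function names and the 3,600-token response size are illustrative assumptions.

```python
# Figures quoted in the article.
FAST_TOKENS_PER_SECOND = 180.0   # typical programming scenarios
SPEEDUP_RANGE = (5.0, 6.0)       # "5 to 6 times faster"
PRICE_MULTIPLIER = 2.0           # high-speed API price vs. regular


def generation_seconds(tokens: int, tokens_per_second: float) -> float:
    """Time to stream `tokens` output tokens at a constant rate."""
    return tokens / tokens_per_second


def implied_regular_speeds() -> tuple[float, float]:
    """Regular-version speed inferred from the quoted speedup (an assumption)."""
    low_speedup, high_speedup = SPEEDUP_RANGE
    # A larger speedup implies a slower regular version.
    return (FAST_TOKENS_PER_SECOND / high_speedup,
            FAST_TOKENS_PER_SECOND / low_speedup)


def compare(tokens: int) -> dict[str, float]:
    """Waiting time for one response on each version, plus the price ratio."""
    slow_tps, fast_regular_tps = implied_regular_speeds()
    return {
        "high_speed_seconds": generation_seconds(tokens, FAST_TOKENS_PER_SECOND),
        "regular_seconds_best": generation_seconds(tokens, fast_regular_tps),
        "regular_seconds_worst": generation_seconds(tokens, slow_tps),
        "price_multiplier": PRICE_MULTIPLIER,
    }


if __name__ == "__main__":
    # Hypothetical 3,600-token code response.
    for key, value in compare(3600).items():
        print(f"{key}: {value:g}")
```

Under these assumptions, a 3,600-token response takes about 20 seconds on the high-speed version versus roughly 100 to 120 seconds on the regular one, at twice the price. Real latency also depends on context length and server load, which this sketch ignores.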

Thinking Mode Is Key to Unlocking Performance

To get the best performance from the model, users must enable thinking mode when using the Kimi 2.7 Code series. The Kimi API and Kimi Code already enable it by default. If a user manually disables it, the API returns an error directly, and the system automatically falls back to K2.6.

As its computing capacity grows, Moonshot AI will widen the availability of the high-speed model. Starting in July this year, it is expected to open gradually to members at the Allegretto tier and above. At that point, its usage cost in Kimi Code Plan will be adjusted to three times that of the regular version.