Metasumo AI Search has officially launched its "Ultra-fast" model, which gives users a faster and more precise search experience.

The Metasumo AI Search team reached a response speed of up to 400 tokens/second on a single H800 GPU by applying kernel fusion on the GPU and dynamic compilation optimizations on the CPU. Most questions can be answered within 2 seconds.
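As a rough sanity check on how the two figures relate, the sketch below converts a token count into generation time at a fixed decoding rate. It assumes the peak rate of 400 tokens/second is sustained for the whole answer and ignores retrieval, time to first token, and network latency, none of which are quantified in the announcement. The function names are hypothetical and chosen only for illustration.

```python
def generation_time_seconds(num_tokens: int, tokens_per_second: float = 400.0) -> float:
    """Time needed to generate num_tokens at a steady decoding rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


def max_tokens_within(budget_seconds: float, tokens_per_second: float = 400.0) -> int:
    """Largest whole number of tokens that fits in a time budget at a steady rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return int(budget_seconds * tokens_per_second)


# Assumed default: the 400 tokens/second peak figure quoted above.
print(generation_time_seconds(800))  # 2.0
print(max_tokens_within(2.0))        # 800
```

Under these assumptions, a 2-second answer corresponds to at most about 800 generated tokens; real answers would be shorter once search and other overheads are included.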

To let users experience the new model's speed first-hand, Metasumo AI Search has set up a speed test site (kuai.metaso.cn), where users can enter questions at any time and see the response speed for themselves.

With its speed, accuracy, and logical coherence, the "Ultra-fast" model is expected to spur a new round of technical innovation in AI search and to give users higher-quality search results.