Metasumo AI Search has officially launched its "Ultra-fast" model, which is intended to give users a faster and more precise search experience.
The Metasumo AI Search team reports a response speed of up to 400 tokens/second on a single H800 GPU, achieved by combining kernel fusion on the GPU with dynamic compilation optimizations on the CPU. According to the team, most questions can be answered within 2 seconds.
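As a rough back-of-the-envelope check on these figures, the sketch below estimates how long an answer takes to generate at a given decoding speed. It assumes the reported 400 tokens/second is a sustained generation rate and ignores retrieval, network latency, and time to first token. The function names `generation_seconds` and `max_tokens_within` and the example answer lengths are illustrative and do not come from the Metasumo team.

```python
def generation_seconds(num_tokens: int, tokens_per_second: float = 400.0) -> float:
    """Time to generate num_tokens at a constant decoding rate (an assumption)."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


def max_tokens_within(budget_seconds: float, tokens_per_second: float = 400.0) -> int:
    """Largest whole number of tokens that fits in the time budget at that rate."""
    return int(budget_seconds * tokens_per_second)


if __name__ == "__main__":
    # Hypothetical answer lengths, chosen only for illustration.
    for n in (200, 400, 800):
        print(f"{n} tokens -> {generation_seconds(n):.2f} s")
    print("tokens in 2 s:", max_tokens_within(2.0))
```

Under these assumptions, a 2-second budget corresponds to an answer of at most about 800 tokens, so the two reported figures are consistent for answers of that length or shorter.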
To let users experience the speed first-hand, Metasumo AI Search has set up a speed-test site (kuai.metaso.cn), where users can enter questions at any time and try the new model's response speed for themselves.
With its speed, accuracy, and logical coherence, the newly launched "Ultra-fast" model is expected to spur a new round of technological innovation in AI search and to give users a higher-quality search service.