Shengshu Technology, a leading company in multimodal AI, recently announced the completion of a Series A financing round worth hundreds of millions of RMB. The round was led by Boc Capital, with continued participation from existing investors including Baidu's strategic investment arm and the Beijing Artificial Intelligence Industry Investment Fund, a sign of strong market recognition of the company. Shengshu Technology plans to use the funding to advance model development and technological innovation, explore the potential of multimodal large models, and accelerate product expansion and user services.

Multimodal technology, particularly video generation, is developing rapidly. Shengshu Technology's financing representative said that multimodal generation is expected to change how digital content is produced worldwide within the next three years and to gradually penetrate a range of industries. Against this backdrop, Vidu, Shengshu Technology's large video model launched in 2023, has performed well: it reached more than USD 20 million in annual recurring revenue within just eight months and has generated more than 400 million videos worldwide.

Vidu's success is reflected not only in its revenue but also in its wide range of commercial applications. Shengshu Technology has established partnerships with well-known enterprises such as JD.com and Amazon, covering industry scenarios including advertising, e-commerce, film and television promotion, and animation production. These partnerships both validate Shengshu Technology's technical capabilities and mark the growing maturity of video generation in commercial use.

As the technology develops, video generation is increasingly seen as one of the most challenging areas in multimodal AI. According to Shengshu Technology's financing representative, video generation capabilities are expected to keep improving in the coming years, moving toward high controllability, high consistency, and long context. In addition, real-time generation and editing will make video generation more flexible and efficient.

Across the industry as a whole, falling GPU prices and advances in domestic computing power are expected to significantly reduce the cost of video generation, accelerating commercial adoption at the enterprise level. However, even as the industry develops rapidly, it faces challenges such as copyright management and the regulation of false information. Enterprises need to take early steps on compliance and content identification.