Panasonic Holdings Corporation (Panasonic HD), in collaboration with researchers from Panasonic R&D Company of America (PRDCA) and the University of California, Los Angeles (UCLA), has developed a multimodal generative AI named "OmniFlow." Its standout feature is "anything-to-anything" generation: it can convert in any direction between text, images, and audio, which broadens the range of applications for multimodal generative AI.

In recent years, research on multimodal generative AI, especially models that incorporate audio, has attracted increasing attention. However, conventional methods are limited by data acquisition: training a single model on text, image, and audio data simultaneously requires far more training data, and costs rise accordingly. To address this challenge, OmniFlow flexibly combines generative AI models specialized for different data pairings (such as text-audio and text-image). This allows it to learn a high-accuracy "anything-to-anything" model even from a small amount of data, significantly reducing data collection costs.

OmniFlow's technological innovation has received international recognition and will be presented at the 2025 Conference on Computer Vision and Pattern Recognition (CVPR). The core of the technology is its ability to connect and process the features of three different types of data, learning more complex relationships among them rather than simply averaging the inputs. This approach allows OmniFlow not only to retain the features of each modality during generation but also to enhance its expressive power.
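
To make the contrast with averaging concrete, here is a minimal sketch. It is an illustration only and not OmniFlow's actual architecture: the article does not describe the mechanism, so plain self-attention over concatenated tokens is used as an assumed stand-in for "connecting" the modalities. The function names (`average_fusion`, `joint_attention`) and the toy feature values are hypothetical.

```python
import math


def average_fusion(modalities):
    """Baseline: collapse every token of every modality into one mean vector."""
    tokens = [tok for seq in modalities.values() for tok in seq]
    dim = len(tokens[0])
    return [sum(tok[i] for tok in tokens) / len(tokens) for i in range(dim)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def joint_attention(modalities):
    """Assumed stand-in: self-attention over the concatenated token sequence.

    Uses identity query/key/value projections for brevity. Every token attends
    to tokens from all modalities, yet the output keeps one vector per input
    token, grouped by the modality it came from.
    """
    names, tokens = [], []
    for name, seq in modalities.items():
        for tok in seq:
            names.append(name)
            tokens.append(tok)
    scale = math.sqrt(len(tokens[0]))
    outputs = {name: [] for name in modalities}
    for name, query in zip(names, tokens):
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in tokens]
        weights = softmax(scores)
        mixed = [
            sum(w * tok[i] for w, tok in zip(weights, tokens))
            for i in range(len(query))
        ]
        outputs[name].append(mixed)
    return outputs


# Toy 2-dimensional features (hypothetical values, not real embeddings).
features = {
    "text": [[1.0, 0.0], [0.5, 0.5]],
    "image": [[0.0, 1.0]],
    "audio": [[-1.0, 0.5]],
}

fused = average_fusion(features)      # one vector; modality identity is lost
attended = joint_attention(features)  # per-modality token vectors are kept
```

In this sketch, averaging returns a single vector regardless of how many modalities went in, while the attention variant returns a separate set of vectors for text, image, and audio, each informed by the other two.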

In evaluation experiments, OmniFlow outperformed conventional methods on tasks such as text-to-image and text-to-audio generation. The results also showed that, compared with other "anything-to-anything" generation methods, the volume of training data OmniFlow requires can be reduced to 1/60, an advantage that sets it apart in the field of multimodal AI.

Looking ahead, OmniFlow is expected to be applied in settings such as factories and everyday life, generating specialized data tailored to specific scenarios. Panasonic Holdings will continue to promote the adoption of AI in society, striving to develop technologies that benefit customers in their lives and work.