Kandinsky Deforum is a text-to-video generation model that extends the Kandinsky text-to-image model with the animation approach of Deforum. It works in three steps: it generates a reference frame, applies small changes to the previous frame, and then runs an image-to-image diffusion pass on the resulting image. Its stated advantage is the ability to generate high-quality video while remaining scalable and flexible. The product is positioned as an efficient, fast, and accurate text-to-video generator for users.
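
The frame loop described above (reference frame, small change to the previous frame, image-to-image pass) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the actual Kandinsky Deforum code: the function `animate` and the three `toy_*` helpers are hypothetical names, frames are small grids of numbers instead of real images, and the diffusion step is replaced by a placeholder that copies its input.

```python
from typing import Callable, List

Frame = List[List[float]]  # stand-in for an image: a 2D grid of pixel values


def animate(
    prompt: str,
    num_frames: int,
    text_to_image: Callable[[str], Frame],
    warp: Callable[[Frame], Frame],
    image_to_image: Callable[[str, Frame], Frame],
) -> List[Frame]:
    """Hypothetical Deforum-style loop over three pluggable steps."""
    if num_frames < 1:
        return []
    # Step 1: generate the reference frame from the text prompt.
    frames = [text_to_image(prompt)]
    for _ in range(num_frames - 1):
        # Step 2: apply a small change to the previous frame.
        changed = warp(frames[-1])
        # Step 3: refine the changed frame with an image-to-image pass.
        frames.append(image_to_image(prompt, changed))
    return frames


# Toy stand-ins (assumptions) so the sketch runs without any model weights.
def toy_text_to_image(prompt: str) -> Frame:
    size = 4
    return [
        [float((r * size + c + len(prompt)) % 10) for c in range(size)]
        for r in range(size)
    ]


def toy_warp(frame: Frame) -> Frame:
    # Shift every row one pixel to the right, wrapping around.
    return [row[-1:] + row[:-1] for row in frame]


def toy_image_to_image(prompt: str, frame: Frame) -> Frame:
    # Placeholder for a diffusion pass: returns an unchanged copy.
    return [row[:] for row in frame]


frames = animate(
    "a forest at dawn", 5, toy_text_to_image, toy_warp, toy_image_to_image
)
print(len(frames))
```

In a real pipeline the three callables would presumably wrap a text-to-image model, a geometric transform such as zoom or rotation, and an image-to-image diffusion call. Passing them in as parameters keeps the loop independent of any particular model.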