NVIDIA has introduced ConsiStory, a new text-to-image model designed to address the challenge of keeping a subject consistent across generated images. Using modules such as subject-driven shared attention (SDSA) and feature injection, the model achieves both subject consistency and detail consistency while avoiding the training costs associated with traditional methods. The launch of ConsiStory opens up new possibilities for the advancement of text-to-image models.