Over the past two years, Text-to-Image (T2I) models have evolved rapidly, achieving high-quality, diverse, and creative image generation. Inspired by DALL-E 3, researchers have introduced the Interactive Text-to-Image (iT2I) task, in which users interact with a large language model in natural language to obtain both high-quality image generation and question answering. They took a straightforward approach, using prompting techniques and off-the-shelf T2I models to extend large language models to iT2I without any additional training. This work offers new insights into human-computer interaction and matters for improving the image quality of next-generation T2I models. Concurrently, Microsoft Bing Chat has integrated OpenAI's new image generation tool, DALL-E 3, to provide stronger image generation capabilities.
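
The training-free approach described above, in which a prompted language model delegates image requests to an off-the-shelf T2I model, can be sketched roughly as follows. This is a minimal illustration, not the cited authors' implementation: the `<image>...</image>` tag convention, the wording of `SYSTEM_PROMPT`, and the `llm` and `t2i` callables are all assumptions standing in for whatever language model and T2I backend are actually used.

```python
import re
from typing import Callable, List, Tuple

# Assumed convention: the LLM is prompted to wrap every image description
# in <image>...</image>. The tag name is hypothetical, chosen for illustration.
IMAGE_TAG = re.compile(r"<image>(.*?)</image>", re.DOTALL)

# Hypothetical system prompt; no model weights are changed, only the prompt.
SYSTEM_PROMPT = (
    "You can answer questions and request images. When an image is needed, "
    "write its description inside <image>...</image>; otherwise reply in "
    "plain text."
)


def it2i_turn(
    user_message: str,
    llm: Callable[[str, str], str],   # (system_prompt, user_message) -> reply
    t2i: Callable[[str], object],     # image description -> generated image
) -> Tuple[str, List[object]]:
    """Run one interactive turn: query the LLM, then render any tagged prompts."""
    reply = llm(SYSTEM_PROMPT, user_message)
    # Extract every tagged description and pass it to the T2I model.
    prompts = [p.strip() for p in IMAGE_TAG.findall(reply)]
    images = [t2i(p) for p in prompts]
    # Replace the raw tags with a placeholder in the text shown to the user.
    text = IMAGE_TAG.sub("[image]", reply).strip()
    return text, images
```

Because both models are passed in as plain callables, the same loop works with any chat model and any T2I backend; a turn that needs no image simply returns an empty image list.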