CustomVideo is a novel framework for generating identity-preserving videos guided by multiple subjects. It first encourages the co-occurrence of multiple subjects, then applies a simple yet effective attention control strategy, built on a text-to-video diffusion model, to disentangle the subjects in the diffusion model's latent space. To help the model focus on the relevant object regions, the framework segments each object from the provided reference images and supplies the corresponding object mask for attention learning. The authors also curate a multi-subject text-to-video generation dataset as a comprehensive benchmark, comprising 69 individual subjects and 57 meaningful pairs. Extensive qualitative, quantitative, and user study results show that CustomVideo significantly outperforms previous state-of-the-art methods.
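To make the idea of using object masks for attention learning more concrete, here is a minimal sketch of one way such a signal could be computed. It is an illustration under stated assumptions, not the loss used by CustomVideo: the function names `outside_mask_fraction` and `multi_object_attention_loss`, the 4x4 toy grid, and the choice of penalty (the share of attention mass that falls outside an object's mask) are all hypothetical.

```python
# Illustrative sketch only: the penalty below is an assumed stand-in,
# not the attention objective defined by the CustomVideo authors.


def outside_mask_fraction(attn_map, object_mask, eps=1e-8):
    """Return the share of attention mass lying outside the object mask.

    attn_map and object_mask are 2D lists of equal shape. The mask holds
    1.0 inside the object region and 0.0 elsewhere. A value near 0 means
    the attention is concentrated on the object.
    """
    same_shape = len(attn_map) == len(object_mask) and all(
        len(a_row) == len(m_row) for a_row, m_row in zip(attn_map, object_mask)
    )
    if not same_shape:
        raise ValueError("attention map and mask must have the same shape")

    total = sum(sum(row) for row in attn_map)
    inside = sum(
        a * m
        for a_row, m_row in zip(attn_map, object_mask)
        for a, m in zip(a_row, m_row)
    )
    return 1.0 - inside / (total + eps)


def multi_object_attention_loss(attn_maps, object_masks):
    """Average the outside-mask penalty over all reference objects."""
    if len(attn_maps) != len(object_masks) or not attn_maps:
        raise ValueError("need one non-empty attention map per object mask")
    penalties = [
        outside_mask_fraction(a, m) for a, m in zip(attn_maps, object_masks)
    ]
    return sum(penalties) / len(penalties)


# Toy 4x4 grid: one object occupies the left half, the other the right half.
mask_left = [[1.0, 1.0, 0.0, 0.0] for _ in range(4)]
mask_right = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
masks = [mask_left, mask_right]

# Attention that stays inside each object's own region.
attn_separated = [[[v / 8 for v in row] for row in mask] for mask in masks]

# Attention spread uniformly over the grid, so the objects are entangled.
attn_mixed = [[[1 / 16] * 4 for _ in range(4)] for _ in masks]

print(multi_object_attention_loss(attn_separated, masks))
print(multi_object_attention_loss(attn_mixed, masks))
```

In this toy setup the separated attention maps give a penalty close to 0, while the uniform maps give a penalty close to 0.5, since half of each map's mass lies outside its mask. In a real diffusion model the maps would come from the cross-attention layers and the penalty would have to be differentiable in the training framework; both are outside the scope of this sketch.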