Emu is a text-to-image generation model that improves aesthetic quality through "quality-tuning": fine-tuning a strongly pre-trained model on a small set of exceptionally high-quality images. Emu is a latent diffusion model pre-trained on 1.1 billion image-text pairs and then fine-tuned with only a few thousand carefully selected high-quality images. Against its pre-trained-only counterpart, Emu achieves an 82.9% win rate. On visual appeal, Emu is preferred over the state-of-the-art SDXLv1.0 68.4% and 71.3% of the time on the PartiPrompts and Open User Input benchmarks, respectively. Quality-tuning also generalizes to other architectures, including pixel diffusion and masked generative transformer models.
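
Since quality-tuning is essentially standard diffusion fine-tuning restricted to a tiny curated dataset, a minimal sketch of one training step is shown below. It uses the diffusers library with a Stable Diffusion checkpoint as a stand-in base model; the model id, learning rate, and `training_step` helper are illustrative assumptions, not details from the Emu paper.

```python
# A minimal sketch of "quality-tuning": fine-tuning a pre-trained latent
# diffusion model on a small, hand-curated set of high-quality image-text
# pairs. Hyperparameters here are illustrative, not the Emu recipe.
import torch
import torch.nn.functional as F
from diffusers import AutoencoderKL, UNet2DConditionModel, DDPMScheduler
from transformers import CLIPTextModel, CLIPTokenizer

model_id = "runwayml/stable-diffusion-v1-5"  # stand-in base model, not Emu
vae = AutoencoderKL.from_pretrained(model_id, subfolder="vae")
unet = UNet2DConditionModel.from_pretrained(model_id, subfolder="unet")
text_encoder = CLIPTextModel.from_pretrained(model_id, subfolder="text_encoder")
tokenizer = CLIPTokenizer.from_pretrained(model_id, subfolder="tokenizer")
noise_scheduler = DDPMScheduler.from_pretrained(model_id, subfolder="scheduler")

vae.requires_grad_(False)           # only the denoising UNet is updated
text_encoder.requires_grad_(False)
# Small learning rate: the goal is to nudge the model toward the aesthetic
# of the curated set, not to relearn content from scratch.
optimizer = torch.optim.AdamW(unet.parameters(), lr=1e-5)

def training_step(pixel_values, captions):
    # Encode images into the latent space of the frozen VAE.
    latents = vae.encode(pixel_values).latent_dist.sample()
    latents = latents * vae.config.scaling_factor

    # Sample noise and a random timestep, then noise the latents.
    noise = torch.randn_like(latents)
    timesteps = torch.randint(
        0, noise_scheduler.config.num_train_timesteps,
        (latents.shape[0],), device=latents.device,
    )
    noisy_latents = noise_scheduler.add_noise(latents, noise, timesteps)

    # Condition on caption embeddings from the frozen text encoder.
    tokens = tokenizer(captions, padding="max_length", truncation=True,
                       max_length=tokenizer.model_max_length,
                       return_tensors="pt")
    encoder_hidden_states = text_encoder(tokens.input_ids)[0]

    # Standard denoising objective; only the tiny curated set drives updates.
    pred = unet(noisy_latents, timesteps, encoder_hidden_states).sample
    loss = F.mse_loss(pred, noise)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The key design choice this sketch reflects is that nothing about the objective changes at fine-tuning time: the quality gain comes entirely from the data. Because the curated set is so small, a low learning rate and early stopping matter in practice to avoid overfitting and eroding the knowledge acquired during pre-training.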