A research team from Shanghai Jiao Tong University recently introduced Gen3DHF, a new dataset for evaluating the quality of AI-generated 3D faces. Rapid progress in generative artificial intelligence has made it possible to generate 3D faces, which have a wide range of applications, especially in virtual reality. However, assessing the quality and realism of these faces remains a significant challenge, because human perception of facial features is both highly sensitive and subjective.
Gen3DHF is a large-scale benchmark dataset comprising 2000 AI-generated 3D face videos, 4000 mean opinion scores (MOS) collected along two dimensions (quality and realism), 2000 distortion perception saliency maps, and accompanying distortion descriptions. It gives researchers a human-annotated reference against which objective quality metrics for AI-generated content can be developed and tested.
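To make the annotation counts concrete, here is a minimal sketch of how one annotated sample might be represented and how a mean opinion score is conventionally obtained by averaging individual ratings. The record layout, field names, and helper functions are hypothetical and are not taken from the dataset's actual schema; the sketch also assumes that the 4000 MOS values correspond to one quality score and one realism score per video, which the article implies but does not state outright.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Gen3DHFRecord:
    """Hypothetical record layout; field names are illustrative only."""
    video_id: str
    quality_mos: float            # MOS on the quality dimension
    realism_mos: float            # MOS on the realism dimension
    saliency_map_path: str        # distortion perception saliency map
    distortion_description: str   # free-text description of distortions


def mean_opinion_score(ratings: list[float]) -> float:
    """Average the individual subjective ratings for one video on one dimension."""
    if not ratings:
        raise ValueError("at least one rating is required")
    return mean(ratings)


def count_mos(records: list[Gen3DHFRecord]) -> int:
    """Each record carries two MOS values, one per dimension (assumed)."""
    return 2 * len(records)
```

Under that assumption, 2000 records yield 2 × 2000 = 4000 MOS values, matching the figure reported for the dataset. Real subjective studies often screen out unreliable raters or normalize scores before averaging; the article does not describe that step, so it is left out here.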
Building on this dataset, the research team also proposed LMME3DHF, a 3D face evaluation metric based on a large multimodal model. It predicts quality and realism scores and also performs distortion perception visual question answering (VQA) and saliency prediction. According to the reported experiments, LMME3DHF achieves state-of-the-art accuracy, surpassing existing methods and agreeing closely with human perceptual judgments.
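The article does not say which statistics were used to measure consistency with human judgments. In quality assessment work, agreement between predicted scores and MOS is commonly summarized with the Spearman rank correlation (SRCC) and the Pearson linear correlation (PLCC), so the following is a minimal sketch under that assumption. The function names are illustrative and nothing here is taken from the LMME3DHF code.

```python
from math import sqrt


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson linear correlation coefficient (PLCC) between two score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(vx * vy)


def average_ranks(values: list[float]) -> list[float]:
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x: list[float], y: list[float]) -> float:
    """Spearman rank correlation (SRCC): Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

In use, `spearman(predicted_scores, mos_values)` would be computed separately for the quality and realism dimensions. Values close to 1 mean the metric orders the videos almost the same way human raters do.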
The team pointed out that although 3D face generation has improved significantly, perceptual distortions and unrealistic artifacts persist, and the results still fall short of human quality expectations. Human evaluation provides important insights, but it is costly and inefficient, which makes an objective quality metric essential.
The release of the Gen3DHF dataset fills a gap in existing methods for evaluating AI-generated 3D faces, particularly with respect to the distinctive nature of facial distortions. By evaluating a diverse set of 3D face video samples, the research team has made significant progress in assessing both quality and realism. This helps improve the credibility of generation technology and is expected to support the development of virtual reality and related fields.
Paper: https://arxiv.org/pdf/2504.20466
Key Points:
🌟 The Gen3DHF dataset contains 2000 AI-generated 3D face videos, providing a foundation for quality assessment.
🤖 The LMME3DHF evaluation metric performs excellently in distortion perception and realism prediction, surpassing existing methods.
🔍 The study aims to fill the gap in the evaluation of AI-generated 3D faces, enhancing the reliability of the technology.