Recently, a research team from the University of Zurich, the University of Amsterdam, Duke University, and New York University released a study revealing the shortcomings of large language models in imitating human-written social media posts. The study shows that AI-generated posts on major social platforms can be told apart from human-written ones with an accuracy of 70% to 80%, far higher than random guessing.

The researchers tested nine large language models, including Apertus, DeepSeek, Gemma, Llama, Mistral, and Qwen, analyzing their performance on Bluesky, Reddit, and X. The results showed significant differences in "toxicity scores" between AI-generated and human-written content, which became an important factor in telling the two apart. In other words, a particularly sharp or humorous comment under a post is very likely to have been written by a human.
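To make the role of a toxicity score concrete, here is a minimal Python sketch of how a single score could separate the two kinds of posts. It is not the study's method: the function name `accuracy_at_threshold`, the threshold of 0.25, and all scores and labels are made up for illustration, and the accuracy it prints says nothing about the study's actual results.

```python
def accuracy_at_threshold(scores, labels, threshold):
    """Return the fraction of posts classified correctly when every post
    scoring above the threshold is guessed to be human-written."""
    correct = sum(
        (score > threshold) == is_human
        for score, is_human in zip(scores, labels)
    )
    return correct / len(scores)


# Hypothetical toxicity scores in [0, 1]; True marks a human-written post.
# These values are invented for illustration, not taken from the study.
scores = [0.05, 0.10, 0.08, 0.30, 0.12, 0.45, 0.60, 0.07, 0.52, 0.38]
labels = [False, False, False, False, False, True, True, True, True, True]

acc = accuracy_at_threshold(scores, labels, threshold=0.25)
print(f"accuracy on the toy data: {acc:.0%} (random guessing would be about 50%)")
```

With equal numbers of human and AI posts, random guessing lands near 50%, which is the baseline against which an accuracy of 70% to 80% counts as far higher.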

The study points out that although large language models can imitate the form of online conversations, they struggle to capture emotional expression. Spontaneous, emotionally rich expression is characteristic of human social interaction, and AI performs much worse than humans in this respect. The study also found that AI models performed particularly poorly in certain situations, such as expressing positive emotions on Elon Musk's X or discussing politics on Reddit.

Overall, the tested models were best at imitating posts on X and slightly worse on Bluesky, while Reddit was the most challenging of the three because its conversational norms are more complex. The study also found that some models that had not been instruction-tuned performed better, suggesting that excessive training may make a model's style too uniform and its output more mechanical.

With this study, the researchers highlight the limitations of AI in emotional expression and note that future applications of AI on social media will still require continued improvement in its emotional intelligence.

Key points:

🌟 The study shows that AI-generated social media content can be identified with 70%-80% accuracy.  

🤖 Large language models have obvious shortcomings in emotional expression; spontaneous emotional interaction remains unique to humans.  

📊 Models that were not instruction-tuned performed better in the tests; excessive calibration training may lead to mechanical content.