A recent research paper from Pennsylvania State University, "Mind Your Tone," reveals a counterintuitive phenomenon: addressing large language models in a direct or even rude tone may yield more accurate answers than polite phrasing. The study is the first to systematically measure how the tone of a question affects AI model performance.

The research team constructed a test set of 50 medium-difficulty multiple-choice questions spanning fields such as mathematics, science, and history. For each question, the researchers wrote five tone variants, ranging from very polite ("Could you kindly help me solve this problem?") through neutral ("Please answer this question") and terse ("Just give the answer") to outright rude ("If you're not stupid, answer this" and "You're useless, can you solve this?").

The test subject was OpenAI's GPT-4o model. To keep each trial independent, the researchers instructed the model to disregard prior conversation context and to output only the letter of its chosen option. The results showed that rude prompts pushed GPT-4o's accuracy to 84.8%, while very polite phrasing lowered it to 80.8%, a gap of four percentage points.
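
For illustration only, a minimal sketch of this protocol might look like the following Python loop. The tone phrasings, the sample question, and the answer-extraction rule are assumptions reconstructed from the article's description, not the authors' released code; the API call is the standard OpenAI chat-completions interface.

```python
# Hypothetical reconstruction of the tone-vs-accuracy experiment.
# Tone prefixes, questions, and scoring are assumptions based on
# the article, not the study's actual materials.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Five tone variants, paraphrased from the article's examples.
TONES = {
    "very_polite": "Could you kindly help me solve this problem?\n",
    "neutral":     "Please answer this question.\n",
    "terse":       "Just give the answer.\n",
    "rude":        "If you're not stupid, answer this:\n",
    "very_rude":   "You're useless, can you solve this?\n",
}

# Each item: (question text with lettered options, correct letter).
QUESTIONS = [
    ("What is 7 * 8?\nA) 54  B) 56  C) 58  D) 64", "B"),
    # ... the real study used 50 medium-difficulty MCQs
]

def ask(prompt: str) -> str:
    """Send one prompt; request only the option letter back."""
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            # Mirrors the paper's instruction to ignore prior context
            # and answer with the option letter alone.
            {"role": "system", "content": "Ignore any previous conversation. "
             "Reply with only the letter of the correct option."},
            {"role": "user", "content": prompt},
        ],
        temperature=0,
    )
    return resp.choices[0].message.content.strip()[:1].upper()

for tone, prefix in TONES.items():
    correct = sum(ask(prefix + q) == answer for q, answer in QUESTIONS)
    print(f"{tone}: {correct / len(QUESTIONS):.1%}")
```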

The research team's explanation is that overly polite prompts tend to carry filler and descriptive language that is irrelevant to the core question and interferes with the model's extraction of key information. Direct, command-style prompts, though less courteous, let the model focus on the question itself, reducing noise during processing.
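
To make the information-density point concrete, one could compare the token counts of a polite and a terse framing of the same question. This is an illustrative sketch, not part of the study; the example prompts are invented.

```python
# Illustration of the "noise" argument: the polite framing spends
# most of its tokens on content unrelated to the question. Example
# prompts are invented; o200k_base is the tokenizer GPT-4o uses.
import tiktoken

enc = tiktoken.get_encoding("o200k_base")

polite = ("Hello! I hope you're having a wonderful day. If it's not too "
          "much trouble, could you kindly help me with this question? "
          "What is the capital of France?")
terse = "What is the capital of France?"

print(len(enc.encode(polite)))  # many tokens, few about the question
print(len(enc.encode(terse)))   # almost every token is task-relevant
```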

Notably, this pattern does not hold for all AI models. Comparative tests on earlier models such as GPT-3.5 and Llama2-70B showed the opposite: those models responded better to polite questions, and a rude tone degraded response quality. The researchers speculate that newer models, having been exposed to more diverse tonal data during training, are better at filtering out irrelevant cues, allowing them to maintain or even improve performance in impolite contexts.

While the experimental results offer an interesting technical insight, in everyday practice users should still tailor their interaction style to the specific model and the needs of the scenario. The broader significance of the study is its reminder to developers and users alike that prompt design is not about politeness alone; information density and instruction clarity matter just as much.