OpenAI recently rolled out a major upgrade to ChatGPT's voice feature, particularly for subscribers, aiming to make the AI's speech more natural and emotionally expressive. According to OpenAI, the updated "Advanced Voice Mode" now produces more fluent and emotionally nuanced speech, with improvements in intonation, pauses, and the expression of empathy or sarcasm.

[Image: audio sound waves, intelligent voice. The image is AI-generated; image licensing provided by Midjourney.]

The update also introduces a real-time translation feature. Users can select a specific language pair and ask ChatGPT to translate, and it will keep translating between the two parties until the user tells it to stop. This is particularly useful for ordering in a restaurant or working in multilingual settings.

Paid users only need to click the language icon in the chat interface to try these voice improvements on all platforms. However, OpenAI also noted some known issues. Users may occasionally notice degraded audio quality, such as sudden changes in pitch or volume, which can be more pronounced with certain voices. In addition, so-called "hallucinations" still occur: ChatGPT sometimes produces strange sounds for no apparent reason, such as ad snippets, random noise, or even background music. Some users recently reported that ChatGPT suddenly played an advertisement during a conversation, even though OpenAI does not run ads.

OpenAI first unveiled "Advanced Voice Mode" in May 2024 and expanded its availability to the EU in October 2024. The feature is meant to enable natural, real-time interaction with the AI, including interrupting it mid-conversation and emotional expression. If the user turns on the camera, ChatGPT can also comment in real time on surrounding objects or the environment. Similar features are available in Google's Gemini app.

Key Points:  

🌟 OpenAI has upgraded ChatGPT's voice feature, making its speech more natural, fluent, and emotionally expressive.  

🌍 A new real-time translation feature lets users select a language pair for continuous translation, which suits multilingual scenarios.  

⚠️ Known issues remain, including fluctuating audio quality and strange sounds generated for no apparent reason.