Microsoft is pressing ahead with its artificial intelligence strategy, making Copilot a core component of Windows 11 and launching a series of generative AI features meant to fundamentally change how people use their PCs. The new features include voice control, on-screen content analysis, and limited local automation agents. The stated goal is to make Copilot the "primary way" users interact with their PCs.
"Hey, Copilot": Voice control becomes mainstream
Users can now activate Copilot on Windows PCs by saying "Hey, Copilot". The feature is optional; when Copilot is activated, a microphone icon appears and an audio signal plays. Microsoft claims that users who interact with Copilot by voice engage with it twice as often as those who type. A session ends automatically or when the user gives the "goodbye" command.
For easier access, Copilot will be added to the Windows taskbar, with a new button for its voice and vision tools, and users will be able to reach Windows settings through voice commands. Microsoft also plans to offer a text-only option for users who prefer not to speak.
Copilot Vision: Real-time screen analysis and contextual help
The "Copilot Vision" feature is now available globally on all devices that support Copilot. It analyzes what is on the screen and offers contextual help while users edit photos, play games, or work in Office. Microsoft states that Copilot Vision can process and analyze entire documents in Word, Excel, and PowerPoint.
"Copilot Actions": Limited autonomous local agents
Microsoft has also launched an experimental feature called "Copilot Actions", which lets Copilot handle some simple tasks on the local computer, such as searching PDFs or organizing photos. The feature will first be released as a preview in Copilot Labs for Windows Insider Program users. Users can watch what Copilot is doing at any time and intervene when necessary.
Microsoft acknowledges that these agent-based features are currently limited, may make errors, and cannot reliably control complex software. Given studies showing that agent-based AI systems may pose security risks, Microsoft says that user testing will be key to improving reliability.
Collaboration with Chinese startup Manus: System-level local file access
Microsoft has also integrated an agent from the Chinese startup Manus. According to reports, the Manus agent runs on an Anthropic model and is built into Windows File Explorer, where users can create a website from local files with a right-click. Manus uses the Model Context Protocol (MCP), designed by Anthropic, to gain system-wide access to local content.
Expansion of online services and gaming applications
Copilot Connectors allow users to extract data from connected services such as OneDrive, Outlook, Gmail, or Google Drive. Users can use prompts like "Find my dentist appointment details" to search for appointments, contacts, or documents and import the results directly into Word, Excel, or PowerPoint.
In gaming, "Gaming Copilot" is now available on the ROG Xbox Ally, Asus' Windows handheld. Users can press a button to activate it and, without pausing the game, get help with navigation or explanations of game mechanics in titles such as Minecraft.
Most of the new features run on any Windows 11 device that supports Copilot, but Microsoft recommends Copilot+ PCs for faster processing and for running AI tasks locally. Only specific "Click to Do" Zoom integrations explicitly require a Copilot+ PC and participation in the Windows Insider Program.