Microsoft Open-Sources OmniParser V2: Enabling Large Language Models to 'Understand' and Operate GUIs
Recently, Microsoft released OmniParser-v2.0, an upgraded version of OmniParser, its model for operating the Windows desktop. The model can recognize desktop and window elements and interact with them, a significant step for AI Agent technology toward fully automated computer use. Its key capability is perceiving and interacting with desktop environments. This means that, in conjunction with this model, AI Agents can...