Salesforce and the University of Southern California Launch CoAct-1: A Hybrid Approach of Code + GUI to Take AI Agent Automation to New Heights

AIbase基地

Published in AI News · 7 min read · Aug 13, 2025

Salesforce, in collaboration with researchers from the University of Southern California, has developed a system called CoAct-1. It aims to significantly improve the ability of AI agents to perform complex tasks on computers by combining the advantages of coding and graphical user interface (GUI) operations. This hybrid approach is designed to overcome the brittleness of traditional GUI agents, paving the way for more powerful and scalable automation.


Pain Points of Traditional AI Agents: Long Tasks and Misclicks

Current computer-use AI agents typically rely on vision-language models (VLMs) to perceive the screen and simulate mouse and keyboard operations. While these "click-based" agents can perform a variety of tasks, they often struggle with applications that have dense menus and complex workflows, such as office productivity suites. Researchers point out that in these scenarios a single misclick or a misread UI element can cause an entire task to fail.

Researchers have tried to address this by pairing GUI agents with advanced planners, but such agents still have to click through operations that a few lines of code could complete more directly and reliably.

CoAct-1: A Hybrid System Based on Multi-Agent Collaboration

To overcome these limitations, the CoAct-1 system was developed. Its core idea is to "combine the intuitive benefits of GUI operations with the precision, reliability, and efficiency of direct system interaction through code." The system consists of a team of three specialized agents working together to complete tasks:

Orchestrator: As the central planner, it is responsible for breaking down the user's overall goal into subtasks and assigning them to the most suitable agent.
Programmer: Responsible for writing and executing Python or Bash scripts, handling backend operations such as file management or data processing.
GUI Operator: Built on a VLM, it specializes in front-end tasks that require clicking buttons or navigating interfaces.

This dynamic delegation mechanism allows CoAct-1 to strategically bypass inefficient GUI operations in favor of more robust and efficient code execution, while still using visual interaction where it is necessary. The entire workflow is iterative: each agent reports back to the Orchestrator after completing a subtask, and the Orchestrator then determines the next step.
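The delegate-and-report loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not CoAct-1's actual implementation: the names `Subtask`, `Report`, `Orchestrator`, `programmer`, and `gui_operator` are hypothetical, the agents are stubs that execute nothing, and the plan is a fixed list, whereas the article describes the Orchestrator as deciding the next step after each report.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Subtask:
    description: str
    agent: str  # hypothetical routing key: "programmer" or "gui_operator"


@dataclass
class Report:
    subtask: Subtask
    success: bool
    summary: str


def programmer(subtask: Subtask) -> Report:
    # Stub: a real Programmer agent would write and run a Python or Bash script.
    return Report(subtask, True, f"script handled: {subtask.description}")


def gui_operator(subtask: Subtask) -> Report:
    # Stub: a real GUI Operator would drive the screen through a VLM.
    return Report(subtask, True, f"GUI handled: {subtask.description}")


@dataclass
class Orchestrator:
    agents: Dict[str, Callable[[Subtask], Report]]
    pending: List[Subtask]
    history: List[Report] = field(default_factory=list)

    def next_subtask(self) -> Optional[Subtask]:
        # Assumption: stop as soon as a subtask fails; otherwise take the
        # next planned subtask. A real orchestrator would replan here.
        if self.history and not self.history[-1].success:
            return None
        return self.pending.pop(0) if self.pending else None

    def run(self) -> List[Report]:
        # Iterate: delegate a subtask, collect the report, decide the next step.
        while (subtask := self.next_subtask()) is not None:
            report = self.agents[subtask.agent](subtask)
            self.history.append(report)
        return self.history


AGENTS = {"programmer": programmer, "gui_operator": gui_operator}

if __name__ == "__main__":
    plan = [
        Subtask("batch-rename the exported files", "programmer"),
        Subtask("click the Export button in the dialog", "gui_operator"),
    ]
    for report in Orchestrator(AGENTS, plan).run():
        print(report.subtask.agent, "->", report.summary)
```

The routing key on each subtask stands in for the Orchestrator's choice of "the most suitable agent"; in this sketch that choice is made up front when the plan is written.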

Performance Leap: Faster and More Efficient

Researchers tested CoAct-1 on the OSWorld benchmark, which includes 369 real-world tasks across browsers, IDEs, and office applications. The results showed that CoAct-1 achieved a 60.76% success rate, setting a new record.

CoAct-1 showed especially large gains on operating system-level tasks and multi-application workflows. It is also more efficient: it needed an average of only 10.15 steps to complete a task, compared with 15.22 steps for other leading pure GUI agents. Researchers note that fewer steps not only speed up task completion but also reduce the chance of errors, resulting in more efficient and reliable automation.
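The step counts above work out to roughly one third fewer steps per task. The short sketch below does that arithmetic and illustrates why fewer steps can mean fewer failures. The reliability part is an assumption made only for illustration: it treats every step as succeeding independently with the same probability, and the value 0.98 is hypothetical, not a figure from the researchers.

```python
# Average steps per task as reported in the article.
COACT_STEPS = 10.15
GUI_ONLY_STEPS = 15.22


def relative_reduction(new: float, old: float) -> float:
    """Fraction by which `new` is smaller than `old`."""
    return (old - new) / old


def chain_success(per_step_success: float, steps: float) -> float:
    """Chance that every step succeeds, assuming independent steps
    with identical per-step success probability (a simplification)."""
    return per_step_success ** steps


if __name__ == "__main__":
    reduction = relative_reduction(COACT_STEPS, GUI_ONLY_STEPS)
    print(f"Step reduction: {reduction:.1%}")

    p = 0.98  # hypothetical per-step success probability
    print(f"All-steps-succeed chance at {COACT_STEPS} steps: "
          f"{chain_success(p, COACT_STEPS):.3f}")
    print(f"All-steps-succeed chance at {GUI_ONLY_STEPS} steps: "
          f"{chain_success(p, GUI_ONLY_STEPS):.3f}")
```

Under this simple model, any per-step success probability below 1 gives the shorter trajectory a higher chance of finishing without an error, which matches the researchers' qualitative point. Real agents do not fail independently at each step, so the numbers it produces are not predictions.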

From Lab to Enterprise: Potential Applications and Challenges

This technology has tremendous potential for enterprise applications. Ran Xu, Director of AI Research at Salesforce, pointed out that customer support, sales prospecting, automated bookkeeping, and marketing campaign management are perfect use cases. In these scenarios, companies need to handle a variety of tools with and without APIs, and CoAct-1 can flexibly utilize code and screens to provide comprehensive automation solutions.

However, moving CoAct-1 from the lab into enterprise environments presents challenges, including handling legacy software, ensuring security, and maintaining human supervision. Xu emphasized that agents need to be trained in sandbox environments to improve their adaptability, and that strong access controls and security safeguards are needed to prevent malicious code execution. For the foreseeable future, a "human-in-the-loop" model will be key to keeping the agent safe and reliable.


This article is from AIbase Daily

