At the Baidu AI Day open event, Baidu Intelligent Cloud announced the launch of the world's first AI digital employees, covering core business roles such as marketing manager, loan assistant, auto sales, promotions specialist, product manager, course consultant, and recruitment specialist. These digital employees build on Baidu Intelligent Cloud's full-stack AI capabilities and combine three key strengths: large models, digital human technology, and industry know-how. They work out of the box and deliver results as soon as they are on the job. While serving specific vertical business scenarios, they keep accumulating job skills. With three defining traits, "understanding business, delivering results, and being evolutionary," they redefine enterprise-level intelligent services and act as a trustworthy digital business partner for enterprises.

During the event, Baidu Vice President Yuanyu said that the capabilities of large models are evolving at an unprecedented speed and their boundaries are constantly expanding. Large model providers are starting to introduce world models built on multi-modal large model capabilities, adding abilities such as dynamic modeling of the physical world, causal reasoning and counterfactual inference, and long-term prediction and planning. This rapid evolution is moving AI from the Copilot form of human-computer collaboration to the Agent form with autonomous execution capabilities, and further toward Agentic AI. In the Agentic era, new types of "workers" will appear in daily work, and intelligent agents will take part in all aspects of enterprise operations as "digital employees." As production units, digital employees are driving revolutionary changes in organizational productivity.

Zhi Zheng shared that what companies truly need are Agents that can carry KPIs and take responsibility for business outcomes. Baidu Intelligent Cloud has combined ten years of accumulated dialogue capability from Ke Yue Smart Customer Service with seven years of technical experience from Xiling Digital Humans. The resulting Agent architecture, built on large models, not only achieves end-to-end intelligent interaction but also drives continuous business optimization and delivers results. This has produced China's first batch of AI digital employees and marks a qualitative shift from function execution to business decision-making and result delivery.

In the recruitment industry, the volume of manual invitations and repetitive operations is huge. The work is time-consuming and labor-intensive, and complicated processes can lead to errors. The recruitment specialist digital employee has focused on hiring for AI product positions, handling the full process of outbound calls, interview scheduling, and result notification. During a conversation, it can detect a candidate's intentions and adjust its conversation strategy in real time, increasing the interview attendance rate by 40% and allowing HR to focus on core talent evaluation.

In the education and training industry, once the course consultant "joins," it takes over repetitive tasks across the whole cycle, from early enrollment consultation and service follow-up to later renewal. In the service phase, it responds online 24/7 and accurately handles enrollment inquiries. In the lead follow-up phase, it uses intelligent algorithms to screen and grade leads at scale, effectively reactivating dormant leads. In the renewal conversion phase, it can draw on the memory of past conversations and automatically trigger follow-up reminders. This significantly improves institutional operational efficiency and lets educational consultants focus on high-value conversions, increasing employee efficiency by 40%.


Currently, Baidu Intelligent Cloud's digital employees are used for online traffic-acquisition conversations and intelligent service follow-ups in the Baidu Customer Service Center. Data shows that digital employees handle customer inquiries efficiently 24/7, raising the success rate of users' insurance applications by 60% and improving service timeliness by 18 hours. On the future of the industry, Zhi Zheng believes that human-computer collaboration is the mainstream trend at this stage, and that in the future multiple digital employees may collaborate to solve complex tasks.

The strong performance of digital employees, characterized by "understanding business, delivering results, and being evolutionary," is rooted in Baidu Intelligent Cloud's deep accumulation of full-stack AI capabilities. Their core technical barriers lie in four upgrades: "intelligent brain," "realistic appearance," "industry-specific core," and "evolutionary genes," which are accelerating the evolution of AI from a tool into a business partner.

"Intelligent Brain": A large model of speech and language driven by end-to-end cross-modal technology to achieve a business closed loop. In terms of language interaction, Baidu Intelligent Cloud has pioneered a cross-modal speech-language large model based on Cross-attention technology, achieving seamless delay communication, natural interruptions, and smooth conversations with a realistic human interaction experience. At the same time, Baidu Intelligent Cloud integrates speech recognition, large language models, and speech synthesis technologies, ensuring that digital employees can understand and comprehend in a very short time, and respond with the most suitable emotions and feedback according to the text, enabling autonomous judgment and closed-loop execution of business processes. Speech recognition accuracy reaches 98%, and conversation latency is reduced to within 1 second.

"Realistic Appearance": Film-level image and expression establish business trust. In terms of face effects, Baidu Intelligent Cloud remains at the industry's forefront. It applies the country's first 4D scanning technology to obtain high-precision vertex flow data and accurately restores facial muscle movements such as smiling and thinking through over a thousand control dimensions, presenting a film-level image. It only needs 30 seconds of voice material to replicate a high-fidelity voice, with naturalness comparable to real human voices, making the digital employee's image truly "perfect in both form and spirit."

"Industry-Specific Core": SOP accumulation and self-evolution create job experts. Zhang Hongguang, Deputy General Manager of Baidu Intelligent Cloud's Smart Marketing Product, shared that the professional and mature business capabilities of digital employees are rooted in the deep accumulation of industry vertical scenarios and the ability to continuously self-evolve. Drawing on the "10,000-hour rule," Baidu Intelligent Cloud has trained digital employees' initial capabilities with over 100,000 hours of industry operation data, having accumulated more than 100 specialized SOPs in vertical fields such as financial management standards, teaching and learning processes, and automotive sales and production knowledge.

"Evolutionary Genes": Self-iterative corporate knowledge genes. In addition, Baidu Intelligent Cloud also uses its self-developed simulation dialogue self-iteration system to dynamically integrate business rules and industry standards, turning repetitive work into job capabilities and achieving dynamic updates of the corporate knowledge base. With the accelerated iteration of large model capabilities and the continuous increase in data training, the business capabilities of digital employees keep evolving, allowing enterprises to continuously maintain cutting-edge professional perspectives and capabilities in their industries.