AI Daily: Kuaishou Launches AI Video Creation Assistant Kwali; ByteDance Launches USO Model; OpenAI Launches ChatGPT Developer Mode

Welcome to "AI Daily", your daily guide to the world of artificial intelligence. Each day we round up the latest developments in the AI field with a focus on developers, helping you follow technical trends and new AI product applications.

Fresh AI products, click for more information: https://app.aibase.com/zh

1. Kuaishou launches Kwali, an AI video creation assistant that generates short videos from a single sentence

Kuaishou's Kwali AI video creation assistant simplifies video production through a cloud-based multi-Agent framework. Users only need to describe their requirements; Kwali then automatically breaks them down into features, target audience, and scenario tags, generates a script, matches shots, and edits the result, significantly improving efficiency.

AiBase Summary:
🌟 Kwali is an AI video creation assistant launched by Kuaishou, helping users quickly generate high-quality short videos.
🎬 The multi-Agent system automatically handles scripts, materials, and editing, improving the efficiency of video production.
💰 Reduces video production costs, allowing merchants to bring products to market faster and improve cash flow.
Details link: https://kc.kuaishou.com/kwali

2. ByteDance launches USO model, breaking the "style vs. subject" opposition in AI image generation

ByteDance's USO model addresses the conflict between style-driven and subject-driven image generation, improving the flexibility and accuracy of generation through new training methods and a large dataset. The model is now fully open-source, opening new possibilities for digital art and commercial design.

AiBase Summary:
🎨 The USO model breaks the opposition between style and subject, combining both in a single model.
📊 New training methods and a large dataset improve the flexibility and accuracy of image generation.
🌍 The model is fully open-source, encouraging developers to explore its use in creative content and commercial design.
Details link: https://github.com/bytedance/USO

3. Microsoft introduces a new Copilot Audio mode for personalized voice interaction

Microsoft has launched a new Copilot Audio mode based on its in-house MAI-Voice-1 model. It offers three voice modes (emotional, storytelling, and script) to meet different expression needs, along with a wide range of voice and style options. Microsoft also introduced the MAI-1 model and integrated it into Office applications, furthering its independent development in the AI field.

AiBase Summary:
🎭 The new Copilot Audio mode supports three voice modes: emotional, storytelling, and script, meeting different scenario needs.
🎙️ Provides multiple voice and style choices, such as Shakespearean reading and sports commentary, enhancing interactive fun.
🔍 Microsoft introduced the MAI-1 model and integrated it into Office applications, showing its determination to pursue independent development in the AI field.
Details link: https://copilot.microsoft.com/labs/audio-expression

4. Stability AI releases Stable Audio 2.5, a professional audio generation technology upgrade

Stability AI has released its latest audio generation model, Stable Audio 2.5, which quickly generates high-quality, customizable audio, supports complex music composition, and introduces an audio repair (inpainting) function. Stability AI is also working with WPP to provide consistent brand audio identity services.

AiBase Summary:
🎵 Stable Audio 2.5 supports complex music compositions and quickly generates audio tracks up to three minutes long.
🖌️ Introduces an audio repair (inpainting) function: users upload an audio file and the AI completes or extends the recording.
🤝 Stability AI works with major clients such as WPP to provide consistent brand audio identity services.

5. UAE Launches the World's Fastest Open Source AI Model K2 Think, with 32 Billion Parameters

K2 Think is an open-source large language model jointly launched by the Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) and G42 in the UAE. It has 32 billion parameters and a reported generation speed of 2,000 tokens per second. It performs well on complex mathematics, programming, and science benchmarks, and its efficient reasoning design achieves strong performance with fewer computing resources. K2 Think also releases its complete training data, model weights, and deployment infrastructure, supports commercial use, and is seen as a sign of the UAE's growing influence in global AI.

AiBase Summary:
🧠 K2 Think, developed in the UAE with 32 billion parameters, is billed as the world's fastest open-source AI model.
⚡ It reportedly generates 2,000 tokens per second, much faster than other models.
🚀 The model focuses on complex reasoning, with efficient and open design, supporting a wide range of commercial applications.
Details link: https://www.k2think.ai/guest
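
To put the 2,000 tokens per second figure in perspective, the rough sketch below converts it into wall-clock time. The response lengths are illustrative assumptions, not figures from the K2 Think release, and the estimate ignores prompt processing and network latency.

```python
# Reported generation speed from the article (tokens per second).
K2_THINK_TOKENS_PER_SECOND = 2000


def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Return the time needed to emit num_tokens at a steady decoding rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


# Hypothetical response lengths, chosen only for illustration.
for num_tokens in (500, 4000, 32000):
    seconds = generation_seconds(num_tokens, K2_THINK_TOKENS_PER_SECOND)
    print(f"{num_tokens:>6} tokens -> {seconds:.2f} s")
```

At this rate, a long 32,000-token reasoning trace would take about 16 seconds to generate.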

6. WeChat Official Accounts Launch Intelligent Reply Function: Digital Avatars Chat 24/7

WeChat official accounts have launched an intelligent reply function that uses artificial intelligence technology to provide operators with efficient and personalized interaction services, enhancing user experience and the operational efficiency of official accounts.

AiBase Summary:
🤖 Official account operators can easily enable the intelligent reply function to improve interaction efficiency.
💡 Digital avatars can learn from historical articles and language styles to provide personalized replies.
🌐 Intelligent replies stay online 24/7, improving user engagement and the interaction experience.

7. OpenAI Launches ChatGPT Developer Mode, Letting AI Directly Control External Tools for the First Time

OpenAI's ChatGPT developer mode marks a significant shift for AI assistants from conversation tools to automated agents: it lets the AI directly control external tools, with the aim of improving development efficiency and security.

AiBase Summary:
🧠 For the first time, ChatGPT developer mode lets the AI directly control external tools, enabling automated agent functions.
🔧 Developers can create custom connectors to allow ChatGPT to perform write operations and complex tasks.
🔒 The function adds multiple layers of security measures to ensure the accuracy and safety of operations.
Details link: https://platform.openai.com/docs/mcp https://platform.openai.com/docs/guides/developer-mode

8. ByteDance Seed Launches New AgentGym-RL Framework: Enhancing Decision-Making Ability of Large Language Models

The Seed research team at ByteDance has launched AgentGym-RL, a framework for training large language model agents through reinforcement learning to make decisions over multi-turn interactions. The team also proposes a training method called ScalingInter-RL to improve how agents learn. Experimental results show that agents trained with AgentGym-RL outperform commercial models on multiple tasks and reach capabilities comparable to top proprietary large models.

AiBase Summary:
🌐 The AgentGym-RL framework provides a new approach to train large language model agents through reinforcement learning, enhancing their decision-making ability for complex tasks.
🔄 The ScalingInter-RL training method adjusts interactions in stages, helping agents balance exploration and exploitation during training.
🏆 Experimental results show that AgentGym-RL significantly improves agent performance, surpassing multiple commercial models and matching top proprietary large models.
Details link: https://agentgym-rl.github.io/
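
The idea of adjusting interactions in stages can be illustrated with a small schedule. This is only a sketch under assumptions: it assumes the quantity being staged is the maximum number of interaction turns per episode, and the stage boundaries, turn caps, and function name are hypothetical, not taken from the AgentGym-RL code.

```python
from bisect import bisect_right

# Hypothetical schedule: (training step at which the stage begins, max turns per episode).
# Short episodes early keep the agent on simple behaviour; longer ones later
# leave more room for exploration.
STAGES = [(0, 5), (1000, 10), (3000, 20)]


def max_turns_at(step: int, stages=STAGES) -> int:
    """Return the interaction-turn cap that applies at a given training step."""
    starts = [start for start, _ in stages]
    index = bisect_right(starts, step) - 1  # last stage whose start <= step
    if index < 0:
        raise ValueError("step precedes the first stage")
    return stages[index][1]


for step in (0, 999, 1000, 5000):
    print(f"step {step:>5}: up to {max_turns_at(step)} turns per episode")
```

A training loop would call `max_turns_at(step)` when collecting each rollout and cut the episode off once that many turns have been taken.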

9. Moonshot AI Open-Sources "Checkpoint Engine" Middleware, Bringing New Opportunities to LLM Inference Engines

Moonshot AI's open-source "Checkpoint Engine" middleware is designed for large language model (LLM) inference engines and performs efficient in-place hot updates of model weights. It can synchronize weight updates for a 1-trillion-parameter model within 20 seconds across thousands of GPUs, significantly reducing downtime and improving training efficiency.

AiBase Summary:
🚀 Checkpoint Engine realizes efficient real-time updates of model weights in LLM inference engines.
⚡ Supports thousands of GPUs for parallel processing, greatly reducing downtime in reinforcement learning training.
🌐 Open design facilitates future expansion to other frameworks, such as SGLang, promoting technological advancement.
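
As a back-of-the-envelope check on the scale involved, the sketch below estimates the aggregate data rate implied by updating a 1-trillion-parameter model in 20 seconds. The bytes-per-parameter values are assumptions (the article does not state the weight precision), and the function name is hypothetical.

```python
def aggregate_update_rate_gb_per_s(num_params: int, bytes_per_param: int, seconds: float) -> float:
    """Return the total weight-transfer rate in GB/s (1 GB = 1e9 bytes)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    total_bytes = num_params * bytes_per_param
    return total_bytes / seconds / 1e9


NUM_PARAMS = 10**12      # 1 trillion parameters (from the article)
UPDATE_SECONDS = 20      # update window (from the article)

# Assumed precisions: 1 byte (8-bit weights) and 2 bytes (16-bit weights).
for bytes_per_param in (1, 2):
    rate = aggregate_update_rate_gb_per_s(NUM_PARAMS, bytes_per_param, UPDATE_SECONDS)
    print(f"{bytes_per_param} byte(s)/param -> about {rate:.0f} GB/s in aggregate")
```

Under these assumptions the full set of weights amounts to 1 to 2 TB, so the update implies roughly 50 to 100 GB/s summed over the whole cluster; how that load is split across GPUs depends on the sharding, which the article does not describe.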

10. Bilibili Open Sources Text-to-Speech Model IndexTTS-2.0, Emotion and Duration Controllable

Bilibili has open-sourced its in-house text-to-speech system IndexTTS-2.0, which offers emotion control and duration adjustment, an important step toward practical zero-shot TTS. By introducing a time encoding mechanism and modeling voice and emotion separately, it improves the naturalness and expressiveness of synthesized speech and suits scenarios such as AI dubbing, audiobooks, and video translation.

AiBase Summary:
🕒 Introduces a time encoding mechanism to improve the precision of speech duration control.
🎭 Decoupled modeling of voice and emotion enhances the expressiveness of speech.
🌍 Supports taking content to global audiences, enabling localized experiences for cross-language videos.
Details link: https://huggingface.co/spaces/IndexTeam/IndexTTS-2-Demo

11. Replit Launches Agent 3 with 10 Times More Autonomy, Boosting Programming Efficiency

Replit's Agent 3 is an intelligent programming assistant with higher autonomy, significantly improving its abilities in code generation, debugging, and project management. It can generate high-quality code according to user needs and actively provide optimization suggestions, thereby improving development efficiency.

AiBase Summary:
🧠 Agent 3 can generate code based on natural language requirements and actively analyze the project context to provide optimization suggestions.
⚙️ Supports multiple programming languages and has full-process assistance capabilities, including code generation, debugging, and project management.
🚀 Improves development efficiency and reduces repetitive work, letting developers focus on creative problem-solving.
Details link: https://replit.com/agent3

站长之家

This article is from AIbase Daily