Welcome to the "AI Daily" section, your daily guide to the world of artificial intelligence. Each day we bring you the latest developments in AI, with a focus on developers, to help you follow technical trends and innovative AI product applications.

Hot AI products. Click to learn more: https://app.aibase.com/zh

1. Zhipu Releases GLM-5V-Turbo Multimodal Coding Model

Zhipu's GLM-5V-Turbo is a multimodal coding model that deeply integrates visual and programming capabilities. It supports calling multiple visual tools and performs well on several core benchmarks. Application scenarios include front-end replication, autonomous GUI exploration, and interactive editing, which can significantly improve development efficiency. Once integrated with the AutoClaw agent, the model gives the agent genuine visual capabilities, letting it interpret complex charts and produce professional analysis reports.

AiBase Summary:

🧠 The multimodal base model GLM-5V-Turbo is released, achieving a deep integration of visual and programming capabilities.

💻 Supports front-end replication, GUI autonomous exploration, and interactive editing, improving development efficiency.

📊 Integrated with the AutoClaw agent, it gives the agent real visual capabilities: interpreting complex charts and generating analytical reports.

2. ByteDance's Volcano Engine Officially Opens Seedance 2.0 API Applications to General Customers

ByteDance's Volcano Engine has officially opened the Seedance 2.0 API service. The move marks the multimodal video generation model's transition from a closed trial to an open ecosystem and gives developers and enterprises more powerful video creation tools.

AiBase Summary:

🎥 Seedance 2.0 accepts four input modalities (text, images, audio, and video), improving the controllability of video generation.

💡 Provides cinematic-quality video generation, suitable for short drama production, e-commerce marketing, and other scenarios.

🔒 Emphasizes copyright protection; applying for API access requires enterprise certification and content review.

3. Meituan Open-Sources LongCat-AudioDiT: First to Model in Waveform Latent Space, Sets New SOTA in Voice Cloning

Meituan's open-source LongCat-AudioDiT project achieves a breakthrough in voice cloning performance by modeling directly in a waveform latent space. Its architecture and optimization techniques significantly improve the quality and stability of generated speech.

AiBase Summary:

🧠 Models in a waveform latent space, moving past the limitations of the traditional mel-spectrogram intermediate representation.

🚀 Builds a minimalist architecture from Wav-VAE and DiT, improving voice generation efficiency and quality.

🔧 Introduces dual constraint mechanisms and adaptive projection guidance, addressing voice drift and improving generation results.

Details: https://github.com/meituan-longcat/LongCat-AudioDiT

4. Daily Token Consumption Exceeds 120 Trillion! ByteDance's Doubao Large Model Becomes the "Traffic King" with a 1,000-Fold Increase in Two Years

ByteDance's Doubao large model has made significant progress in AI applications: daily token usage now exceeds 120 trillion, a sign of how deeply AI has penetrated everyday use. Meanwhile, usage of Chinese domestic large models continues to grow and has surpassed mainstream overseas models in some areas. Cloud vendors are re-evaluating the commercial value of tokens, making TokenHub a new competitive focus.

AiBase Summary:

🔥 The Doubao large model's daily token usage exceeds 120 trillion, showcasing strong AI application capabilities.

📈 Domestic large model usage continues to grow, surpassing overseas mainstream models in some areas.

🔄 Cloud vendors are reassessing the commercial value of tokens, making TokenHub a new battleground.
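
As a rough check on the headline figures, the sketch below works out what a 1,000-fold increase over two years implies. It assumes smooth compound growth over exactly 24 months and takes 120 trillion tokens per day as the endpoint. Both are simplifying assumptions, and the function names are illustrative rather than anything from the report.

```python
import math

def implied_start(end_value: float, fold_increase: float) -> float:
    """Starting value implied by an end value and a total fold increase."""
    return end_value / fold_increase

def monthly_growth_rate(fold_increase: float, months: int) -> float:
    """Constant monthly rate that compounds to the given fold increase."""
    return fold_increase ** (1 / months) - 1

def doubling_time_months(fold_increase: float, months: int) -> float:
    """Months needed to double under the same constant growth rate."""
    return months * math.log(2) / math.log(fold_increase)

# Figures from the headline; the 24-month span is an assumption.
DAILY_TOKENS_NOW = 120e12   # 120 trillion tokens per day
FOLD_INCREASE = 1000        # "increased 1,000 times"
MONTHS = 24                 # "in two years"

start = implied_start(DAILY_TOKENS_NOW, FOLD_INCREASE)
rate = monthly_growth_rate(FOLD_INCREASE, MONTHS)
doubling = doubling_time_months(FOLD_INCREASE, MONTHS)

print(f"Implied starting volume: {start:.3g} tokens/day")
print(f"Implied monthly growth:  {rate:.1%}")
print(f"Implied doubling time:   {doubling:.1f} months")
```

Under these assumptions, the starting point is about 120 billion tokens per day, growth averages roughly 33% per month, and usage doubles about every 2.4 months.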

5. Ant Group's DTClaw Enters Internal Testing: Aiming at the Professional AI Agent Market

Ant Group has announced that DTClaw, its professional "lobster" product, has entered internal testing, marking the company's entry into the professional AI agent market. DTClaw is positioned as a "professional AI" that provides dedicated, always-online AI agent services for financial experts, wealth consultants, and data analysts. On the technical side, DTClaw emphasizes its "native expert" character: it integrates hundreds of professional skills and comes preloaded with numerous mature "cooked lobster" templates. Its application scenarios cover high-value fields such as investment and finance, complex data analysis, software development, and automated testing. As AI agents evolve from "assistants" into "experts," the move reflects Ant Group's strategy of going deep into vertical industries and closing the loop on AI productivity.

AiBase Summary:

🧠 DTClaw is positioned as a professional AI agent, serving financial experts, wealth consultants, and data analysts.

🔧 DTClaw integrates hundreds of professional skills and comes preloaded with many "cooked lobster" templates, covering high-value fields such as investment and finance and complex data analysis.

🚀 Ant Group enters the professional AI agent market with DTClaw, reflecting its strategic intent to go deep into vertical industries and close the loop on AI productivity.

6. Anthropic Tests "Lobster" Conway: Supports Independent UI, Webhook Wake-up, and Custom Extension Standards

Anthropic is developing a persistent agent solution called Conway, intended to give Claude an always-on, independent environment. Conway will run as an independent UI instance that supports browser operation, connections to external services, and Claude Code features. It can also respond automatically when woken by a webhook, and it introduces the CNW ZIP standard to improve extensibility.

AiBase Summary:

📱 Independent UI instance that breaks free of the traditional chat interface.

⚙️ Supports Webhook wake-up and external service connections.

📦 Introduces the CNW ZIP standard to build a custom extension ecosystem.

7. Google's Open-Source Large Model Gemma 4 to Be Announced Soon: Four Times the Parameters

Google's open-source large model Gemma 4 is about to be released, with a parameter count of 120B, four times that of the previous version, and an MoE (mixture-of-experts) architecture to balance performance and efficiency. Google also maintains its influence in the developer community through open-source projects as it tries to compete with Chinese companies in localized services.

AiBase Summary:

🧠 Parameter count quadruples; Gemma 4 will push the limits of running models locally.

🔄 Uses MoE architecture, balancing performance and efficiency.

🌍 The open-source race enters an era of competing on both parameters and efficiency.
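
To illustrate why an MoE design can trade total parameter count against per-token compute, here is a minimal sketch of top-k expert routing. It is a generic illustration, not Gemma 4's actual architecture: the expert count, the value of k, the router scores, and all function names are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_scores, k):
    """Pick the k highest-scoring experts and renormalize their weights."""
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return {i: probs[i] / norm for i in top}

def moe_forward(x, experts, router_scores, k):
    """Weighted sum of the outputs of only the selected experts."""
    weights = route_top_k(router_scores, k)
    return sum(w * experts[i](x) for i, w in weights.items())

# Hypothetical toy setup: 8 "experts", each a simple scaling function.
experts = [lambda x, s=s: s * x for s in range(1, 9)]
# Made-up router logits for a single token.
router_scores = [0.1, 2.0, -1.0, 0.5, 3.0, 0.0, -0.5, 1.0]

selected = route_top_k(router_scores, k=2)
output = moe_forward(1.0, experts, router_scores, k=2)
active_fraction = len(selected) / len(experts)

print(sorted(selected), active_fraction)
```

In this toy run only experts 1 and 4 are evaluated, so a quarter of the expert parameters are active for the token even though all eight exist in the model. That gap between total and active parameters is the general idea behind balancing performance and efficiency with MoE.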

8. AI Programming Enters the "Reliable" Era: Tongyi Lab Officially Releases Qwen3.6-Plus

Tongyi Lab has released Qwen3.6-Plus, which focuses on coding agents and long context. It improves the stability and execution efficiency of agentic programming while offering ecosystem compatibility and a closed loop for visual agents.

AiBase Summary:

🧠 Coding capability leaps forward: excels in scenarios like front-end page generation, code repair, and terminal automation.

🌐 Million-scale context: supports a 1-million-character context window by default, significantly improving accuracy in long-document parsing and in information extraction across multi-turn dialogue.

🛠️ Ecosystem compatibility: integrates seamlessly with mainstream development tools and supports deep adaptation for a variety of third-party programming assistants.