Welcome to "AI Daily", your daily guide to the world of artificial intelligence. Each day we round up the latest developments in the AI field with a focus on developers, helping you follow technology trends and discover innovative AI product applications.

Latest AI products, click to learn more: https://app.aibase.com/zh

1. Keling AI Launches Entity Library: The Model Gains Memory, and Characters "Never Change Faces"

Keling AI released the "Entity Library," which adds long-term memory to the O1 multimodal video model and achieves character consistency above 96%, addressing the problem of AI-generated characters changing faces between shots. By uploading a single character image, users can generate completed 3D views and multiple lighting variants, then reuse the character across scenes with one click.

【AiBase Summary:】

✨ The Entity Library works in three steps (upload, complete, call) to improve character consistency

🎨 An AI description feature automatically extracts keywords to improve the generation success rate

🚀 The Entity Library and the O1 model share a unified entry point, connecting text, image, and video generation seamlessly

2. Speak and Become the Character! Alibaba Qwen3-TTS Arrives: 49 Voices, 10 Languages, 9 Dialects, and a WER That Beats Mainstream Commercial Models

Alibaba launched Qwen3-TTS, a text-to-speech model with zero-shot, multi-character, and cross-lingual capabilities. It reportedly outperforms mainstream commercial engines by a significant margin and is suited to education, live streaming, customer service, and similar scenarios.

【AiBase Summary:】

🎧 49 high-quality voices covering a range of scenarios

🌐 Supports 10 languages and 9 Chinese dialects

📉 Word error rate (WER) significantly better than that of mainstream commercial models

Details Link: https://modelscope.cn/studios/Qwen/Qwen3-TTS-Demo

3. A 406B-Parameter Model Arrives Unannounced! Tencent Hunyuan 2.0 Begins Internal Testing, Claims "Top-Tier Domestic Performance" in Reasoning

Tencent released Hunyuan 2.0, its new self-developed large model, in Think and Instruct versions with strong reasoning and instruction-following capabilities. The model performs well on complex tasks such as mathematics, science, and code, and is now available through the Tencent Cloud API and in some applications.

【AiBase Summary:】

🧠 Hunyuan 2.0 uses a Mixture-of-Experts (MoE) architecture, improving inference speed by 40%.

📊 The Think version achieved accuracy rates of 83.1% and 81.7% on the IMO and Harvard-MIT competition benchmarks, respectively.

💰 Tencent Cloud API pricing is only 45% of GPT-4o's, and enterprise private deployment is supported.
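
For readers unfamiliar with MoE, the general idea can be sketched in a few lines of Python. This is a generic top-k routing illustration, not Tencent's implementation: the expert functions, the fixed gate scores, and the name `moe_forward` are all hypothetical. In a real model the gate scores come from a learned router applied to each token.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(x, gate_logits, experts, top_k=2):
    """Run only the top_k experts (by gate score) and mix their outputs."""
    probs = softmax(gate_logits)
    # Indices of the top_k highest-scoring experts.
    chosen = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    # Renormalize the selected gate weights so they sum to 1.
    norm = sum(probs[i] for i in chosen)
    output = sum(probs[i] / norm * experts[i](x) for i in chosen)
    return output, chosen


# Hypothetical toy experts; real experts are feed-forward network blocks.
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x ** 2, lambda x: -x]
# Assumed fixed gate scores for illustration only.
gate_logits = [0.1, 2.0, 1.5, -1.0]

y, used = moe_forward(3.0, gate_logits, experts, top_k=2)
print("experts used:", used)
print("mixed output:", round(y, 3))
```

Because only the selected experts run for each input, an MoE model can carry a very large total parameter count while spending compute on only a fraction of it per token.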

4. Meituan Open-Sources 6B-Parameter Image Generation Model LongCat-Image, Reaching Open-Source SOTA in Chinese Text Generation and Image Editing

Meituan's LongCat team introduced LongCat-Image, a 6B-parameter image generation model that combines high performance with a low barrier to use. It performs especially well in Chinese text generation and image editing, reaching open-source SOTA levels. Through systematic training strategies and data engineering, the model stays efficient and accurate across diverse instructions. By open-sourcing the model, the LongCat team also hopes to build a transparent, open, and collaborative ecosystem and to encourage developers to use and help develop it.

【AiBase Summary:】

🧠 LongCat-Image reaches open-source SOTA level in image editing, with strong instruction following and visual consistency.

🖋️ The model is optimized for Chinese text generation and can render Chinese characters with complex stroke structures, covering a variety of use cases.

🎨 The LongCat team is building a transparent, open ecosystem through open source, encouraging developers to use and help develop the model.

Details Link: https://longcat.ai/

5. JD Cloud JoyBuilder Supports Thousand-Card Training of GR00T N1.5, Pushing Embodied Intelligence Toward Large-Scale Deployment

Through full-stack optimization, the JD Cloud JoyBuilder platform now supports training GR00T N1.5 at thousand-card scale (on the order of a thousand accelerator cards), increasing training efficiency by 3.5 times and advancing the large-scale development of embodied intelligence.

【AiBase Summary:】

🧠 The JD Cloud JoyBuilder platform completed a key upgrade and now supports thousand-card training of GR00T N1.5.

🚀 The platform achieved a 3.5 times improvement in training efficiency, significantly accelerating the large-scale deployment of embodied intelligence.

🌐 The platform supports the latest LeRobot training data protocol, positioning it as an industry leader.

6. NVIDIA's 4B Small Model Pulls Ahead! Per-Task Cost Is Just 1/36 of GPT-5 Pro's

NVIDIA's 4B small model NVARC scored 27.64% on the latest ARC-AGI-2 evaluation, beating GPT-5 Pro and demonstrating both strong performance and a cost advantage. NVARC improves adaptability and efficiency through a zero-pretraining approach and a synthetic data generation strategy.

【AiBase Summary:】

🧠 NVARC uses a zero-pretraining deep learning approach, avoiding the domain bias and data dependency issues that come with traditional large-scale datasets.

💡 NVARC uses GPT-OSS-120B to generate high-quality synthetic puzzles, reducing real-time computing resource requirements.

🚀 NVARC's test-time fine-tuning (TTFT) lets it adapt quickly to the rules of a new task, improving model efficiency.

7. Weibo CEO Responds: AI Phone Can Post to Weibo Automatically, but User Confirmation Is Still Required

Weibo CEO Wang Gao Fei responded to questions about whether the Doubao AI phone can post on Weibo automatically, stating that although the feature is available, user confirmation is still required. Meanwhile, the Doubao AI phone has run into login issues in mainstream applications, sparking discussion about how far AI can operate apps on a user's behalf. Wang Gao Fei added that some game applications can detect AI control, which limits the use of AI assistants.

【AiBase Summary:】

🤖 Weibo CEO Wang Gao Fei said the AI phone already has the capability to post on Weibo automatically, but user confirmation is still required.

📱 The Doubao AI phone faces login restrictions in mainstream applications, sparking discussion about its AI operational capabilities.

⚙️ For now, some applications still require manual operation rather than an AI assistant, highlighting current technical bottlenecks and the challenges ahead.

8. Microsoft Launches VibeVoice-Realtime: New Real-Time Text-to-Speech Model, Enhancing Interactive Applications

Microsoft's newly launched VibeVoice-Realtime-0.5B is a lightweight real-time text-to-speech (TTS) model that supports streaming text input and long-form speech output. It can start generating speech within 300 milliseconds, making it suitable for agent applications and real-time data narration. It uses an interleaved window design to reduce latency and improve synthesis quality, and achieves a word error rate of 2.00% on the LibriSpeech test.

【AiBase Summary:】

🌟 Supports streaming text input and can start outputting speech within 300 milliseconds, suitable for real-time interactive applications.

🛠️ Uses a low-latency acoustic tokenizer that generates acoustic features at 7.5 Hz, optimizing long-form speech synthesis.

📈 On the LibriSpeech test, VibeVoice-Realtime achieved a word error rate of 2.00%.

Details Link: https://huggingface.co/microsoft/VibeVoice-Realtime-0.5B
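
For developers who want to sanity-check a word error rate figure like the ones quoted above, here is a minimal sketch of the metric itself. For TTS, WER is typically measured by transcribing the synthesized audio with a speech recognizer and comparing the transcript to the input text. The function name `word_error_rate` and the sample sentences are illustrative assumptions, and real evaluations usually apply more careful text normalization than the lowercasing and whitespace splitting used here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: input text versus a transcript of the synthesized audio.
text = "the quick brown fox jumps over the lazy dog"
transcript = "the quick brown fox jumped over the lazy dog"
print(f"WER = {word_error_rate(text, transcript):.2%}")  # prints "WER = 11.11%"
```

One substituted word out of nine gives 11.11%, so a reported 2.00% corresponds to roughly one word error per fifty reference words.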