Welcome to the "AI Daily" section, your daily guide to the world of artificial intelligence. Each day we bring developers the latest news from the AI field to help you follow technology trends and discover innovative AI product applications.

Fresh AI products, click to learn more: https://top.aibase.com/

1. Google releases the stable version of Gemini 2.5 Flash-Lite: A perfect balance between speed and cost

Google has released the stable version of Gemini 2.5 Flash-Lite, which balances speed and cost, supports a context window of up to 1 million tokens, and offers a range of advanced features. It is competitively priced and outperforms previous versions.

AiBase Summary:

⚡ Gemini 2.5 Flash-Lite is Google's fastest and most cost-effective model in the Gemini 2.5 family, and it is now generally available (GA)

💰 Pricing is $0.10 per million input tokens and $0.40 per million output tokens, and audio input pricing has been cut by 40%

🔧 Developers can use the new version by specifying the model name gemini-2.5-flash-lite, and the previous preview version alias will be removed on August 25th
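
To make the quoted pricing concrete, here is a minimal sketch that estimates the cost of a request from the two rates in the summary. The function name `estimate_cost_usd` and the sample token counts are illustrative assumptions. The sketch covers only the per-token rates quoted above and ignores audio input or any later price changes.

```python
# Rates quoted in the summary above (USD per one million tokens).
INPUT_USD_PER_MILLION = 0.10
OUTPUT_USD_PER_MILLION = 0.40


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request (hypothetical helper, not an official API)."""
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


# Example: a 200,000-token prompt that produces a 5,000-token answer.
cost = estimate_cost_usd(200_000, 5_000)
print(f"${cost:.4f}")  # prints $0.0220
```

At these rates, even a prompt that fills the full 1-million-token context costs about $0.10 in input tokens.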

2. Tencent Hunyuan's self-developed ASR speech recognition large model is integrated into the ima platform

Tencent Hunyuan's ASR large model now powers voice input on the ima platform, giving users a more efficient input experience. The model has strong semantic understanding, performs especially well on mixed Chinese and English speech, and supports scenarios such as knowledge base Q&A and note creation.

AiBase Summary:

✅ Tencent Hunyuan's ASR large model brings voice input to the mobile app, improving input efficiency.

💡 It uses a streaming ASR architecture based on dual encoders, significantly improving semantic understanding ability.

🌐 Supports recognition of multiple languages and dialects, and will continue to be optimized to meet diverse needs.

3. Tongyi Qianwen open-sources its latest AI programming large model, Qwen3-Coder

Alibaba Cloud announced that its latest AI programming large model, Qwen3-Coder, is fully open-sourced. The model reaches top-tier levels in code generation and Agent capabilities. It uses an MoE architecture with long-context processing, which makes it suitable for large code repositories and dynamic data processing.

AiBase Summary:

🔥 Qwen3-Coder adopts an advanced MoE architecture, with a parameter count of 480B, supporting a context length of 256K.

💡 Pre-training is scaled along multiple dimensions to improve coding ability: of the 7.5T tokens of training data, 70% is code.

🚀 The open-source tool Qwen Code enhances parser and tool support, improving the developer experience.

Details link: https://modelscope.cn/models/Qwen/Qwen3-Coder-480B-A35B-Instruct

Hugging Face: https://huggingface.co/Qwen/Qwen3-235B-A22B-Instruct-2507

Qwen Code GitHub: https://github.com/QwenLM/qwen-code
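
As a quick back-of-the-envelope check on the figures in this item, the sketch below works out how much of the training data is code and what share of the parameters would be active per token. It assumes that the `A35B` suffix in the linked model name denotes roughly 35B activated parameters, and that the 7.5T figure counts tokens. Both are assumptions and are not stated in the summary.

```python
TOTAL_PARAMS = 480e9       # parameter count from the summary
ACTIVE_PARAMS = 35e9       # assumption: "A35B" means ~35B active parameters per token
TRAINING_TOKENS = 7.5e12   # assumption: the 7.5T figure counts tokens
CODE_SHARE = 0.70          # share of training data that is code, from the summary

code_tokens = TRAINING_TOKENS * CODE_SHARE
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

print(f"code tokens: {code_tokens / 1e12:.2f}T")       # prints code tokens: 5.25T
print(f"active per token: {active_fraction:.1%}")      # prints active per token: 7.3%
```

Under these assumptions only a small fraction of the 480B parameters is used for any single token, which is the usual motivation for an MoE design.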

4. 360 will launch smart glasses and an AI voice recorder; Zhou Hongyi: glasses must have a display

Zhou Hongyi, chairman of 360, revealed that the company will launch an AI voice recorder and smart glasses. The AI voice recorder can intelligently analyze the scene and summarize key points. The smart glasses, in his view, need a display to open up new application scenarios such as teleprompters and translation tools that improve communication efficiency.

AiBase Summary:

🧠 AI voice recorders have the ability to intelligently analyze different scenarios and accurately summarize key points.

👓 Smart glasses need to be equipped with display functions to highlight their advantages and create new application scenarios.

🌐 Smart glasses can act as teleprompters and translation tools, improving communication efficiency.

5. The first large model in China to pass the chief physician evaluation is now available in Quark AI search

The Quark Health Large Model has passed the written-test evaluation for chief physicians, demonstrating strong reasoning ability in the medical field, and has been integrated into Quark AI search. The model improves its handling of complex medical questions through a "slow thinking" capability and a high-quality data training system, and a team of professional physicians helps ensure that its output is professional and accurate.

AiBase Summary:

🧠 The Quark Health Large Model passed the chief physician written-test evaluation, demonstrating medical reasoning ability.

🔍 It builds a "slow thinking" capability to reason through complex medical problems in stages.

👩‍⚕️ A thousand-person annotation team of professional physicians helps ensure the professionalism of the model's output.

6. Lingyi Wanwu launches the Wanzhi Enterprise Large Model Platform and enterprise-level Agents

Lingyi Wanwu has launched the Wanzhi Enterprise Large Model Platform together with enterprise-level Agents, offering enterprises customized, high-performance AI solutions intended to usher in the era of industrial AI.

AiBase Summary:

🧠 The launch of Wanzhi Enterprise Large Model Platform 2.0 and enterprise-level Agents marks an important advance for AI in enterprise applications.

🔒 Enterprise-level Agents run in a secure sandbox and support MCP (Model Context Protocol) to ensure data security and system isolation.

💼 Agents have been deployed in areas such as business, finance, sales, and games, improving productivity and task planning.

Details link: https://www.lingyiwanwu.com/businesspartnership

7. Hedra Live Avatars launches at just $0.05 per minute: video AI agents open a new era of human-computer interaction

The release of Hedra Live Avatars marks a major breakthrough in AI video generation technology. With ultra-low cost, ultra-low latency, and high flexibility as core advantages, it brings new possibilities to fields such as content creation, education, customer service, and gaming.

AiBase Summary:

⚡ Ultra-low cost: Only $0.05 per minute, greatly lowering the entry barrier for high-quality video AI agents.

⚡ Ultra-low latency: Response time below 100 milliseconds, ensuring smooth and immersive real-time interaction.

⚡ High flexibility: Compatible with mainstream large language models and text-to-speech technologies, supporting personalized interactive experiences.

Details link: https://www.hedra.com

8. Google Gemini 2.5 revolutionizes image processing: Not only identifying objects, but also understanding abstract concepts and relationships

"Conversational Image Segmentation," a new feature of Google's Gemini 2.5 model, analyzes and highlights image content in response to natural language prompts. It goes beyond traditional image segmentation by supporting relationship queries, logic-based instructions, and abstract concepts. The feature has broad applications in image editing, workplace safety, and the insurance industry, and developers can access it through a convenient API.

AiBase Summary:

🧠 Capable of understanding and responding to more complex and semantically rich natural language instructions

🌐 Supports multilingual prompts and can provide object labels in other languages

🔧 Developers can access this functionality directly via the Gemini API, which returns results in JSON format
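
The summary notes that results come back as JSON. As a rough sketch of how a developer might consume such a response, the snippet below converts bounding boxes to pixel coordinates. The field names `box_2d` and `label`, the `[y0, x0, y1, x1]` ordering, and the 0-1000 coordinate normalization are assumptions about the response shape and are not given in this article. The sample response and the helper name `boxes_to_pixels` are made up for illustration.

```python
import json


def boxes_to_pixels(response_text: str, image_width: int, image_height: int) -> list:
    """Convert normalized boxes in a JSON response to pixel coordinates.

    Assumes each item looks like {"box_2d": [y0, x0, y1, x1], "label": "..."}
    with coordinates normalized to the range 0-1000 (an assumption, see above).
    """
    results = []
    for item in json.loads(response_text):
        y0, x0, y1, x1 = item["box_2d"]
        results.append({
            "label": item["label"],
            # Pixel box as (left, top, right, bottom).
            "box_px": (
                round(x0 * image_width / 1000),
                round(y0 * image_height / 1000),
                round(x1 * image_width / 1000),
                round(y1 * image_height / 1000),
            ),
        })
    return results


# Hypothetical response for a prompt such as "the person holding the umbrella".
sample = '[{"box_2d": [100, 250, 500, 750], "label": "person holding the umbrella"}]'
print(boxes_to_pixels(sample, image_width=800, image_height=600))
# prints [{'label': 'person holding the umbrella', 'box_px': (200, 60, 600, 300)}]
```

Check the field names and coordinate convention against the current API documentation before relying on them.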

9. Meta launches innovative model AU-Net, revolutionizing text processing

Meta's AU-Net model uses an autoregressive U-Net structure for flexible text processing. It learns directly from raw bytes and dynamically combines them into multi-level sequence representations, offering new ideas for the development of large language models.

AiBase Summary:

🚀 The AU-Net architecture dynamically combines bytes into multi-level sequence representations in an autoregressive manner.

📊 It uses contracting and expanding paths to integrate high-level semantic information with local details.

⏩ The autoregressive generation mechanism improves inference efficiency and keeps generated text coherent and accurate.

Details link: https://github.com/facebookresearch/lingua/tree/main/apps/aunet
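
To give a feel for what "combining bytes into multi-level sequence representations" can mean, here is a toy sketch that groups the raw bytes of a string into coarser levels. It only illustrates the idea of a byte hierarchy. The splitting rule (whitespace-delimited words, then pairs of words) and the function name `split_levels` are assumptions for this example and are not taken from the AU-Net code, which learns representations with a neural network rather than with fixed rules.

```python
import re


def split_levels(text: str):
    """Return three views of the same text: raw bytes, word chunks, word-pair chunks."""
    raw = text.encode("utf-8")  # level 0: the raw byte sequence
    # Level 1: one chunk per word, keeping surrounding whitespace so nothing is lost.
    words = [w.encode("utf-8") for w in re.findall(r"\s*\S+\s*", text)]
    # Level 2: merge neighbouring word chunks two at a time.
    pairs = [b"".join(words[i:i + 2]) for i in range(0, len(words), 2)]
    return raw, words, pairs


raw, words, pairs = split_levels("the cat sat")
print(len(raw))  # prints 11
print(words)     # prints [b'the ', b'cat ', b'sat']
print(pairs)     # prints [b'the cat ', b'sat']
```

Each level covers exactly the same bytes, just at a coarser granularity, so the sequence gets shorter as the level gets higher.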

10. Turmoil inside Apple's AI team: open-source plans blocked, and the company may turn to third-party large models

The blocking of an open-source plan has caused dissatisfaction inside Apple's AI team. Senior Vice President Federighi believes there are already enough open-source models on the market and that Apple's models underperform on-device. Meanwhile, Apple has delayed its Siri update and is considering working with third-party large models, signaling a strategic adjustment in its AI development.

AiBase Summary:

🍎 The Apple AI team's open-source plan was rejected by senior management over concerns about insufficient model performance.

⚙️ Apple insists on a device-first strategy, which limits the potential of its AI technology.

🤖 Apple may turn to third-party large models from companies such as OpenAI and Google to enhance Siri.

11. One-click generation of teaching animations! Fogsight AI revolutionizes educational presentations, turning abstract concepts into instant animations

Fogsight is an AI animation engine built on large language models that turns abstract concepts into intuitive, easy-to-understand animations. Given a keyword or phrase, it automatically generates a short animated film with bilingual narration and cinematic visual effects, suitable for classroom teaching, online courses, and popular-science content.

AiBase Summary:

🎥 One-click generation: Users enter keywords to generate narratively complete animations lasting 30 to 90 seconds.

🎨 Visual and fun: The animation has cinematic visual effects, enhancing learning interest.

🛠️ Interactive interface: Supports multi-round dialogue to adjust animation content, meeting personalized needs.

Details link: https://github.com/fogsightai/fogsight