Welcome to "AI Daily", your daily guide to the world of artificial intelligence. Each day we bring developers the latest news from the AI field, covering technical trends and innovative AI product applications.
Discover fresh AI products: https://app.aibase.com/zh
1. Meituan Launches LongCat-Flash-Thinking, a New Large Reasoning Model
Meituan's LongCat-Flash-Thinking model combines a flexible architecture with strong performance across multiple domains, opening up new possibilities for AI application development.
【AiBase Summary:】
🧠 LongCat-Flash-Thinking is a large reasoning model built on a Mixture-of-Experts (MoE) architecture, with 560 billion total parameters, of which 18.6 to 31.3 billion are dynamically activated.
📊 It performs exceptionally well in tasks such as mathematical reasoning, general reasoning, and code generation, even achieving top accuracy in some tests.
🔧 The model weights are open-sourced, and a detailed chat template and a dedicated chat website are provided so developers can use and study the model easily.
More details: https://longcat.chat/
2. One Image Becomes Animation, Characters Swap Seamlessly: Alibaba Open-Sources Wan-Animate, Making Its Cutting-Edge AI Video Tech Free to Use
The open-source release of the Wan-Animate model marks a major breakthrough in AI video generation technology, bringing revolutionary changes to video creation with its dual-task processing capabilities and multi-modal fusion technology.
【AiBase Summary:】
🎭 Dual-task one-click solution: Wan-Animate can simultaneously solve the problems of character animation generation and character replacement. Users only need to provide one image and a reference video to generate high-precision animation videos.
💡 Multi-modal fusion driven: The model integrates skeletal signal control for body movement, facial implicit feature extraction, and Relighting LoRA module optimization for environmental lighting, improving lip synchronization accuracy and full-body action replication effects.
🚀 Broad application prospects: Wan-Animate has great potential in entertainment and commercial scenarios, such as music video creation, e-commerce advertising, or corporate training, and is expected to expand to support multi-character videos in the future.
More details: https://github.com/Wan-Video/Wan2.2
3. ByteDance Launches Doubao Translation Model: Translation Across 28 Languages, Comparable to GPT-4o
ByteDance's Volcano Engine has launched a general-purpose translation model, the Doubao Translation Model. It supports translation between any of 28 languages, and its performance reportedly matches or exceeds leading models such as GPT-4o and Gemini-2.5-Pro. It is also competitively priced: 1.20 yuan per million input characters and 3.60 yuan per million output characters.
【AiBase Summary:】
🤖 The Doubao Translation Model supports translation between any of 28 languages and is comparable to GPT-4o and Gemini-2.5-Pro.
💰 Pricing is highly competitive: 1.20 yuan per million input characters and 3.60 yuan per million output characters.
🔗 Detailed pricing information is available in the official Volcano Engine documentation.
More details: https://www.volcengine.com/docs/82379/1820188
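To make the quoted prices concrete, here is a minimal cost-estimate sketch. It assumes billing is strictly linear in character count with no minimum charge, free tier, or rounding rules, none of which are stated above; the function name `estimate_translation_cost` is hypothetical and not part of any Volcano Engine SDK.

```python
from decimal import Decimal

# Prices quoted in the item above, in yuan per one million characters.
INPUT_YUAN_PER_MILLION = Decimal("1.20")
OUTPUT_YUAN_PER_MILLION = Decimal("3.60")
MILLION = Decimal(1_000_000)


def estimate_translation_cost(input_chars: int, output_chars: int) -> Decimal:
    """Estimate the cost in yuan of one translation job.

    Assumption: cost scales linearly with characters (hypothetical billing
    model for illustration; check the official pricing page for the rules).
    """
    if input_chars < 0 or output_chars < 0:
        raise ValueError("character counts must be non-negative")
    cost = (
        Decimal(input_chars) * INPUT_YUAN_PER_MILLION
        + Decimal(output_chars) * OUTPUT_YUAN_PER_MILLION
    ) / MILLION
    # Keep four decimal places so small jobs do not round down to zero.
    return cost.quantize(Decimal("0.0001"))


if __name__ == "__main__":
    # Example job: 500,000 input characters and 600,000 output characters.
    print(estimate_translation_cost(500_000, 600_000))
```

Under these assumptions, output characters cost three times as much as input characters, so translations into more verbose target languages will cost somewhat more for the same source text.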
4. Huawei and Zhejiang University jointly launch DeepSeek-R1-Safe Large Model: Perfect Balance Between AI Safety and Performance
Huawei and Zhejiang University have jointly launched DeepSeek-R1-Safe, described as China's first foundation model built on an Ascend thousand-card computing platform. The model makes significant progress on both AI safety and performance, pointing to a new direction for the coordinated development of the AI industry ecosystem.
【AiBase Summary:】
🧠 DeepSeek-R1-Safe is built on the Ascend thousand-card computing platform and focuses on balancing AI safety with performance.
🛡️ The model performs well across multiple categories of harmful-content defense, with an overall defense success rate close to 100%.
🚀 On general capability benchmarks, DeepSeek-R1-Safe's performance loss is kept within 1%, achieving a balance between safety and performance.
5. Qwen3-Omni Release Imminent: An Upgraded Cross-Modal Model for On-Device Use
Qwen3-Omni is the latest cross-modal model from Alibaba Cloud's Qwen team and is expected to be officially released soon. A PR adding support for the model has been submitted to the Hugging Face Transformers library, a step toward open-source integration. Qwen3-Omni adopts a Thinker-Talker dual-track design that improves deployment efficiency on resource-constrained devices and suits real-time interaction scenarios.
【AiBase Summary:】
🔥 Qwen3-Omni is Alibaba Cloud's latest cross-modal model, aimed at enhancing multi-modal processing capabilities.
💡 The model uses a Thinker-Talker dual-track design to ensure efficient streaming processing, suitable for real-time interaction scenarios.
🚀 A PR adding Qwen3-Omni support has been submitted to the Hugging Face Transformers library, a step toward open-source integration.
6. xAI Releases Grok4Fast: 40% Less Computation, Single-Task Cost Drops by 98%!
xAI's Grok4Fast model sharply reduces both computation and operating costs while still scoring well on benchmarks, giving users an efficient and economical option.
【AiBase Summary:】
🧠 Grok4Fast reduces computational requirements by 40%, improving the efficiency of handling complex tasks.
💰 The cost of running a single task is reduced by 98%, offering opportunities to save enterprise expenses.
📊 It scores exceptionally well on the GPQA Diamond and AIME2025 benchmarks.
7. YouTube Launches New Tools and Features to Help Creators Achieve Greater Success
YouTube announced several new features and tools during its annual event, covering live streaming, monetization methods, and AI-assisted creation. These updates aim to improve content management efficiency and viewer engagement for creators.
【AiBase Summary:】
🎥 New Studio features: An Inspiration tab, title A/B testing, and likeness detection tools help creators manage their content.
🎮 Live streaming upgrade: Support for games, landscape and portrait live streaming, and AI-generated automatic highlights improves the live streaming experience.
💰 New monetization methods: Brand collaborations and shopping programs give creators more revenue opportunities.
8. IBM Launches Granite-Docling-258M Model, Bringing Breakthroughs in Document Conversion Technology
IBM has released Granite-Docling-258M, a lightweight vision-language model designed specifically for document processing. The model excels in recognition accuracy, multilingual support, and handling of document elements, preserving the original document layout and supporting various output formats.
【AiBase Summary:】
📄 Lightweight model: Granite-Docling-258M is designed for document conversion and has 258 million parameters.
🔍 High accuracy: The model significantly improves recognition accuracy compared to traditional OCR software.
🌍 Multilingual support: Currently supports Chinese, Arabic, and Japanese, with plans to expand to more languages in the future.
More details: https://huggingface.co/ibm-granite/granite-docling-258M
9. Chinese Academy of Sciences Launches SpikingBrain, a Brain-like Large Model: Achieving a Hundredfold Speed Increase with 2% Data
The brain-like large model SpikingBrain launched by the Chinese Academy of Sciences demonstrates amazing speed and efficiency when processing long texts. Its innovative architecture and algorithms have brought significant breakthroughs to the field of artificial intelligence.
【AiBase Summary:】
🧠 The SpikingBrain model uses a hybrid linear attention architecture, reducing computational complexity from quadratic to linear.
💡 The adaptive threshold spiking neuron mechanism significantly reduces energy consumption, achieving high computational sparsity.
🚀 The model processes long texts 100 times faster than mainstream models while using only 2% of the training data they require.
More details: https://github.com/BICLab/SpikingBrain-7B
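The "quadratic to linear" claim in the summary can be illustrated with a generic kernelized linear attention sketch. This is not SpikingBrain's code: the feature map (elu(x) + 1), the function names, and the pure-Python layout are all assumptions chosen for illustration. The point is that a fixed-size running state replaces the rescan of all earlier tokens, so work per token no longer grows with sequence length.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[Vector]


def feature_map(x: Vector) -> Vector:
    """Positive feature map phi(x) = elu(x) + 1 (an illustrative choice)."""
    return [v + 1.0 if v > 0 else math.exp(v) for v in x]


def quadratic_causal_attention(q: Matrix, k: Matrix, v: Matrix) -> Matrix:
    """Reference form: token i rescans tokens 0..i, so cost is O(n^2)."""
    d_v = len(v[0])
    out = []
    for i in range(len(q)):
        qi = feature_map(q[i])
        weights = [
            sum(a * b for a, b in zip(qi, feature_map(k[j])))
            for j in range(i + 1)
        ]
        total = sum(weights)
        out.append(
            [sum(weights[j] * v[j][c] for j in range(i + 1)) / total
             for c in range(d_v)]
        )
    return out


def linear_causal_attention(q: Matrix, k: Matrix, v: Matrix) -> Matrix:
    """Same output using a fixed-size running state, so cost is O(n)."""
    d_k, d_v = len(k[0]), len(v[0])
    state = [[0.0] * d_v for _ in range(d_k)]  # running sum of phi(k_j) v_j^T
    norm = [0.0] * d_k                         # running sum of phi(k_j)
    out = []
    for q_raw, k_raw, vj in zip(q, k, v):
        qi, kj = feature_map(q_raw), feature_map(k_raw)
        # Fold the current token into the state before reading it (causal).
        for r in range(d_k):
            norm[r] += kj[r]
            for c in range(d_v):
                state[r][c] += kj[r] * vj[c]
        denom = sum(qi[r] * norm[r] for r in range(d_k))
        out.append(
            [sum(qi[r] * state[r][c] for r in range(d_k)) / denom
             for c in range(d_v)]
        )
    return out


if __name__ == "__main__":
    q = [[0.1, -0.2], [0.3, 0.4], [-0.5, 0.6]]
    k = [[0.2, 0.1], [-0.3, 0.5], [0.4, -0.1]]
    v = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
    slow = quadratic_causal_attention(q, k, v)
    fast = linear_causal_attention(q, k, v)
    print(max(abs(a - b) for ra, rb in zip(slow, fast) for a, b in zip(ra, rb)))
```

Both functions compute the same values; only the order of operations differs. The linear version keeps memory constant in sequence length, which is why this family of architectures is attractive for very long inputs.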
10. OpenAI CEO Reveals New Compute-Intensive Feature, Limited to Pro Users
OpenAI CEO Sam Altman announced that the company will launch a series of new services requiring more computing resources in the coming weeks, initially available only to Pro subscribers and possibly involving additional fees. Despite this, Altman stated that OpenAI's goal is to reduce the cost of intelligent services, making them more widely accessible.
【AiBase Summary:】
🚀 OpenAI will launch compute-intensive new services, initially limited to Pro users.
💰 The new features may carry additional fees to cover their high computing costs.
💡 Altman emphasized that reducing the cost of intelligent services and increasing accessibility is a long-term goal.