Welcome to the "AI Daily" section, your daily guide to the world of artificial intelligence. Each issue presents the latest developments in the AI field, with a focus on developers, to help you follow technical trends and innovative AI product applications.
Fresh AI products, click to learn more: https://app.aibase.com/zh
1. Kling AI Launches New Digital Human Feature: Generate a High-Definition Video in 1 Minute from a Single Image
Kling AI's digital human feature turns static images into dynamic videos. Users only need to provide a character image plus text or audio input to quickly generate a high-quality video. The technology is built on multimodal understanding and video generation models, supports a variety of character types and multiple languages, and opens new possibilities for content creation, education and training, and corporate promotion.
AiBase Highlights:
📷 Kling AI launched a digital human feature that turns static images into dynamic videos.
🎙️ Supports multilingual processing, including Chinese, English, Japanese, Korean, etc.
💡 Lowers the barrier to video production, allowing ordinary users to easily create professional-level digital human videos.
Details link: https://klingavatar.github.io/
2. Tencent Hunyuan's New Technique Removes the "Oily" Look from Large Models, Making AI-Generated Images More Realistic!
The Tencent Hunyuan team, together with the Chinese University of Hong Kong (Shenzhen) and Tsinghua University, launched SRPO, a technique that aims to make AI-generated images more realistic and to address the unnatural, "oily" skin texture seen in images from Flux models. It introduces a "Semantic Relative Preference Optimization" strategy and uses the Direct-Align strategy to optimize the generation trajectory, significantly improving image quality and training efficiency.
AiBase Highlights:
🧪 Introduces the "Semantic Relative Preference Optimization" strategy, using positive and negative prompt words as guidance signals to neutralize the bias of the reward model.
📈 Uses the Direct-Align strategy, injecting controllable noise and using it as a reference anchor point for image reconstruction, significantly reducing reconstruction error.
⚡ SRPO is highly training-efficient, surpassing existing methods after just 10 minutes of training, with a threefold increase in realism and aesthetic scores.
Details link: https://tencent.github.io/srpo-project-page/
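As a rough illustration of the two ideas in the highlights above, here is a minimal sketch that uses toy numbers instead of a real diffusion model or reward model. The function names, the linear noising formula, the control words, and the toy reward are all assumptions made for illustration; none of them are taken from the SRPO code.

```python
import random


def inject_noise(x0, noise, sigma):
    """Blend a clean sample with known noise: x_t = (1 - sigma) * x0 + sigma * noise.

    The linear blend is an assumed noising schedule, used here only for illustration.
    """
    return [(1.0 - sigma) * a + sigma * n for a, n in zip(x0, noise)]


def recover_clean(xt, noise, sigma):
    """Invert inject_noise, using the known injected noise as the reference anchor."""
    return [(v - sigma * n) / (1.0 - sigma) for v, n in zip(xt, noise)]


def relative_reward(reward_fn, image, prompt, positive_word, negative_word):
    """Score an image as the difference between a positive and a negative prompt.

    Any bias the reward model adds regardless of the prompt cancels in the difference.
    """
    positive = reward_fn(image, f"{positive_word} {prompt}")
    negative = reward_fn(image, f"{negative_word} {prompt}")
    return positive - negative


def toy_reward(image, prompt):
    """Hypothetical stand-in for a learned reward model with a constant bias."""
    bias = 5.0
    signal = sum(image) / len(image)
    words = prompt.split()
    if "realistic" in words:
        return bias + signal
    if "oily" in words:
        return bias - signal
    return bias


rng = random.Random(0)
clean = [0.2, -0.4, 0.9]
noise = [rng.gauss(0.0, 1.0) for _ in clean]
noisy = inject_noise(clean, noise, sigma=0.3)
restored = recover_clean(noisy, noise, sigma=0.3)

score = relative_reward(toy_reward, [0.5, 0.5], "portrait photo", "realistic", "oily")
```

In this toy setup the constant bias of 5.0 never reaches `score`, which is the point of scoring relative to a negative prompt, and `restored` matches `clean` up to floating-point error because the injected noise is known exactly.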
3. IBM Launches Granite-Docling-258M: Open Source Enterprise-Level Document AI Model
IBM's Granite-Docling-258M is an open-source vision-language model that focuses on end-to-end document conversion. It can maintain the layout information of documents, extract elements such as tables, code, and formulas, and output structured machine-readable formats, showing significant improvements over traditional OCR technologies.
AiBase Highlights:
🌟 The new model Granite-Docling-258M aims to improve the accuracy of document conversion and maintain layout information.
🔧 Uses an updated architecture and improves on its predecessor, SmolDocling, in multiple areas.
🌍 Adds support for multiple languages, enhancing the model's application range and flexibility.
Details link: https://huggingface.co/collections/ibm-granite/granite-docling-682b8c766a565487bcb3ca00
4. Meta Launches Its First Ray-Ban AI Glasses with a Screen: A Smart Assistant You Can Wear
Meta launched its first Ray-Ban AI glasses with a built-in screen, aiming to provide a more convenient smart experience. Paired with a neural wristband for precise control, the glasses further reduce dependence on mobile devices.
AiBase Highlights:
📱 The right lens has an internal display that can show applications, reminders, and navigation information.
🧠 Used in conjunction with a neural wristband, it achieves precise control through electromyography technology.
🌐 Supports connection to the cloud, allowing users to use Meta's applications and view routes and real-time translation on the glasses.
5. DeepSeek Paper Appears on Nature Cover, AI Large Model Passes Peer Review for the First Time
The DeepSeek-R1 research paper appeared on the cover of "Nature," marking the first time a large language model has passed authoritative peer review and setting a new academic standard for the AI industry. The model improves itself through reinforcement learning, which strengthens its reasoning capabilities and lets it perform well in math competitions.
AiBase Highlights:
🧠 DeepSeek-R1 develops complex reasoning capabilities through reinforcement learning in a self-contained training environment.
📊 On the AIME 2024 math benchmark, DeepSeek-R1-Zero's pass@1 accuracy rose from 15.6% to 71.0% over the course of reinforcement learning, reaching a level comparable to OpenAI models.
🛠️ The DeepSeek team adopted a multi-stage training framework combining rejection sampling and supervised fine-tuning, enhancing the model's writing ability and overall performance.
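To make the rejection sampling step mentioned above more concrete, here is a minimal sketch of the general idea: sample several candidate answers per prompt, keep only those a checker accepts, and collect the survivors as supervised fine-tuning data. This is not DeepSeek's pipeline; the toy generator, the checker, and all function names are hypothetical.

```python
import random


def rejection_sample(generate, is_correct, prompt, num_candidates, rng):
    """Draw several candidate answers and keep only those the checker accepts."""
    candidates = [generate(prompt, rng) for _ in range(num_candidates)]
    return [c for c in candidates if is_correct(prompt, c)]


def build_sft_dataset(prompts, generate, is_correct, num_candidates=8, seed=0):
    """Collect accepted (prompt, response) pairs for supervised fine-tuning."""
    rng = random.Random(seed)
    dataset = []
    for prompt in prompts:
        accepted = rejection_sample(generate, is_correct, prompt, num_candidates, rng)
        for answer in accepted:
            dataset.append({"prompt": prompt, "response": answer})
    return dataset


def toy_generate(prompt, rng):
    """Hypothetical noisy 'model': adds two numbers but is sometimes off by one."""
    a, b = map(int, prompt.split("+"))
    return str(a + b + rng.choice([-1, 0, 0, 1]))


def toy_is_correct(prompt, answer):
    """Hypothetical verifier: checks the answer against the exact sum."""
    a, b = map(int, prompt.split("+"))
    return int(answer) == a + b


sft_data = build_sft_dataset(["2+3", "10+7"], toy_generate, toy_is_correct)
```

Every entry in `sft_data` has passed the checker, so a later fine-tuning stage would only see verified responses. In a real system the checker could be a rule-based verifier or a reward model, which is a design choice this sketch leaves open.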
6. OpenAI Announces New GPT-5 Thinking Adjustment Feature on ChatGPT Webpage
OpenAI launched a new "Thinking Adjustment" feature that lets users choose how long the GPT-5 model thinks, balancing response speed against intelligence according to their needs. Additionally, OpenAI is actively developing a children's version of ChatGPT to ensure the safety of minors.
AiBase Highlights:
🌟 New feature launched: ChatGPT webpage introduces a function to adjust thinking duration, improving user experience.
🛠️ Multiple modes available: Users can choose standard, extended, lightweight, or heavy mode to meet different communication needs.
👶 Children's version development: OpenAI is developing a children's version of ChatGPT to ensure the safety of minors during use.
7. Douyin Launches "AI Truth" Function, Helping You Distinguish Rumors and Find the Truth!
Douyin launched the "AI Truth" function, which aims to help users distinguish rumors from facts, improving information transparency and user protection.
AiBase Highlights:
🧠 Douyin's "AI Truth" function is now online, helping users identify and clarify misleading information.
🔍 Users can click the link to jump to the "Truth Card" page for complete information.
📢 The platform improves information transparency through a large model for rumor governance and a fact-checking team.
8. Tongyi DeepResearch Launched! Fully Open-Source AI Model Makes Research Simpler
The fully open-source AI model released by the Tongyi DeepResearch team performed well in multiple authoritative benchmark tests, even outperforming many well-known international models, while promoting the development of AI research through an open approach.
AiBase Highlights:
🧠 The Tongyi DeepResearch team released a fully open-source AI model, enabling AI to move from "chatting" to "researching."
🚀 Achieved leading results in multiple authoritative benchmark tests, surpassing many well-known international models.
🌐 The model, framework, and solutions are completely open source, setting an example of open collaboration for the global tech community.