On November 25, Tencent Hunyuan announced the open-source release of its new OCR model, HunyuanOCR. With only 1 billion (1B) parameters, the model is built on Hunyuan's native multimodal architecture and has achieved state-of-the-art (SOTA) results on multiple industry OCR leaderboards, offering a lightweight and efficient option for deploying OCR technology.

HunyuanOCR follows a fully end-to-end design, consisting of a native-resolution vision encoder, an adaptive visual adapter, and a lightweight Hunyuan language model. Its core advantage is efficiency and convenience: the model is compact and easy to deploy, and it produces its final output in a single forward pass, making it far more efficient than the cascaded pipelines common in the industry.
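For illustration, here is a minimal single-pass inference sketch in Python. It assumes the model is published under a Hugging Face repo id such as tencent/HunyuanOCR and exposes the standard transformers multimodal chat interface; the repo id, prompt, and exact processor behavior are assumptions, and the official model card is authoritative:

```python
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

MODEL_ID = "tencent/HunyuanOCR"  # hypothetical repo id; check the official release

# trust_remote_code pulls in the model's own architecture definition,
# which is typical for newly open-sourced multimodal models.
processor = AutoProcessor.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True
)

image = Image.open("document_page.png")
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "Extract all text from this document."},
    ]}
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)

# A single generate call yields the full transcription: no separate
# detection, recognition, and layout stages as in cascaded OCR pipelines.
output_ids = model.generate(**inputs, max_new_tokens=2048)
print(processor.decode(
    output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
))
```

Because detection, recognition, and layout analysis all happen inside one generate call, there are no intermediate boxes or crops to stitch together, which is what the comparison against cascaded solutions refers to.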

HunyuanOCR's benchmark results are strong. On OmniDocBench, a complex-document-parsing evaluation, it scored 94.1, surpassing leading models such as Google's Gemini 3 Pro. On an in-house benchmark covering nine scenarios, including documents, handwriting, and street scenes, its text detection and recognition significantly outperformed other open-source and commercial models. On the OCRBench leaderboard, it achieved SOTA among models under 3B parameters, with a total score of 860. In translation of less common languages, the model supports bidirectional translation between Chinese/English and 14 frequently used minority languages, and it took first place in the small-model track of the ICDAR 2025 end-to-end document translation competition.

In application scenarios, HunyuanOCR supports parsing of complex multilingual documents, extraction of ticket and receipt fields into JSON, and automatic extraction of bilingual subtitles from videos; it has already been applied to ID card processing, video creation, and cross-border communication. Users can try it through the web and mobile links, download it from the open-source repositories on GitHub and Hugging Face, or test it quickly in the Hugging Face Space.
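As a sketch of the ticket-field use case, the following assumes the same hypothetical transformers interface as above; the prompt wording and field names are illustrative, not an official schema from the model:

```python
import json

import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

MODEL_ID = "tencent/HunyuanOCR"  # hypothetical repo id; check the official release

processor = AutoProcessor.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True
)

# Illustrative field schema for a receipt; the fields actually supported
# depend on the prompt and the examples in the official model card.
TICKET_PROMPT = (
    'Read this receipt and return a JSON object with the keys "merchant", '
    '"date", "total_amount", and "currency". Return JSON only.'
)

def extract_ticket_fields(image_path: str) -> dict:
    """Single-pass structured extraction: prompt for JSON, then parse."""
    image = Image.open(image_path)
    messages = [{"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": TICKET_PROMPT},
    ]}]
    prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
    inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=512)
    raw = processor.decode(
        output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
    return json.loads(raw)  # raises if the model wraps the JSON in extra prose

print(extract_ticket_fields("receipt.jpg"))
```

Prompting for JSON and parsing the reply keeps the whole extraction in one inference call, matching the end-to-end design described above, though production use would want more robust handling of malformed output.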


