Top-tier artificial intelligence needs to understand not only modern code scrolling across a screen, but also inscriptions carved on tortoise shells three thousand years ago. According to OSCHINA, Tencent's Hunyuan large model team, the SSV Digital Culture Lab, and other institutions have jointly launched "Chronicles-OCR" together with several universities and the Palace Museum. It is the first industry benchmark for ancient Chinese characters that comprehensively covers the evolutionary trajectory of the "seven forms of Chinese characters".

To reflect the true recognition capabilities of large models, the dataset was cross-annotated at multiple levels by domain experts and contains 2,800 high-quality, strictly balanced images. For the older scripts (oracle bone, bronze, and seal), the team used fine-grained character-level annotations; for the more mature scripts (clerical, regular, running, and cursive), it used sequence-level transcriptions that preserve the original reading order.
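The two annotation granularities described above could be represented roughly as follows. This is a minimal sketch for illustration only: the class names, field names, script identifiers, and bounding-box layout are all assumptions, since the article does not specify the actual Chronicles-OCR file format.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Script groupings as described in the article; the identifier strings
# themselves are hypothetical.
CHAR_LEVEL_SCRIPTS = {"oracle_bone", "bronze", "seal"}
SEQUENCE_LEVEL_SCRIPTS = {"clerical", "regular", "running", "cursive"}


@dataclass
class CharAnnotation:
    """One labeled character in an image (hypothetical layout)."""
    char: str
    bbox: Tuple[int, int, int, int]  # assumed (x, y, width, height) in pixels


@dataclass
class CharLevelSample:
    """Sample for an older script, annotated character by character."""
    image_path: str
    script: str
    chars: List[CharAnnotation] = field(default_factory=list)


@dataclass
class SequenceLevelSample:
    """Sample for a mature script, transcribed as text lines."""
    image_path: str
    script: str
    lines: List[str] = field(default_factory=list)  # kept in original reading order


def annotation_level(script: str) -> str:
    """Return which annotation granularity applies to a given script."""
    if script in CHAR_LEVEL_SCRIPTS:
        return "character"
    if script in SEQUENCE_LEVEL_SCRIPTS:
        return "sequence"
    raise ValueError(f"unknown script: {script}")
```

Keeping the two sample types separate, rather than forcing both into one schema, mirrors the distinction the article draws: per-character labels where individual glyphs are the unit of study, and ordered transcriptions where running text is.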
Mainstream vision models all fall short
Based on this benchmark, the project team designed four progressive core tasks that strictly decouple the "visual perception" and "semantic reasoning" of large models. An evaluation of 28 mainstream multimodal large language models, including GPT-5, Gemini 3.1 Pro, and Claude Opus 4.7, produced surprising results.
Faced with ancient scripts that lack the layout priors of modern documents, mainstream large models failed completely on end-to-end detection tasks, and the highest accuracy for fine-grained recognition reached only 27.1%. Counterintuitively, the experiments showed that enabling a model's reasoning mode actually amplified perceptual uncertainty, further degrading recognition performance.
Shortcomings revealed in fine brushstroke recognition
The evaluation also found that, when classifying scripts, current large vision models tend to rely on the texture of the writing medium rather than distinguishing fine-grained brushstroke styles. This means that today's top AI models are still far from truly "understanding" ancient Chinese scripts.
Chinese characters have evolved from the oracle bones of the Yin Dynasty to the present day, and every stroke carries the continuity of a civilization. The open-source release of Chronicles-OCR does not shy away from this technical reality. By making the gaps visible, it gives future large vision models a clear optimization direction: moving from simply "reading characters" to deeply "reading history".

