On December 13, at the Second CCF China Data Conference, Ant Digital announced the open-sourcing of its key data agent technology, Agentar SQL, including all papers, code, models, and user guides. This intelligent agent technology enables non-professionals to perform business data queries and analysis through everyday language, providing a more accurate and usable intelligent data analysis foundation for enterprise digitalization.

Ant Digital's first open-source release is a real-time text-to-structured query language (Text-to-SQL) framework, offering developers an out-of-the-box data query solution that improves the efficiency of text-to-database query interactions. In 2026, Ant Digital will gradually open-source frameworks for database understanding and mining, industry knowledge mining, and real-time multi-turn interaction, covering the entire data capability chain from intent understanding through business understanding to data understanding.

Reporters learned that during a pilot operation at a major city commercial bank, the average query accuracy of several tools in Ant Digital's Agentar SQL exceeded 92%, more than three times higher than traditional query solutions.

On September 25 this year, Agentar-Scale-SQL, Ant Digital's data analysis agent built on this technology, topped BIRD-SQL, the world's most authoritative evaluation benchmark for natural language to structured query language (NL2SQL), surpassing many domestic and international vendors, including Google. The agent still holds the top position in both the accuracy ranking and the execution efficiency ranking, and has led for more than two consecutive months.

BIRD-SQL requires AI models to convert natural language queries into SQL that executes reliably against real, complex, large-scale production databases. Its dataset covers 37 real industry scenarios, including finance, power, and healthcare, and comprises 33 GB of data and more than 10,000 high-complexity query tasks, making it the most challenging NL2SQL test globally.
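
To make the idea of execution-based evaluation concrete, the following minimal sketch checks a predicted query by running it and comparing its rows with those of a reference query. It is an illustration only: the toy `loans` table, its data, and the helper name `execution_match` are assumptions made for this example and are not taken from BIRD-SQL or Agentar SQL.

```python
import sqlite3

def execution_match(conn, predicted_sql, reference_sql):
    """Return True if both queries yield the same rows, ignoring row order."""
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        return False  # a query that fails to execute counts as wrong
    reference_rows = conn.execute(reference_sql).fetchall()
    # Sort by repr so rows with mixed types can still be ordered and compared.
    return sorted(predicted_rows, key=repr) == sorted(reference_rows, key=repr)

# Hypothetical toy database, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE loans (id INTEGER, branch TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO loans VALUES (?, ?, ?)",
    [(1, "North", 120.0), (2, "North", 80.0), (3, "South", 50.0)],
)

reference = "SELECT branch, SUM(amount) FROM loans GROUP BY branch"
good = "SELECT branch, TOTAL(amount) FROM loans GROUP BY branch ORDER BY branch DESC"
bad = "SELECT branch, COUNT(*) FROM loans GROUP BY branch"

print(execution_match(conn, good, reference))
print(execution_match(conn, bad, reference))
```

The two candidate queries are written differently from the reference, which is the point of comparing results rather than SQL text: a query is judged by what it returns, not by how it is phrased.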

Research institutions predict that the global business intelligence market will reach 47.48 billion in 2025, and that the Chinese business intelligence and analytics software market will reach 1.2 billion in the same year. By 2028, the Chinese business intelligence software market is estimated to reach 1.79 billion USD, with a projected five-year compound annual growth rate (CAGR) of 12.7%, making it an important and necessary area of investment for enterprises building intelligent technology.

Currently, Chinese enterprises differ significantly in how they use business intelligence and analytics products, with most focusing on data visualization and simple analysis needs such as reports, dashboards, and large data screens. Improving usability in real production environments while maintaining accuracy is considered a common challenge for the large-scale deployment of NL2SQL in the industry.

Zhang Peng, head of AI technology at Ant Digital, pointed out at the conference that NL2SQL faces four serious challenges in practice: understanding ambiguous and polysemous natural language, injecting vast amounts of industry-specific knowledge, parsing complex database structures and relationships, and generating accurate, error-free complex SQL statements. These challenges mean that simply "wrapping" a model is far from sufficient to meet the reliability and accuracy requirements of enterprise-level applications.

For example, professionals in the financial sector often need to combine complex business rules and multiple conditions to analyze product data effectively. In business management, colloquial questions from users who are not professional data analysts require the underlying product to correctly understand industry terminology and the user's intent, then accurately match database fields to produce correct, trustworthy results.
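
As a rough illustration of what matching colloquial wording to database fields involves, the sketch below looks up business phrases in a small glossary. Everything in it is hypothetical: the glossary entries, the table and column names, and the function `match_fields` are invented for this example, and plain substring matching stands in for the far richer understanding a production system would need.

```python
# Hypothetical glossary: business phrase -> (table, column). All names are invented.
GLOSSARY = {
    "overdue balance": ("loans", "overdue_amount"),
    "bad loans": ("loans", "npl_flag"),
    "branch": ("branches", "branch_name"),
}

def match_fields(question, glossary=GLOSSARY):
    """Return the schema fields whose business phrase appears in the question."""
    text = question.lower()
    return sorted({field for phrase, field in glossary.items() if phrase in text})

fields = match_fields("Which branch has the highest overdue balance?")
print(fields)
```

Even this toy version shows why industry knowledge matters: a phrase that is missing from the glossary simply produces no field, and the question cannot be answered correctly.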

Zhang Peng emphasized that BIRD-SQL mainly evaluates the ability to generate complex SQL (Online Scaling), but that truly industry-ready NL2SQL or data agent technology requires a more complete capability stack. In addition to online scaling, it also needs:

1. Offline Scaling: Deep understanding of the database and structuring of knowledge.

2. Human Interaction: The agent identifies its own uncertainty and proactively clarifies user intent, achieving transparent collaboration and error correction.

3. Self Evolution: Through "no-tuning" techniques such as "memory" optimization and the creation and reuse of tools (such as user-defined functions, or UDFs), the agent learns from errors, improves continuously, and reduces its reliance on large amounts of labeled data and expert tuning.
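
One way to picture the Human Interaction capability is an agent that answers only when its candidate interpretations agree and otherwise asks the user to choose. The sketch below is an assumption-laden illustration, not Ant Digital's method: the agreement heuristic, the `min_agreement` threshold, and the function name `decide` are all invented here.

```python
from collections import Counter

def decide(candidate_results, min_agreement=0.6):
    """Answer if enough candidate results agree; otherwise ask for clarification.

    candidate_results: hashable results obtained by executing several candidate
    SQL queries for the same question (how they are produced is out of scope).
    """
    if not candidate_results:
        return ("clarify", [])
    counts = Counter(candidate_results)
    top_result, top_count = counts.most_common(1)[0]
    if top_count / len(candidate_results) >= min_agreement:
        return ("answer", top_result)
    # Low agreement signals uncertainty: surface the distinct options to the user.
    return ("clarify", sorted(counts, key=repr))

confident = decide([200.0, 200.0, 200.0, 50.0])
uncertain = decide([200.0, 50.0, 2])
print(confident)
print(uncertain)
```

In the second call no result reaches the threshold, so the sketch returns the competing options instead of guessing, which mirrors the idea of clarifying intent before committing to an answer.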

Ant Digital plans to gradually open-source these more comprehensive capability modules, such as Agentar Profiling-SQL for understanding databases, and Agentar TuningFree-SQL for no-tuning evolution. The initial online scaling framework Agentar-Scale-SQL has been released on platforms such as arXiv, GitHub, ModelScope, and Hugging Face, and has quickly attracted the attention of developers.