Google's AI research team recently introduced DS STAR (Data Science Agent through Iterative Planning and Validation), a multi-agent framework designed to turn ambiguous business questions into executable Python code without requiring a human analyst. Unlike traditional data science agents that rely on structured SQL databases, DS STAR can directly process mixed-format data files such as CSV, JSON, Markdown, and unstructured text.

The workflow of DS STAR is divided into several stages. First, the system analyzes each file in the data lake using an agent called Aanalyzer, generating Python scripts to extract key information such as column names, data types, and metadata. This step ensures that the system can obtain a structured view of each file, providing contextual information for subsequent analysis.
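To make the file-analysis stage more concrete, here is a minimal sketch of the kind of summary script such an agent might generate. It is not taken from the paper: the function names `describe_file` and `infer_type` and the shape of the returned dictionary are assumptions chosen for illustration, and only the Python standard library is used.

```python
import csv
import json
from pathlib import Path


def infer_type(values):
    """Guess a column type from its string values: int, float, or str."""
    if not values:
        return "unknown"
    for caster, name in ((int, "int"), (float, "float")):
        try:
            for value in values:
                caster(value)
            return name
        except (ValueError, TypeError):
            continue
    return "str"


def describe_file(path):
    """Return a small structured summary of one file (hypothetical format)."""
    path = Path(path)
    suffix = path.suffix.lower()

    if suffix == ".csv":
        with path.open(newline="", encoding="utf-8") as f:
            reader = csv.DictReader(f)
            rows = list(reader)
            columns = reader.fieldnames or []
        dtypes = {c: infer_type([r[c] for r in rows]) for c in columns}
        return {"file": path.name, "kind": "csv", "rows": len(rows), "columns": dtypes}

    if suffix == ".json":
        data = json.loads(path.read_text(encoding="utf-8"))
        if isinstance(data, dict):
            keys = sorted(data)
        elif isinstance(data, list) and data and isinstance(data[0], dict):
            keys = sorted(data[0])
        else:
            keys = []
        return {"file": path.name, "kind": "json", "top_level": type(data).__name__, "keys": keys}

    # Markdown and other unstructured text: report size and a short preview.
    lines = path.read_text(encoding="utf-8").splitlines()
    return {"file": path.name, "kind": "text", "lines": len(lines), "preview": lines[:3]}
```

Running `describe_file` over every file in a directory would yield one compact description per file, which is the sort of context the later planning stages can draw on.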

After completing the file analysis, DS STAR enters an iterative planning and validation loop involving four agents: Aplanner, Acoder, Averifier, and Arouter. Aplanner creates the initial executable steps, and Acoder converts them into Python code, which is executed to obtain observations. Averifier judges whether the current plan is sufficient based on the execution results; if it is not, Arouter decides how to revise the plan. The loop continues until the result meets the requirements or the maximum number of iterations is reached.
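The control flow of this loop can be sketched in a few lines. The version below is only an illustration of the plan, code, execute, verify, revise cycle described above: the agents are passed in as plain callables, the toy stand-ins replace what would be LLM calls, and the function name `run_planning_loop` and its return format are assumptions rather than the paper's interface.

```python
def run_planning_loop(question, planner, coder, execute, verifier, router, max_rounds=5):
    """Iterate plan -> code -> execute -> verify until sufficient or out of rounds."""
    plan = planner(question)  # initial executable steps
    observation = None
    for round_no in range(1, max_rounds + 1):
        code = coder(plan)            # turn the current plan into code
        observation = execute(code)   # run it and capture the result
        if verifier(question, plan, observation):
            return {"plan": plan, "observation": observation,
                    "rounds": round_no, "sufficient": True}
        if round_no < max_rounds:
            plan = router(plan, observation)  # revise the plan for the next round
    return {"plan": plan, "observation": observation,
            "rounds": max_rounds, "sufficient": False}


# Toy stand-ins (hypothetical) so the loop can be run without any model.
def toy_planner(question):
    return ["load the sales file"]


def toy_coder(plan):
    return "\n".join(f"# step {i}: {step}" for i, step in enumerate(plan, 1))


def toy_execute(code):
    return f"ran {len(code.splitlines())} step(s)"


def toy_verifier(question, plan, observation):
    return len(plan) >= 2  # pretend a one-step plan is never enough


def toy_router(plan, observation):
    return plan + ["compute the average order value"]


outcome = run_planning_loop(
    "What is the average order value?",
    toy_planner, toy_coder, toy_execute, toy_verifier, toy_router,
)
print(outcome["rounds"], outcome["sufficient"])
```

Keeping the agents as injected callables makes the stopping rule easy to see: the loop exits either when the verifier accepts the result or when `max_rounds` is exhausted.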

Additionally, DS STAR is equipped with Adebugger and Retriever modules to make the system more robust. Adebugger repairs scripts when they fail, so the system keeps working even in the face of schema drift and missing columns. The Retriever selects the most relevant files from large data sets to provide context during the analysis.
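The repair behavior can be pictured as a retry loop around script execution. The sketch below is an assumption-laden illustration, not the paper's implementation: `run_with_repair` is a hypothetical helper, the `repair` callable stands in for the debugging agent, and the toy repair simply renames a missing column.

```python
import traceback


def run_with_repair(script, repair, max_attempts=3):
    """Execute a script; on failure, hand the traceback to `repair` and retry.

    Returns (result, attempts). The script is expected to assign `result`.
    Note: exec runs arbitrary code, so a real system would sandbox this step.
    """
    for attempt in range(1, max_attempts + 1):
        namespace = {}
        try:
            exec(script, namespace)
            return namespace.get("result"), attempt
        except Exception:
            error = traceback.format_exc()
            script = repair(script, error)  # ask the debugger for a fixed script
    raise RuntimeError(f"script still failing after {max_attempts} attempts")


# Hypothetical failure: the script expects a "revenue" column that was renamed.
broken_script = 'row = {"sales": 10}\nresult = row["revenue"] * 2\n'


def toy_repair(script, error):
    """Stand-in for the debugging agent: patch a known missing-column error."""
    if "KeyError" in error:
        return script.replace('row["revenue"]', 'row["sales"]')
    return script


result, attempts = run_with_repair(broken_script, toy_repair)
print(result, attempts)
```

Passing the full traceback to the repair step matters because the error message is what identifies the missing column in the first place.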

In benchmark evaluations, DS STAR performed strongly on DABStep, KramaBench, and DA-Code, significantly improving analysis accuracy. The results suggest that DS STAR can turn complex data science problems into reliable Python solutions, advancing the automation of data analysis.

Paper: https://arxiv.org/pdf/2509.21825

Key Points:   

🌟 DS STAR is a multi-agent framework that can convert ambiguous business problems into executable Python code.   

📊 The system completes an iterative process of data analysis, code generation, and result verification through the collaboration of multiple agents.   

🚀 In benchmark tests, DS STAR significantly improved the analytical accuracy of data science tasks, demonstrating strong automation capabilities.