On Thursday, the Laude Institute announced the launch of its first "Slingshot" artificial intelligence funding program, aimed at "advancing the science and practice of artificial intelligence." The program is designed to accelerate AI research by giving researchers resources that traditional academic institutions typically cannot match, including funding, computing power, and product and engineering support. In return, recipients are expected to produce tangible outcomes such as startups, open-source projects, or other research artifacts.
The first round selected 15 projects, all focused on one of the hardest problems in AI today: evaluation. Several of the projects are already well known in the industry, including Terminal Bench, a benchmark for command-line coding, and the latest version of ARC-AGI, a long-running effort to evaluate progress toward artificial general intelligence (AGI).

At the same time, several teams are attacking the evaluation bottleneck from new angles. Formula Code, developed by researchers at Caltech and the University of Texas at Austin, measures how well AI agents perform when optimizing existing code (a rough sketch of what such a benchmark might measure appears below). BizBench, from a Columbia University team, aims to build a comprehensive benchmark for "white-collar AI agents," focusing on how models actually perform on business and decision-making tasks. Other projects are exploring new approaches to reinforcement learning and model compression in pursuit of more general, scalable evaluation frameworks.
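Formula Code's methodology is not described in detail in the announcement, so the following is only a minimal sketch of what a code-optimization benchmark could measure: time a baseline implementation against an agent's rewrite, gate on correctness, and report the speedup. Every name here (baseline_sort, agent_sort, the toy workload) is hypothetical, not part of the actual benchmark.

```python
import random
import time

def baseline_sort(data):
    """Hypothetical baseline: an intentionally naive O(n^2) exchange sort."""
    out = list(data)
    for i in range(len(out)):
        for j in range(i + 1, len(out)):
            if out[j] < out[i]:
                out[i], out[j] = out[j], out[i]
    return out

def agent_sort(data):
    """Stand-in for the agent's optimized rewrite (here, just the built-in sort)."""
    return sorted(data)

def best_time(fn, workload, trials=5):
    """Best-of-N wall-clock timing, a common way to reduce measurement noise."""
    best = float("inf")
    for _ in range(trials):
        start = time.perf_counter()
        fn(workload)
        best = min(best, time.perf_counter() - start)
    return best

if __name__ == "__main__":
    workload = [random.random() for _ in range(2000)]
    # Correctness gate first: an "optimization" that changes behavior scores zero.
    assert agent_sort(workload) == baseline_sort(workload)
    speedup = best_time(baseline_sort, workload) / best_time(agent_sort, workload)
    print(f"speedup over baseline: {speedup:.1f}x")
```

The correctness check before timing reflects a design choice most optimization benchmarks would need: speed gains only count if the rewritten code still produces the same output.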
Notably, John Boda Yang, a co-creator of SWE-Bench, has also joined this round of the program and will lead a new project, CodeClash. Building on SWE-Bench's success, the project plans to assess AI coding ability through a dynamic, competitive mechanism; a rough sketch of the idea follows the quote below.
Yang said in an interview with TechCrunch: "I believe that open, continuous evaluation on core third-party benchmarks is key to driving progress across the entire industry. But I am also concerned that if future evaluation systems are monopolized by individual companies, it could weaken the openness and comparability of research."
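CodeClash's actual design has not been published here. As a loose illustration of what "dynamic, competitive" evaluation can mean, the sketch below pits models' submissions against each other head-to-head and updates Elo-style ratings, rather than scoring against a fixed test set. The play_match function is a placeholder returning random results, and the Elo update is the standard formula; neither is confirmed as CodeClash's approach.

```python
import itertools
import random

K = 32  # Elo update step size

def expected(r_a, r_b):
    """Expected score of player A against B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def play_match(model_a, model_b):
    """Placeholder for running two models' programs against each other.
    Returns 1.0 if A wins, 0.0 if B wins, 0.5 for a draw."""
    return random.choice([1.0, 0.5, 0.0])

def tournament(models, rounds=100):
    """Round-robin tournament: every pair plays each round, ratings update Elo-style."""
    ratings = {m: 1000.0 for m in models}
    for _ in range(rounds):
        for a, b in itertools.combinations(models, 2):
            score_a = play_match(a, b)
            exp_a = expected(ratings[a], ratings[b])
            ratings[a] += K * (score_a - exp_a)
            ratings[b] += K * ((1.0 - score_a) - (1.0 - exp_a))
    return ratings

if __name__ == "__main__":
    final = tournament(["model-a", "model-b", "model-c"])
    for name, rating in sorted(final.items(), key=lambda kv: -kv[1]):
        print(f"{name}: {rating:.0f}")
```

One appeal of a competitive setup like this is that it resists saturation: unlike a static benchmark, the bar rises automatically as the pool of opponents improves.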
Through the "Slingshot" program, the Loud Institute is attempting to build a new bridge between academia and industry, allowing cutting-edge AI research to be transformed into practical applications more quickly. This initiative is seen as an important step in reshaping the evaluation system in the current AI field.