Recently, Meta's FAIR team and researchers from the Hebrew University of Jerusalem jointly released a new study showing that shortening the reasoning chains of large language models can significantly improve their performance on complex reasoning tasks. According to the results, models using shorter reasoning chains improved accuracy by 34.5%, challenging a widespread assumption in the current AI industry.


Image source: AI-generated, licensed from MidJourney

In the study, the authors point out that longer reasoning chains do not necessarily yield better reasoning and may instead waste computing resources. Many companies have invested heavily in expanding compute capacity in the hope that AI could solve complex problems through more detailed step-by-step reasoning. This study shows, however, that shorter reasoning processes not only improve accuracy but also significantly reduce computing costs.

The research team proposed a new method called "short-m@k," which runs k reasoning attempts in parallel and halts all computation as soon as the first m attempts finish. The final answer is then selected by majority vote among these shorter reasoning chains. The results show that this method can cut computing resources by up to 40% while maintaining performance. For organizations deploying large AI inference systems, this offers an important reference point for achieving significant cost savings.
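To make the idea concrete, here is a minimal sketch of a short-m@k-style selection loop: launch k attempts in parallel, keep only the first m to finish (the shortest chains, assuming generation time tracks chain length), and majority-vote their answers. The `reason_once` function is a hypothetical stand-in for a real model call, not part of the paper's implementation.

```python
import random
from collections import Counter
from concurrent.futures import ThreadPoolExecutor, as_completed

def reason_once(seed: int) -> str:
    """Stand-in for one chain-of-thought attempt that returns a final answer."""
    rng = random.Random(seed)
    # Simulate a noisy reasoner that answers "42" most of the time.
    return "42" if rng.random() < 0.8 else "41"

def short_m_at_k(k: int = 8, m: int = 3) -> str:
    """Run k attempts in parallel; stop after the first m finish; majority-vote."""
    with ThreadPoolExecutor(max_workers=k) as pool:
        futures = [pool.submit(reason_once, seed) for seed in range(k)]
        answers = []
        for fut in as_completed(futures):
            answers.append(fut.result())
            if len(answers) == m:
                # Abandon the remaining, longer-running attempts to save compute.
                for f in futures:
                    f.cancel()
                break
    # Majority vote over the m fastest-finishing (i.e., shortest) chains.
    return Counter(answers).most_common(1)[0][0]
```

In a real deployment the early stop is what saves compute: the slowest (longest-thinking) attempts are never run to completion, yet the vote over the short chains tends to land on the correct answer.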

Additionally, the study found that training AI models on shorter reasoning examples further improves their performance. This contradicts the previous assumption that training on longer reasoning traces produces better models; in fact, shorter training examples yielded better results.

In the AI industry, companies often race to deploy ever more powerful models, which typically consume huge amounts of computing resources. These findings should prompt technical decision-makers to rethink how much test-time computation they allocate to LLMs (large language models): the research shows that prolonged "thinking" does not necessarily improve performance and may instead degrade it.

This result matters greatly to technology giants seeking to cut computing costs while improving performance. In an industry focused on scaling up, teaching AI to think more simply not only saves computing resources but can also make models smarter. Ultimately, artificial intelligence, too, benefits from the ancient wisdom of "not overthinking."

Key Points:  

🌟 Research finds that simplifying reasoning chains improves AI model accuracy by 34.5%.  

💡 The new method "short-m@k" can reduce computing costs by 40%.  

📈 Training with short reasoning instances further enhances AI performance, contradicting previous assumptions.