Recently, a research team from Renmin University of China, the Shanghai Artificial Intelligence Laboratory, University College London, and Dalian University of Technology reported an important finding about how large language models reason: while a model is thinking, the "thinking words" it emits coincide with sharp increases in the information carried by its internal representations. The work offers a new, information-theoretic lens for understanding the reasoning mechanisms of artificial intelligence.
You may have noticed that large models, when answering questions, produce human-sounding interjections such as "Hmm...", "Let me think...", or "Therefore...". Are these "thinking words" mere surface decoration, or do they reflect a genuine reasoning process? The question has puzzled many researchers. The new study suggests these words are not simply imitations of human speech: they mark "information peaks," moments at which the model's internal state suddenly carries much more information about the answer.
The research team tracked a variety of large models and measured how mutual information changed over the course of their reasoning. The results showed that mutual information rose sharply at certain moments, forming distinct "mutual information peaks." At these critical moments, the model's internal representations carry key information pointing toward the correct answer. The phenomenon is especially pronounced in models that have undergone reasoning-enhanced training, whereas the mutual information curves of non-reasoning models remain comparatively flat.
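The article does not spell out the estimator the authors use, but the core measurement, a per-step mutual information curve with detectable spikes, can be sketched. The snippet below is a minimal illustration rather than the paper's code: it computes a simple kNN-based mutual information proxy between each reasoning step's hidden states and the discrete answer label across a batch of questions, then locates peaks with `scipy.signal.find_peaks`. The arrays `hidden_states` and `answer_ids` are random placeholders standing in for quantities extracted from a real model.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif
from scipy.signal import find_peaks

rng = np.random.default_rng(0)

# Placeholder data: hidden states for N questions, T reasoning steps, d dims,
# plus the discrete id of each question's correct answer. In practice these
# would be collected from a real model's forward passes.
N, T, d = 256, 40, 64
hidden_states = rng.normal(size=(N, T, d))
answer_ids = rng.integers(0, 10, size=N)

# Per-step MI proxy: average the per-dimension kNN mutual information
# estimates between the step-t hidden state and the answer label.
mi_curve = np.array([
    mutual_info_classif(hidden_states[:, t, :], answer_ids, random_state=0).mean()
    for t in range(T)
])

# "Mutual information peaks": steps where the proxy rises well above its
# neighbours. With random placeholder data the curve is flat; with real
# model activations the reported spikes would show up here.
peaks, _ = find_peaks(mi_curve, prominence=mi_curve.std())
print("steps with MI peaks:", peaks)
```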
More interestingly, when the researchers decoded the representations at these mutual-information peaks into human-readable language, they found that the peak moments corresponded precisely to frequently occurring "thinking words." During complex reasoning, for example, the model often outputs expressions such as "Let me think" or "So I must...". These "thinking words" are not optional embellishments but key landmarks in the model's reasoning process, driving its thinking forward.
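One standard way to "translate" an internal representation into words like these is a logit-lens-style projection: push the hidden state at the peak step through the model's output head and read off the top vocabulary tokens. Whether the authors use exactly this decoding is an assumption; the sketch below uses a Hugging Face causal LM (GPT-2 as a stand-in) and a placeholder peak index purely for illustration.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in; any causal LM with an LM head works
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, output_hidden_states=True)
model.eval()

prompt = "Question: what is 17 * 24? Let me think step by step."
inputs = tok(prompt, return_tensors="pt")

with torch.no_grad():
    out = model(**inputs)

# Hidden state at the step of interest (e.g. a detected MI peak).
peak_step = -1  # placeholder index; in practice, a step flagged by the MI curve
h = out.hidden_states[-1][0, peak_step]  # shape: (hidden_dim,)

# Logit-lens projection: map the hidden state through the LM head and
# read the highest-scoring vocabulary tokens.
logits = model.lm_head(h)
top = torch.topk(logits, k=5).indices
print([tok.decode(i) for i in top.tolist()])
```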
Based on this discovery, the researchers proposed two methods that enhance the reasoning ability of large models without any additional training. In other words, future AI systems could noticeably improve their reasoning performance simply by exploiting these information peaks, while keeping their existing knowledge intact. The study not only advances theoretical research on large models but also offers new ideas for practical applications.