StepFun AI Releases Step-Audio-R1, a New Audio Large Language Model with Significantly Improved Audio Reasoning
The StepFun AI team has released Step-Audio-R1, an audio large language model that addresses the loss of accuracy audio AI models show on long reasoning chains by optimizing how computing resources are used. According to the research team, the problem stems from over-reliance on text data during training, which leads a model to reason as if it were reading text rather than actually listening to sound.