Artificial intelligence has reached a new milestone in medicine. Baichuan Intelligent, in collaboration with a research team from Tsinghua University, has officially launched Baichuan-M4, a new-generation medically enhanced large model. Drawing on strong logical reasoning and a clinical knowledge base, the model performed exceptionally well in the authoritative HealthBench medical evaluation, ranking first in three HealthBench sub-leaderboards as well as in the Hard and Professional categories, and demonstrating medical capabilities that surpass GPT-5.5.
The core advance in Baichuan-M4 is a complete change in its interaction mode. Traditional AI-assisted diagnosis is typically passive, answering only the questions it is asked, whereas Baichuan-M4 simulates the diagnostic reasoning of a real doctor and asks questions proactively. It analyzes the information a patient provides in depth, identifies clinical clues on its own, and follows up with targeted questions. This improves the efficiency and accuracy of the diagnostic process and makes the AI a more reliable assistant in clinical decision-making.

To improve continuity across consultations, M4 introduces a "Full Course Memory" function. It integrates a patient's historical medical records, multi-round consultation records, laboratory indicator trends, and feedback on past medications, so that the model retains the complete medical history across conversations and the patient does not have to start over and repeat information. In long-context clinical memory tests, the model showed significant improvements over the previous generation.
For safety and accuracy in clinical use, Baichuan-M4 introduces "Evidence Anchoring" technology. Every medical conclusion the model generates can be matched to a specific paragraph in an authoritative medical paper or clinical guideline, rather than merely citing the publication as a whole. According to the reported data, its evidence-based citation accuracy reached 90.0, significantly outperforming other large models in the rigor of medical decision-making.



