Xiaomi Open-Sources Its Latest Multimodal Large Model, Xiaomi MiMo-VL-7B-2508
Xiaomi's large model team has announced the open-source release of its latest multimodal large model, Xiaomi MiMo-VL-7B-2508, which comes in two versions: RL and SFT. According to official figures, the new model sets new records in four core capabilities: disciplinary reasoning, document understanding, GUI grounding, and video understanding. Its MMMU score passes the 70-point mark for the first time, ChartQA rises to 94.4, ScreenSpot-v2 reaches 92.5, and VideoMME improves to 70.8.