On May 13, Kunlun AI Group announced the open-source release of Matrix-Game, a large model for interactive world generation. Matrix-Game is the first application of the Matrix series to interactive world generation and, according to the company, the first industrial-grade open-source spatial intelligence model at the 10B+ scale, designed for high-quality generation and precise control in open environments. The company says the release pushes the technical boundaries of interactive world generation and sets a new benchmark for building general-purpose virtual world foundations.

Matrix-Game consists of three core components: the Matrix-Game-MC dataset, the Matrix-Game main model, and the GameWorld Score evaluation system. Matrix-Game-MC is a large-scale interactive world dataset built in-house. It combines a large volume of unlabeled Minecraft gameplay video with controllable video data from Minecraft and Unreal that is paired with keyboard and mouse control signals and fine-grained action annotations. The dataset supports efficient learning of complex environmental dynamics and interaction patterns.
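
To make the idea of labeled control data concrete, here is a minimal Python sketch of how one clip with per-frame keyboard and mouse signals could be represented. The article does not publish the dataset schema, so the key vocabulary, field names, and encoding below are all illustrative assumptions, not the Matrix-Game-MC format.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List

# Hypothetical key vocabulary; the real annotation set is not given in the article.
KEYS = ("forward", "back", "left", "right", "jump", "attack")


@dataclass
class FrameAction:
    """Control signal for a single video frame (illustrative schema)."""
    keys: FrozenSet[str] = frozenset()  # keys held down during this frame
    camera_dx: float = 0.0              # horizontal mouse movement (assumed unit)
    camera_dy: float = 0.0              # vertical mouse movement (assumed unit)

    def to_vector(self) -> List[float]:
        """Multi-hot key encoding followed by the two camera deltas."""
        pressed = [1.0 if k in self.keys else 0.0 for k in KEYS]
        return pressed + [self.camera_dx, self.camera_dy]


@dataclass
class LabeledClip:
    """A video clip paired with one action per frame."""
    video_path: str
    actions: List[FrameAction] = field(default_factory=list)

    def action_matrix(self) -> List[List[float]]:
        """Stack per-frame vectors into a (frames x features) matrix."""
        return [a.to_vector() for a in self.actions]


clip = LabeledClip(
    video_path="clip_0001.mp4",  # placeholder path
    actions=[
        FrameAction(frozenset({"forward", "jump"}), camera_dx=2.5, camera_dy=-1.0),
        FrameAction(),  # no input on this frame
    ],
)
matrix = clip.action_matrix()
```

Keyboard input is discrete and mouse movement is continuous, which is why the sketch keeps them as separate parts of the per-frame vector.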

The Matrix-Game main model is built on diffusion model technology and generates coherent, controllable interactive video in response to user input, balancing visual quality, temporal consistency, and physical plausibility. Training proceeds in two stages: pre-training on unlabeled data, followed by controllable training on labeled data. According to the company, this strategy yields significant improvements in spatial understanding, response to user instructions, and modeling of physical interactions. Matrix-Game supports fine-grained interaction control, including moving forward, jumping, attacking, and adjusting the camera view, with accurate and natural responses. The generated video maintains visual continuity while respecting physical rules such as gravity and collision, which enhances immersion. Matrix-Game also generalizes across scenes, covering a variety of terrains, weather conditions, and biomes, and shows potential for generalization to game environments other than Minecraft.

To evaluate and compare interactive world generation models systematically, Matrix-Game proposes a unified evaluation system, GameWorld Score. It quantifies model performance along four dimensions: visual quality, temporal quality, motion controllability, and physical rule understanding, addressing the lack of a systematic evaluation benchmark in this field. Under GameWorld Score, Matrix-Game leads in all four dimensions, outperforming the existing open-source baseline models Oasis and MineWorld. In double-blind user studies, participants also preferred videos generated by Matrix-Game.
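
As a rough illustration of what "leading in all four dimensions" means, the Python sketch below compares per-dimension scores between a candidate and a set of baselines. It assumes that higher is better in every dimension; the dimension keys, function names, and numbers are placeholders for illustration and are not GameWorld Score results or its actual metric definitions.

```python
from typing import Dict, List

# The four dimensions named in the article; the key names are assumptions.
DIMENSIONS = (
    "visual_quality",
    "temporal_quality",
    "motion_controllability",
    "physical_rule_understanding",
)

Scores = Dict[str, float]


def leading_dimensions(candidate: Scores, baselines: Dict[str, Scores]) -> List[str]:
    """Return the dimensions where the candidate strictly beats every baseline.

    Assumes higher scores are better in every dimension.
    """
    return [
        dim
        for dim in DIMENSIONS
        if all(candidate[dim] > other[dim] for other in baselines.values())
    ]


def leads_in_all(candidate: Scores, baselines: Dict[str, Scores]) -> bool:
    """True only if the candidate leads in all four dimensions."""
    return len(leading_dimensions(candidate, baselines)) == len(DIMENSIONS)


# Placeholder numbers, invented purely to exercise the functions.
model_a = dict(zip(DIMENSIONS, (0.9, 0.8, 0.7, 0.6)))
baselines = {
    "model_b": dict(zip(DIMENSIONS, (0.5, 0.7, 0.6, 0.5))),
    "model_c": dict(zip(DIMENSIONS, (0.8, 0.9, 0.5, 0.4))),
}
print(leading_dimensions(model_a, baselines))
print(leads_in_all(model_a, baselines))
```

Reporting the dimensions separately, rather than collapsing them into one number, makes it visible when a model trades one quality (for example controllability) against another.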

Project Homepage:

https://matrix-game-homepage.github.io

Technical Report:

https://github.com/SkyworkAI/Matrix-Game/blob/main/assets/report.pdf

GitHub Repository:

https://github.com/SkyworkAI/Matrix-Game

Hugging Face Model:

https://huggingface.co/Skywork/Matrix-Game