This is the MXFP4_MOE quantized version of MiniMax-M2-REAP-172B-A10B, a memory-efficient compressed model. Using REAP (Router-weighted Expert Activation Pruning), the model is pruned from 230B to 172B parameters, a roughly 25% reduction in size, while largely preserving performance. It is suitable for resource-constrained environments, local deployment, and academic research.
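The stated size reduction follows directly from the two parameter counts. A minimal sketch of the arithmetic (parameter counts taken from the description above, treated as approximate):

```python
# Approximate parameter counts from the model description.
original_params = 230e9  # base MiniMax-M2 model, ~230B parameters
pruned_params = 172e9    # REAP-pruned model, ~172B parameters

# Fractional reduction achieved by expert pruning.
reduction = (original_params - pruned_params) / original_params
print(f"Parameter reduction: {reduction:.1%}")  # about 25%
```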
Natural Language Processing
GGUF