MiniMax launched its flagship model MiniMax-M3 and switched from per-request billing to token-based billing without adequate prior communication. Token consumption far exceeded expectations and rapidly depleted heavy users' quotas, sparking strong developer backlash. On the evening of June 2, MiniMax issued an apology acknowledging insufficient communication.....
The European Commission recently announced new minimum energy efficiency standards for data centers to address rising energy pressure from surging electricity demand. By 2030, the EU's total data center capacity is expected to increase from 12 GW last year to 28 GW, with electricity consumption exceeding 2.5% of the EU's total power. This move aims to promote energy savings and support sustainable digital services.....
AI startup MiniMax has released its flagship model M3, which scores 59% on near-real-world software engineering tests, surpassing GPT-5.5 and approaching Opus 4.7. It offers a million-token context window and native multimodality, and its launch has sparked controversy.....
The MiniMax M3 model has officially launched and is integrated with the JD Cloud JoyBuilder platform. Key highlight: significantly improved inference performance through a self-developed framework, prefill-decode (PD) separation, KV caching, and speculative sampling.....
Agent Minimax IO performs security verification to prevent malicious automated programs.
MiniMax Agent is an intelligent AI companion that provides support using advanced multimodal technology.
Scira is a minimalist AI-powered search engine that helps users find information on the internet.
Provides a mini-map overview within ChatGPT conversations for quick browsing and navigation.
OpenAI
$2.8
Input tokens/M
$11.2
Output tokens/M
1k
Context Length
$7.7
$30.8
200
$0.4
-
128
$1.75
$14
400
$56
$0.7
Alibaba
$2
32
xAI
$2.1
$3.5
MiniMax
$1.6
$16
Baichuan
Stepfun
$1
$8
$38
$120
16
iFlytek
8
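As a rough illustration of how per-million-token prices like those listed above translate into a per-request cost, here is a minimal sketch. The function name and the token counts are hypothetical; the two rates ($2.8 input, $11.2 output per million tokens) are taken from the first pricing row and are used only as example inputs.

```python
def request_cost_usd(input_tokens: int, output_tokens: int,
                     input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost in USD of one request when prices are quoted per million tokens."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


# Hypothetical request: 200k input tokens and 50k output tokens.
cost = request_cost_usd(200_000, 50_000, 2.8, 11.2)
print(f"${cost:.2f}")  # prints $1.12
```

Because output tokens are priced several times higher than input tokens in this row, long generations dominate the bill even when the prompt is much larger.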
Arko007
Zenyx_114M-Tiny-Edu-Instruct is an experimental small instruction fine-tuned language model with approximately 114 million parameters. It is built on the TinyEdu-50M base model, pre-trained on the FineWeb-Edu dataset, and fine-tuned on a mixed dataset of OpenHermes-2.5 and CodeFeedback-Filtered. This model aims to explore the limits of instruction fine-tuning under a minimal architecture and verify that the loss converges to approximately 1.04.
bartowski
This is a quantized version of the VibeStudio MiniMax-M2-THRIFT model, generated using the llama.cpp tool and a specific dataset. It provides GGUF files of various quantization types and supports running in LM Studio or projects based on llama.cpp.
This is a large language model with 172B parameters obtained by uniformly pruning 25% of the experts in MiniMax-M2 using the REAP method. It has been optimized and quantized specifically for llama.cpp, supporting multiple quantization levels and can run in LM Studio or projects based on llama.cpp.
This is a large language model with 139B parameters obtained by uniformly pruning 40% of the experts in MiniMax-M2 using the REAP method. It adopts the GLM architecture and Mixture of Experts (MoE) technology, and undergoes various quantization processes through llama.cpp, suitable for text generation tasks.
DevQuasar
This project provides a quantized version of the cerebras/MiniMax-M2-REAP-172B-A10B model, aiming to make knowledge accessible to the public. This is a large language model with 172 billion parameters, which has been optimized and quantized to reduce deployment costs and improve inference efficiency.
noctrex
This is the MXFP4_MOE quantized version of the MiniMax-M2-REAP-172B-A10B model, a memory-efficient compressed model. The REAP (Router-weighted Expert Activation Pruning) method compresses the model from 230B parameters to 172B parameters, a 25% reduction in size, while maintaining performance. It is suitable for resource-constrained environments, local deployment, and academic research.
cerebras
MiniMax-M2-REAP-162B-A10B is an efficiently compressed version of MiniMax-M2. It uses the REAP (Router-weighted Expert Activation Pruning) method to reduce the model size by 30%, from 230B parameters to 162B parameters, while maintaining almost the same performance and significantly reducing the memory requirement.
MiniMax-M2-REAP-172B-A10B is a memory-efficient compressed variant of MiniMax-M2. It uses the REAP expert pruning method, reducing the model size by 25% from 230B parameters to 172B parameters while maintaining almost the same performance.
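The size reductions quoted for the REAP variants can be checked with simple arithmetic. The sketch below uses only the rounded parameter counts given in these listings (230B original; 172B, 162B, and 139B pruned); the helper name is hypothetical.

```python
def size_reduction(original_b: float, pruned_b: float) -> float:
    """Fractional reduction in total parameter count."""
    return (original_b - pruned_b) / original_b


ORIGINAL_B = 230  # MiniMax-M2 total parameters, in billions (as listed)

for pruned_b in (172, 162, 139):
    pct = size_reduction(ORIGINAL_B, pruned_b)
    print(f"{ORIGINAL_B}B -> {pruned_b}B: {pct:.1%} smaller")
# prints:
# 230B -> 172B: 25.2% smaller
# 230B -> 162B: 29.6% smaller
# 230B -> 139B: 39.6% smaller
```

The results line up with the advertised 25%, 30%, and 40% figures to within the rounding of the published parameter counts.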
cyankiwi
MiniMax-M2 AWQ - INT4 is a quantized version based on the MiniMax-M2 model. It uses INT4 quantization technology to significantly reduce memory usage and improve inference efficiency while ensuring performance. This model performs excellently in coding and agent tasks and has outstanding comprehensive performance.
This is the MXFP4_MOE quantized version of the MiniMax-M2-THRIFT model. It is compressed relative to the original model, with 25% of the experts pruned (from 256 to 192) and top_k set to 8. It retains the characteristics of a coding model and can be used for text generation tasks.
catalystsec
This project performs 4-bit quantization on the MiniMax-M2 model using the DWQ (Distilled Weight Quantization) method from the mlx-lm library. The result is a lightweight version of MiniMax-M2 that significantly reduces the model size while maintaining good performance.
unsloth
MiniMax-M2 is a small Mixture of Experts (MoE) model built specifically for maximizing coding and agent workflows. It has a total of 230 billion parameters, with 10 billion active parameters. This model excels in coding and agent tasks while maintaining strong general intelligence. It is compact, fast, and cost-effective.
anikifoss
This project is a high-quality HQ4_K quantization of the MiniMax-M2 model, optimized specifically for text generation tasks and particularly suitable for dialogue scenarios. This quantized version does not use imatrix and maintains the model's performance.
This project quantizes the MiniMax-M2 model to 3 bits using Distilled Weight Quantization (DWQ) from the mlx-lm library. It can efficiently perform text generation tasks under resource-constrained conditions, providing a more lightweight option for related applications.
This project quantizes the MiniMax-M2 model of MiniMaxAI. Using the llama.cpp tool, it provides model files of multiple quantization types for users with different needs, facilitating the efficient operation of the model under different hardware conditions.
This project is a quantized version based on the MiniMaxAI/MiniMax-M2 model, aiming to make knowledge accessible to the public. It provides multiple model versions with different quantization levels and shows the perplexity performance indicators of each version.
redponike
MiniMax-M2 is a Mixture of Experts model designed for efficient coding and agent workflows. It has a total of 230 billion parameters and 10 billion active parameters. This model performs excellently in coding and agent tasks, and features low latency, low cost, and high throughput, effectively improving work efficiency.
This is the MXFP4_MOE quantized version of the MiniMax-M2 model. It is re-quantized based on the version with the chat template fixed by unsloth, enabling more efficient use of the MiniMax-M2 model's capabilities in specific scenarios. This is a coding model that needs to be used with the latest llama.cpp.
bullerwins
MiniMax-M2 is a small Mixture of Experts (MoE) model built specifically for maximizing coding and agent workflows. It has a total of 230 billion parameters, with only 10 billion parameters activated. It performs excellently in coding and agent tasks while maintaining strong general intelligence, featuring compactness, speed, and cost-effectiveness.
mlx-community
This is an MLX-format conversion of the MiniMax-M2 model, converted from the original model using mlx-lm 0.28.1. It uses 8-bit quantization with a group size of 32 and is optimized for running on Apple Silicon devices.
The MiniMax Model Context Protocol (MCP) is an official server that supports interaction with powerful text-to-speech, video/image generation APIs, and is suitable for various client tools such as Claude Desktop and Cursor.
MiniMax's official Model Context Protocol (MCP) server supports interactions with APIs such as text-to-speech, video/image generation.
A minimalist MCP server that allows querying the status via the Tailscale CLI on macOS
A minimalist MCP server for interacting with Bauplan data tables and executing queries. Supports functions such as listing table structures, getting table schemas, and running SQL queries.
A minimalist WeChat article reader MCP service that enables large models to obtain and analyze the content of WeChat official account articles through browser simulation and content extraction technology.
An efficient task manager that supports multiple file formats (Markdown, JSON, YAML), provides powerful search, filtering, and organization functions, and is designed to minimize tool confusion and maximize LLM budget efficiency.
A minimal Anthropic MCP client and server implementation based on Python and pip, intended as a debugging reference in VSCode and Windows environments.
A minimalist MCP server implemented as a bash script. It exposes a mathematical addition tool and interacts with LLMs through the JSON-RPC protocol.
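To make the JSON-RPC interaction behind such an addition tool concrete, here is a minimal Python sketch of a request handler. It is not the bash project's code: the tool name `add` and the argument names `a` and `b` are assumptions chosen for illustration. The `tools/call` method and the `content` result shape follow the MCP specification, and a real server would also handle `initialize` and `tools/list`, which are omitted here.

```python
import json


def handle(line: str) -> str:
    """Handle one JSON-RPC 2.0 request line and return the response line."""
    req = json.loads(line)
    method = req.get("method")
    params = req.get("params", {})

    if method == "tools/call" and params.get("name") == "add":
        # "add", "a", and "b" are hypothetical names for this sketch.
        args = params.get("arguments", {})
        total = args["a"] + args["b"]
        result = {"content": [{"type": "text", "text": str(total)}]}
        resp = {"jsonrpc": "2.0", "id": req.get("id"), "result": result}
    else:
        # -32601 is the standard JSON-RPC "Method not found" error code.
        resp = {"jsonrpc": "2.0", "id": req.get("id"),
                "error": {"code": -32601, "message": "Method not found"}}
    return json.dumps(resp)


request = json.dumps({
    "jsonrpc": "2.0", "id": 1, "method": "tools/call",
    "params": {"name": "add", "arguments": {"a": 2, "b": 3}},
})
response = json.loads(handle(request))
print(response["result"]["content"][0]["text"])  # prints 5
```

Over the stdio transport, a server would read one such request per line from standard input and write each response line to standard output.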
Skill Retriever is an MCP server based on a graph database, specifically designed for the retrieval and installation of Claude Code components. It automatically indexes component repositories on GitHub, returns the minimum correct set of components and their dependencies based on task descriptions through semantic search and graph traversal algorithms, and supports security scanning and automatic synchronization and updates.
MiniMax MCP JS is a MiniMax Model Context Protocol toolkit implemented in JavaScript/TypeScript, providing functions such as text-to-speech, image generation, video generation, and voice cloning, and supporting multiple configuration methods and transmission modes.
A minimal MCP server project demonstrating how to implement GitHub OAuth authentication, including guides on GitHub App registration and testing.
A minimal server/client application implementation based on the Model Context Protocol (MCP) and Azure OpenAI, including the FastMCP server, Playwright testing framework, and MCP-LLM Bridge conversion function.
A minimalist Model Context Protocol (MCP) server based on AWS Lambda and API Gateway, deployed using the Serverless Framework, supporting local development and testing.
A minimal example of an MCP server demonstrating how to implement Entra ID authentication, using the HTTP+SSE transmission protocol.
MCP2HTTP is a minimal transport adapter for connecting MCP clients using stdio to stateless HTTP MCP servers.
A customized MCP server by MiniMax for Coding Plan users, providing AI-driven web search and image analysis tools, optimized specifically for the code development workflow, and can be integrated into MCP clients such as Claude Desktop and Cursor to enhance the programming experience.
MiniMax MCP JS is a MiniMax MCP protocol toolset implemented in JavaScript/TypeScript, providing functions such as image generation, video generation, and text-to-speech, and supporting interaction with MCP-compatible clients.
A minimal MCP server implementation for Cursor IDE based on the official MCP SDK, intended for quick startup or experimentation.
This project uses an optimized GitHub Actions workflow, designed for open-source projects, minimizing resource usage while maintaining quality. It includes functions such as rapid PR checks, a complete CI/CD pipeline, security scans, and automated releases.
Minimax MCP Tools is an MCP server implementation that integrates the Minimax API, providing AI image generation and text-to-speech functions and supporting seamless integration with the Windsurf editor.