Mistral AI has officially launched the second generation of its open-source coding model family: Devstral2 (a 12.3B-parameter flagship) and Devstral Small2 (a 2.4B-parameter lightweight version). The flagship scored 72.2% on the SWE-Bench Verified benchmark, setting a new record for open-source models, and Mistral claims it is seven times more cost-effective than Claude Sonnet. The company also released Mistral Vibe, an open-source CLI tool that supports natural-language batch code modification. Both models are now available via API: Devstral2 is priced at $0.40 per million input tokens, while the lightweight version is completely free.

Model Overview: One Large, One Small, Open-Source Dual Track  


Performance Breakthrough: 72.2% Sets New Record for Open-Source Code Models  

- SWE-Bench Verified: Devstral2 scored 72.2%, surpassing CodeLlama-70B (53.8%) and DeepSeek-Coder-33B (61.4%), and trailing GPT-4-Turbo (73.2%) by only 1pp  

- HumanEval: 84.1% Pass@1, leading other open-source models by 6-8pp  

- Cost: Officially claimed to be seven times cheaper than Claude Sonnet; at approximately $0.40 per million input tokens, it is roughly 1/5 of GPT-4-Turbo's price

Open-Source Tool: Mistral Vibe, Natural-Language Batch Code Modification  

- Features: A single instruction like "convert the function to async" can automatically rewrite an entire repository, supporting diff preview and rollback  

- Engine: Locally calls Devstral Small2 (Apache 2.0), no internet connection required  

- Integration: A VS Code extension is already available, with one-click fixes for ESLint errors and one-click addition of unit tests
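The diff-preview-and-rollback workflow described above can be sketched with Python's standard difflib. The function, variable, and file names here are illustrative assumptions, not taken from Mistral Vibe's actual implementation:

```python
import difflib

def preview_diff(original: str, modified: str, path: str) -> str:
    """Render a unified diff so the user can review a change before applying it."""
    return "".join(difflib.unified_diff(
        original.splitlines(keepends=True),
        modified.splitlines(keepends=True),
        fromfile=f"a/{path}", tofile=f"b/{path}",
    ))

# Hypothetical rewrite: "convert the function to async".
# Keeping the original text around makes rollback a simple restore.
original = "def fetch(url):\n    return get(url)\n"
modified = "async def fetch(url):\n    return await get(url)\n"

diff = preview_diff(original, modified, "client.py")
print(diff)
```

A tool built this way can show `diff` to the user, apply `modified` on confirmation, and restore `original` on rollback.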

Business Strategy: Lightweight Free + Flagship API, Tiered Revenue Generation  

- Devstral Small2: Apache 2.0, commercial use, fine-tuning, and embedding allowed  

- Devstral2: Modified MIT license; organizations with monthly revenue above $20 million must purchase a commercial license or use the official API, preventing large companies from using it for free  

- API Pricing: $0.40 per million input tokens, $1.20 per million output tokens; first 30 days offer 1 million token free quota
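The listed prices make monthly bills easy to estimate. A minimal sketch, using the $0.40/M input and $1.20/M output figures above; the token counts in the example are hypothetical usage, not from the announcement:

```python
# Devstral2 API prices as listed (USD per million tokens).
INPUT_PRICE_PER_M = 0.40
OUTPUT_PRICE_PER_M = 1.20

def monthly_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate a monthly API bill from raw token counts."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# Hypothetical workload: 50M input + 10M output tokens per month.
cost = monthly_cost(50_000_000, 10_000_000)
# 50 * 0.40 + 10 * 1.20 = 32.0 USD
```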

Industry Signal: Open-Source Coding Models Enter the '70+ Club'  

- In 2024, mainstream open-source code models on SWE-Bench generally scored between 50-60%; Devstral2 raised the bar to over 72%  

- Low cost and high scores could challenge the cost-effectiveness of paid plugins like GitHub Copilot and Cursor  

- The lightweight version is completely free, potentially accelerating the adoption of "local AI coding assistants," allowing developers to run the 2.4B model on an RTX 4090

Next Steps: 2025 Roadmap  

- Q1: Release a Devstral2-INT4 quantized version that runs on a single A100; launch a Jetson Orin edge-deployment package  

- Q2: Launch 128k context version, supporting the entire codebase plus documentation as prompts  

- Q3: Launch "Vibe Cloud", natural-language code refactoring in the browser, billed per project
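The Q2 item's 128k-token context can be put in rough perspective. Assuming roughly 4 characters per token (a common heuristic, not an official tokenizer figure) and about 60 characters per line of code (also an assumption):

```python
# Back-of-envelope: how much source code fits in a 128k-token prompt.
CONTEXT_TOKENS = 128_000
CHARS_PER_TOKEN = 4    # assumed heuristic, not Mistral's tokenizer spec
CHARS_PER_LINE = 60    # assumed average line length

approx_chars = CONTEXT_TOKENS * CHARS_PER_TOKEN  # ~512,000 characters
approx_lines = approx_chars // CHARS_PER_LINE    # ~8,500 lines of code
```

Under these assumptions, a 128k context holds on the order of 8,000-9,000 lines, enough for a small-to-medium codebase plus its documentation in a single prompt.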

Editor's Conclusion  

Once "code generation" passes the 70-point mark, the deciding factor shifts from "model capability" to "cost and compliance." Devstral2 undercuts the market at $0.40 per million input tokens, while the "modified MIT" license blocks large companies from using the flagship for free; the fully open-sourced lightweight version captures the local-deployment market. For developers, the combination of a free 2.4B model and a low-cost 12.3B flagship means writing code locally and offloading heavy tasks to the cloud, with no further need for a Copilot subscription. AIbase will continue to track the quantized version and the 128k long-context release.