Mistral, a French AI model developer, has quickly returned to open source after drawing criticism from parts of the open-source community for its latest closed-source model, Medium 3. The company recently partnered with All Hands AI, an open-source startup (creator of OpenDevin), to launch a new open-source language model called Devstral. This lightweight model has 24 billion parameters and is designed specifically for agentic software development. On specific benchmarks it outperforms many larger competitors, including some closed-source models.
Unlike traditional large language models (LLMs) that focus on code completion or standalone function generation, Devstral has been optimized to act as a complete software engineering agent. This means it can understand cross-file context, navigate large codebases, and solve real-world software development problems. More importantly, Devstral is released under the permissive Apache 2.0 license, allowing developers and organizations to freely deploy, modify, and commercialize the model.
Baptiste Rozière, a research scientist at Mistral AI, emphasized that the team wants to give developers an open-source tool that can run locally and privately and be modified as needed. The Apache 2.0 license gives users that freedom.
Building on the Success of the Codestral Series
Devstral is the latest advancement in Mistral's code-centric Codestral series. Codestral was first introduced in May 2024, with 22 billion parameters and support for over 80 programming languages, and it excels at code generation and completion tasks. Rapid iteration then produced Codestral-Mamba, an enhanced version based on the Mamba architecture, and the latest Codestral 25.01, which has been particularly popular with IDE plugin developers and enterprise users. The success of the Codestral series laid a solid foundation for Devstral, enabling it to expand from simple code completion to full agentic task execution.
Impressive Performance in SWE Benchmarks
On the SWE-Bench Verified benchmark, Devstral achieved a score of 46.8%. SWE-Bench Verified is a dataset of 500 real GitHub issues, manually validated to ensure accuracy. This result not only leads all previously released open-source models but also surpasses several closed-source models, including GPT-4.1-mini, by more than 20 percentage points.
Rozière stated that Devstral is currently the best-performing open-source model on SWE-Bench Verified and for code agents, and that, surprisingly, it has only 24 billion parameters and can run locally on a MacBook. Dr. Sophia Yang, head of developer relations at Mistral AI, also noted on social media that Devstral performs better than many closed-source alternatives when evaluated across various frameworks.
Devstral's performance comes from reinforcement learning and safety alignment techniques applied to the Mistral Small 3.1 base model. Rozière explained that the team first selected a strong base model and then used specialized techniques to improve its performance on SWE-bench.
Not Just Code Generation, But Also the Foundation of AI Software Development Agents
Devstral's goal is not just to generate code but to integrate into agent frameworks like OpenHands, SWE-Agent, and OpenDevin. These frameworks allow Devstral to interact with test cases, navigate source code files, and execute multi-step tasks across projects. Rozière said that Devstral is being released alongside OpenDevin, which provides the scaffolding for code agents and acts as a backend for the developer model.
To ensure reliability, Mistral rigorously tested Devstral on different codebases and internal workflows to avoid overfitting to the SWE-bench benchmark. The team trained the model on data drawn from outside SWE-bench and verified its performance across different frameworks.
Efficient Deployment and Commercially Friendly Open Source License
Devstral’s compact 24-billion-parameter architecture allows developers to run it locally, whether on a machine with a single RTX 4090 GPU or on a Mac with 32GB of memory. This is particularly attractive for privacy-sensitive applications and for deployment on edge devices. Rozière said the target users include developers and enthusiasts who care about local, private operation and who may even use the model in environments without internet access.
Besides performance and portability, Devstral’s Apache 2.0 license also makes commercial use straightforward. The license allows unrestricted use, adaptation, and distribution, including in proprietary products, which significantly lowers the barrier to enterprise adoption.
Devstral has a 128,000-token context window and uses the Tekken tokenizer with a vocabulary size of 131,000. It can be deployed through mainstream open-source platforms such as Hugging Face, Ollama, Kaggle, LM Studio, and Unsloth, and is compatible with libraries like vLLM, Transformers, and Mistral Inference.
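To make the hardware claim concrete, here is a rough back-of-the-envelope sketch of how much memory the model weights alone would need at different numeric precisions. The precisions listed (bf16, int8, int4) are illustrative assumptions, not something the article specifies, and the estimate ignores the KV cache, activations, and runtime overhead.

```python
# Rough estimate of weight memory for a 24-billion-parameter model.
# Assumption: the precisions below are illustrative; the article does not
# say which precision or quantization is used for local deployment.

PARAMS = 24_000_000_000  # 24 billion parameters

# Bytes needed to store one parameter at each assumed precision.
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params: int, bytes_per_param: float) -> float:
    """Return approximate weight memory in GB (10**9 bytes), weights only."""
    return params * bytes_per_param / 1e9


def fits(params: int, precision: str, budget_gb: float) -> bool:
    """Check whether the weights alone fit within a given memory budget."""
    return weight_memory_gb(params, BYTES_PER_PARAM[precision]) <= budget_gb


if __name__ == "__main__":
    for name, nbytes in BYTES_PER_PARAM.items():
        gb = weight_memory_gb(PARAMS, nbytes)
        print(f"{name}: ~{gb:.0f} GB of weights")
```

Under these assumptions, full 16-bit weights would not fit in the 24 GB of VRAM on an RTX 4090, so local use on that class of hardware would likely rely on a quantized build; actual requirements depend on the runtime and context length.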
API and Local Deployment Hand in Hand
Developers can access Devstral via Mistral’s La Plateforme API under the model name devstral-small-2505, priced at $0.10 per million input tokens and $0.30 per million output tokens. For users who prefer local deployment, support for frameworks like OpenHands enables immediate integration with codebases and agent workflows. Rozière described using Devstral for small development tasks such as updating package versions or modifying tokenization scripts, and said he appreciates its ability to precisely locate and modify code.
Although Devstral is currently released as a research preview, Mistral and All Hands AI are already developing larger, more powerful follow-up models. Rozière believes the gap between small and large models is narrowing rapidly and that Devstral’s performance can already rival some much larger competitors.
With its strong benchmark results, permissive open-source license, and agent-oriented design, Devstral is not just a powerful code generation tool but also a key foundation model for building autonomous software engineering systems.
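As a quick illustration of the API pricing quoted above, the following sketch estimates the cost of a request from its token counts. The function name and the example token counts are hypothetical; only the two per-million-token prices come from the article.

```python
# Cost estimate for devstral-small-2505 using the prices quoted in the article.
# Assumption: estimate_cost_usd and the sample token counts are illustrative.
from decimal import Decimal

INPUT_PRICE_PER_MILLION = Decimal("0.10")   # USD per million input tokens
OUTPUT_PRICE_PER_MILLION = Decimal("0.30")  # USD per million output tokens
ONE_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated cost in USD for the given token counts."""
    input_cost = Decimal(input_tokens) * INPUT_PRICE_PER_MILLION
    output_cost = Decimal(output_tokens) * OUTPUT_PRICE_PER_MILLION
    return (input_cost + output_cost) / ONE_MILLION


if __name__ == "__main__":
    # Hypothetical agent step: 50,000 tokens of context in, 5,000 tokens out.
    print(f"${estimate_cost_usd(50_000, 5_000)}")
```

Decimal is used instead of floats so that small per-request amounts add up without rounding drift when summed over many agent steps.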