Devstral Small 2505
A 24B coding model by Mistral AI & All Hands AI, built for advanced agentic coding tasks and topping SWE-bench scores among open-source models.
About model
Devstral Small 2505 excels at agentic coding tasks, making it well suited for software engineering agents. It combines a long context window of up to 128k tokens with a compact size of 24 billion parameters, and it achieves remarkable performance on SWE-Bench, outperforming prior open-source models. It is designed for local deployment and on-device use.
To run this model, you first need to deploy it on a Dedicated Endpoint.
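Once the endpoint is live, requests typically go through an OpenAI-compatible chat completions API. The sketch below is a minimal example under that assumption; the base URL, API key, and model identifier are placeholders, not confirmed values.

```python
# Minimal request sketch, assuming the Dedicated Endpoint exposes an
# OpenAI-compatible chat completions API. URL, key, and model name are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://YOUR-DEDICATED-ENDPOINT/v1",  # placeholder endpoint URL
    api_key="YOUR_API_KEY",                         # placeholder credential
)

response = client.chat.completions.create(
    model="devstral-small-2505",  # assumed model identifier on the endpoint
    messages=[
        {"role": "user", "content": "Refactor this function to remove the nested loops."},
    ],
    temperature=0.15,
)
print(response.choices[0].message.content)
```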
Model card
Devstral is an agentic LLM for software engineering tasks built in collaboration between Mistral AI and All Hands AI 🙌. Devstral excels at using tools to explore codebases, editing multiple files, and powering software engineering agents. The model achieves remarkable performance on SWE-Bench, which positions it as the #1 open-source model on this benchmark.
It is fine-tuned from Mistral-Small-3.1, so it inherits a long context window of up to 128k tokens. As a coding agent, Devstral is text-only: the vision encoder was removed from Mistral-Small-3.1 before fine-tuning.
For enterprises requiring specialized capabilities (increased context, domain-specific knowledge, etc.), we will release commercial models beyond what Mistral AI contributes to the community.
Learn more about Devstral in this blog post.
Key Features
- Agentic coding: Devstral is designed to excel at agentic coding tasks, making it a great choice for software engineering agents.
- Lightweight: With a compact size of just 24 billion parameters, Devstral is light enough to run on a single RTX 4090 or a Mac with 32GB RAM, making it an appropriate model for local deployment and on-device use (see the local-inference sketch after this list).
- Apache 2.0 License: Open license allowing usage and modification for both commercial and non-commercial purposes.
- Context Window: A 128k context window.
- Tokenizer: Utilizes a Tekken tokenizer with a 131k vocabulary size.
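The following is a minimal local-inference sketch for the local-deployment scenario mentioned above, assuming vLLM is installed, the Hugging Face repository id `mistralai/Devstral-Small-2505` is accessible, and your hardware (or a quantized variant) can hold the weights. The exact loading flags may differ from the official serving instructions.

```python
# Local-inference sketch with vLLM; repo id, tokenizer_mode, and context limit
# below are assumptions for illustration, not confirmed deployment settings.
from vllm import LLM, SamplingParams

llm = LLM(
    model="mistralai/Devstral-Small-2505",  # assumed Hugging Face repo id
    tokenizer_mode="mistral",               # load the Tekken tokenizer via mistral-common
    max_model_len=32768,                    # stay well under the 128k maximum to fit local memory
)

messages = [
    {"role": "system", "content": "You are a careful coding assistant."},
    {"role": "user", "content": "Write a Python function that reverses a singly linked list."},
]

outputs = llm.chat(messages, SamplingParams(temperature=0.15, max_tokens=512))
print(outputs[0].outputs[0].text)
```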
Benchmark Results
SWE-Bench
Devstral achieves a score of 46.8% on SWE-Bench Verified, outperforming prior open-source SoTA by 6%.
When evaluated under the same test scaffold (OpenHands, provided by All Hands AI 🙌), Devstral surpasses far larger models such as Deepseek-V3-0324 and Qwen3-235B-A22B.

- Type: Code, Chat
- Main use cases: Chat, Coding Agents
- Deployment: On-Demand Dedicated, Monthly Reserved
- Parameters: 24B
- Context length: 128k
- Input modalities: Text
- Output modalities: Text
- Released: May 12, 2025
- Category: Chat