Ministral 3 8B Instruct 2512
Balanced 8B multimodal model for versatile assistants, agents, and multilingual understanding.
About model
Ministral 3 8B Instruct is Mistral AI's balanced 8B-class multimodal assistant, pairing an 8.4B language backbone with a 0.4B vision encoder for everyday text–image reasoning. With a 256K token context window, it handles long conversations, multi-document analysis, and tool-augmented workflows while staying fast and cost-efficient for broad deployment.
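For orientation, here is a minimal chat call sketched against an OpenAI-compatible serving endpoint; the base URL, API key, and the `ministral-3-8b-instruct-2512` model id are illustrative assumptions, not published values.

```python
# Minimal chat completion, assuming an OpenAI-compatible serving endpoint.
# The base URL, API key, and model id below are illustrative placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.example.com/v1",  # hypothetical endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="ministral-3-8b-instruct-2512",  # hypothetical model id
    messages=[
        {"role": "system", "content": "You are a concise technical assistant."},
        {"role": "user", "content": "Summarize the trade-offs of FP8 quantization."},
    ],
    temperature=0.2,
)
print(response.choices[0].message.content)
```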
8.8B
8.4B language core plus 0.4B vision encoder for unified multimodal reasoning.
256K
Long-horizon chats, multi-document tasks, and extended tool traces in a single run.
Agents
Native function calling and structured outputs for reliable assistants, copilots, and workflows.
- Everyday Multimodal Assistant: General-purpose chat, Q&A, and document/image understanding with a strong instruction-following backbone
- Agent-Ready Outputs: Structured responses and tool calls suited for orchestration in agents and copilots (see the sketch after this list)
- Multilingual & Code: Solid multilingual and coding ability for global products and developer tooling
- Consistent Family Behavior: Shares prompting patterns with Ministral 3 3B and 14B so you can swap models without rewriting prompts
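To illustrate the agent-ready outputs and family-consistent prompting described above, here is a hedged function-calling sketch. It assumes an OpenAI-compatible endpoint with tool support; the endpoint URL, model id, and the `get_weather` tool are invented for illustration.

```python
# Function-calling sketch, assuming an OpenAI-compatible endpoint with tool
# support. Endpoint, model id, and the weather tool are illustrative.
import json
from openai import OpenAI

client = OpenAI(base_url="https://api.example.com/v1", api_key="YOUR_API_KEY")

# Because the Ministral 3 family shares prompting patterns, the same request
# can target the 3B, 8B, or 14B variant simply by swapping the model id.
MODEL = "ministral-3-8b-instruct-2512"  # hypothetical model id

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool, for illustration only
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model=MODEL,
    messages=[{"role": "user", "content": "Do I need an umbrella in Paris today?"}],
    tools=tools,
    tool_choice="auto",
)

msg = response.choices[0].message
if msg.tool_calls:  # the model may also answer in plain text
    call = msg.tool_calls[0]
    print(call.function.name, json.loads(call.function.arguments))
else:
    print(msg.content)
```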
[Benchmark chart: Ministral 3 8B Instruct 2512 versus related open-source models and competitor closed-source models on AIME 2025, GPQA Diamond, HLE, LiveCodeBench, MATH500, and SWE-bench Verified.]
Model card
Architecture overview:
• Dense 8.4B-parameter language backbone paired with a 0.4B vision encoder for unified text and image I/O.
• 256K token context window shared with the rest of the Ministral 3 family for consistent long-context behavior.
• Instruction-tuned head optimized for assistants, agents, and structured outputs such as JSON and tool calls (see the sketch below).
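As a sketch of the structured-output path, the snippet below requests JSON mode from an assumed OpenAI-compatible endpoint. Whether a given serving stack honors `response_format` is an assumption here, and the endpoint, model id, and output schema are placeholders.

```python
# Structured-output sketch: JSON mode on an OpenAI-compatible endpoint.
# Support for response_format on a given serving stack is an assumption.
import json
from openai import OpenAI

client = OpenAI(base_url="https://api.example.com/v1", api_key="YOUR_API_KEY")

response = client.chat.completions.create(
    model="ministral-3-8b-instruct-2512",  # hypothetical model id
    messages=[
        {"role": "system",
         "content": 'Reply only with JSON: {"sentiment": "pos|neg|neutral", "score": <float>}'},
        {"role": "user", "content": "The new dashboard is a huge improvement."},
    ],
    response_format={"type": "json_object"},
)

result = json.loads(response.choices[0].message.content)
print(result["sentiment"], result["score"])
```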
Training and performance:
• Trained on diverse multilingual, code, and web-style corpora to provide robust coverage in the 8B tier.
• Instruction tuning emphasizes helpfulness, harmlessness, and adherence to system prompts over raw perplexity.
• Positioned as a mid-size workhorse that delivers near-frontier quality for many assistant and analytic tasks at lower cost and latency.
Applications & use cases
Assistants and agents:
• General-purpose chat assistants for support, operations, and knowledge work where responsiveness and quality must balance cost.
• Multimodal internal copilots that combine screenshots, documents, and text queries for debugging, analysis, and investigation (see the image-input sketch after this list).
• Agentic systems that plan, call tools, and synthesize results into natural-language recommendations or summaries.
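For the multimodal copilot pattern, here is a minimal sketch of sending a screenshot alongside a text question, assuming the endpoint accepts OpenAI-style `image_url` content parts; the file name, endpoint, and model id are illustrative.

```python
# Multimodal sketch: a screenshot plus a text question, assuming the endpoint
# accepts OpenAI-style image_url content parts. All names are illustrative.
import base64
from openai import OpenAI

client = OpenAI(base_url="https://api.example.com/v1", api_key="YOUR_API_KEY")

with open("screenshot.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="ministral-3-8b-instruct-2512",  # hypothetical model id
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What error is shown in this stack trace?"},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)
```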
Product and platform use cases:
• Embedded chat and help widgets inside SaaS products and dashboards.
• RAG-style knowledge interfaces over product docs, knowledge bases, and semi-structured data using the 256K context (see the long-context sketch after this list).
• Content generation, rewriting, translation, and summarization workflows that need solid multilingual quality.
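A rough long-context sketch for the RAG-style pattern above: retrieved files are concatenated into a single prompt so the 256K window carries the context instead of aggressive chunking. The retriever, document contents, endpoint, and model id are all placeholders.

```python
# Long-context sketch: packing several retrieved documents into one prompt,
# relying on the 256K window rather than aggressive chunking. Names illustrative.
from openai import OpenAI

client = OpenAI(base_url="https://api.example.com/v1", api_key="YOUR_API_KEY")

# In a real pipeline these would come from a retriever over your corpus.
documents = {
    "pricing.md": "...full file contents...",
    "limits.md": "...full file contents...",
    "changelog.md": "...full file contents...",
}

context = "\n\n".join(f"### {name}\n{text}" for name, text in documents.items())

response = client.chat.completions.create(
    model="ministral-3-8b-instruct-2512",  # hypothetical model id
    messages=[
        {"role": "system",
         "content": "Answer strictly from the provided documents; cite file names."},
        {"role": "user",
         "content": f"{context}\n\nQuestion: What are the current rate limits?"},
    ],
)
print(response.choices[0].message.content)
```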
- Type: Chat
- Main use cases: Chat, Small & Fast
- Deployment: On-Demand, Dedicated
- Parameters: 8.8B
- Context length: 256K
- Input modalities: Text, Image
- Output modalities: Text
- Released: October 31, 2025
- Quantization level: FP8
- Category: Chat