Arcee AI Trinity Mini
Advanced sparse MoE model for efficient inference on Together AI.
About model
Trinity Mini brings frontier-level language understanding to your applications without frontier costs. This 26B sparse mixture-of-experts model activates just 3B parameters per token, delivering exceptional reasoning, tool use, and multi-turn conversation capabilities across a 128K context window. Whether you're building conversational AI, agentic workflows, or production systems requiring long-context understanding, Trinity Mini offers the efficiency and performance to scale from prototype to production seamlessly.
Model | AIME 2025 | GPQA Diamond | HLE | LiveCodeBench | MATH500 | SWE-bench verified |
|---|---|---|---|---|---|---|
Arcee AI Trinity Mini | 58.6% | 92.1% | Related open-source models | Competitor closed-source models | ||
90.5% | 34.2% | 78.7% | ||||
83.3% | 24.9% | 99.2% | 62.3% | |||
76.8% | 96.4% | 48.9% | ||||
49.2% | 2.7% | 32.3% | 89.3% | 31.0% |
API usage
Endpoint:
Model card
Architecture Overview:
• Sparse mixture-of-experts (MoE) architecture with 26B total parameters and 3B activated per token
• Efficient attention mechanism that reduces memory and compute requirements while preserving long-context coherence
• 128K-token context window supporting extended document processing and multi-turn interactions
Training Methodology:
• Trained with continuous reinforcement learning techniques for ongoing capability improvements
• Built by Arcee AI's collaborative research team focused on delivering best-in-class per-parameter performance
• Optimized for multi-turn conversations, tool use, and structured outputs
Performance Characteristics:
• Strong context utilization that fully leverages long inputs for coherent multi-turn reasoning
• Reliable function and tool calling capabilities for agent workflows
• High inference efficiency generating tokens rapidly while minimizing compute
• Outstanding price-to-performance ratio compared to dense models of similar capability
Applications & use cases
Conversational AI Applications:
• Multi-turn customer support chatbots with long conversation history
• Virtual assistants with tool integration and function calling
• Interactive documentation and knowledge base systems
Agentic Workflows:
• Multi-step agent systems requiring tool use and reasoning
• Workflow automation with structured output generation
• RAG systems with extended context understanding
Enterprise Integration:
• Cost-efficient production deployments via Together AI APIs
• Internal tooling with natural language interfaces
• Document analysis and processing pipelines with 128K context support
- Model providerArcee AI
- TypeChat
- Main use casesChatSmall & Fast
- DeploymentServerlessMonthly Reserved
- Endpoint
- Parameters26B
- Activated parameters3B
- Context length32.7K
- Input price
$0.05 / 1M tokens
- Output price
$0.15 / 1M tokens
- Input modalitiesText
- Output modalitiesText
- ReleasedDecember 1, 2025
- External link
- CategoryChat