Qwen3-Coder-Next
State-of-the-art coding agent with ultra-efficient 3B active inference.
About model
Qwen3-Coder-Next is an open-weight language model designed specifically for coding agents. With only 3B activated parameters (80B total), it achieves performance comparable to models with 10-20x more active parameters, making it highly cost-effective for production agent deployment. Its training recipe targets long-horizon reasoning, complex tool usage, and recovery from execution failures, ensuring robust performance on dynamic coding tasks. With a 256K context length and advanced tool-calling capabilities, it delivers state-of-the-art agentic coding on Together AI's production infrastructure.
- 74.2%: Production-level autonomous coding
- 3B: Performing like 30-60B models
- 10-20x: Cost savings for agent workloads
- Ultra-Efficient Architecture: Only 3B activated parameters (80B total) achieving performance comparable to models with 10-20x more active parameters—highly cost-effective for production agent deployment
- Advanced Agentic Capabilities: Long-horizon reasoning, complex tool usage, and recovery from execution failures—ensuring robust performance in dynamic coding workflows
- Leading Coding Performance: 74.2% SWE-Bench Verified, 63.7% SWE-Bench Multilingual, 69.9% Aider—state-of-the-art agentic coding on Together AI
- Production-Ready Infrastructure: 256K context with 99.9% SLA on the AI Native Cloud—available on serverless and dedicated endpoints
API usage
Endpoint:
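As a minimal sketch of calling the model through Together AI's OpenAI-compatible chat completions endpoint, the snippet below builds a request payload. The model slug `Qwen/Qwen3-Coder-Next` is an assumption (check Together AI's model list for the exact identifier); the actual HTTP call is left commented out and requires a `TOGETHER_API_KEY`.

```python
import json

# Assumed model slug -- verify against Together AI's model list.
MODEL = "Qwen/Qwen3-Coder-Next"
API_URL = "https://api.together.xyz/v1/chat/completions"

def build_request(prompt: str, max_tokens: int = 512) -> dict:
    """Build an OpenAI-compatible chat completion payload."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = build_request("Write a Python function that reverses a linked list.")
print(json.dumps(payload, indent=2))

# To send it (requires the `requests` package and an API key):
# import os, requests
# resp = requests.post(
#     API_URL,
#     headers={"Authorization": f"Bearer {os.environ['TOGETHER_API_KEY']}"},
#     json=payload,
# )
# print(resp.json()["choices"][0]["message"]["content"])
```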
Model card
Architecture Overview:
• Mixture-of-Experts (MoE) architecture with 80B total parameters and 3B activated parameters
• 48 layers with hybrid layout: 12 × (3 × (Gated DeltaNet → MoE) → 1 × (Gated Attention → MoE))
• Gated Attention: 16 attention heads for Q, 2 for KV, head dimension 256, rotary position embedding dimension 64
• Gated DeltaNet: 32 linear attention heads for V, 16 for QK, head dimension 128
• 512 experts with 10 activated per token, 1 shared expert, expert intermediate dimension 512
• Hidden dimension 2048 with 79B non-embedding parameters
• 256K context length (262,144 tokens natively)
• Non-thinking mode only—does not generate thinking blocks
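The hybrid layer layout above can be sanity-checked with a few lines of arithmetic: 12 blocks of (3 Gated DeltaNet layers followed by 1 Gated Attention layer), each paired with an MoE FFN, gives the stated 48 layers. This is an illustrative sketch of the published layout, not the model's implementation.

```python
# Sketch of the hybrid layout: 12 x (3 x Gated DeltaNet -> 1 x Gated Attention),
# each layer followed by an MoE block (not modeled here).
BLOCKS = 12
PATTERN = ["gated_deltanet"] * 3 + ["gated_attention"]

layers = PATTERN * BLOCKS
deltanet = layers.count("gated_deltanet")    # linear-attention layers
attention = layers.count("gated_attention")  # full-attention layers

# Per token, 10 routed experts (of 512) plus 1 shared expert are active.
active_expert_frac = (10 + 1) / 512

print(len(layers), deltanet, attention)  # 48 36 12
```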
Training Methodology:
• Pretraining and post-training stages optimized for coding agents
• Training recipe targeting long-horizon reasoning and complex tool usage
• Specialized training for execution failure recovery and dynamic coding tasks
• Trained for seamless integration with diverse development environments
Performance Characteristics:
• Ultra-efficient: 3B activated parameters achieving performance comparable to models with 30-60B active parameters
• Leading agentic coding: 74.2% SWE-Bench Verified (w/ SWE-Agent), 63.7% SWE-Bench Multilingual, 44.3% SWE-Bench Pro
• Strong autonomous coding: 69.9% Aider, 39.3% Terminal-Bench 2.0 (w/ Terminus-2 json)
• Outperforms larger models: beats DeepSeek-V3.2 (37B active), GLM-4.7 (32B active), MiniMax M2.1 (10B active)
• Cost-effective deployment: 10-20x parameter efficiency advantage for agent workloads
• Advanced tool calling capabilities with native support for complex function orchestration
Applications & use cases
Agentic Software Development:
• Production-level autonomous coding: 74.2% SWE-Bench Verified, 63.7% SWE-Bench Multilingual, 44.3% SWE-Bench Pro
• Long-horizon reasoning across complex codebases with 256K context
• Execution failure recovery—adapts when plans don't work as expected
• Multi-step development workflows with precision tool invocation
• Repository-scale navigation and bug fixing
• Code review, refactoring, and optimization tasks
Advanced Tool Calling & Orchestration:
• Native support for complex function calling and tool orchestration
• Dynamic tool selection and sequential execution
• Error handling and recovery from tool execution failures
• Multi-tool workflows for comprehensive development tasks
• Function definition, invocation, and result processing
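The function definition/invocation/result loop above can be sketched with an OpenAI-style tool schema and a local dispatcher. The tool name `run_tests` and its handler are hypothetical, and the exact tool-calling format Together AI expects may differ; consult their function-calling documentation.

```python
import json

# Illustrative OpenAI-style tool definition (hypothetical tool).
tools = [{
    "type": "function",
    "function": {
        "name": "run_tests",
        "description": "Run the project's test suite and return failures.",
        "parameters": {
            "type": "object",
            "properties": {
                "path": {"type": "string", "description": "Test file or directory."},
            },
            "required": ["path"],
        },
    },
}]

def handle_tool_call(call: dict) -> str:
    """Dispatch a model-issued tool call to a local handler (stub)."""
    args = json.loads(call["arguments"])  # arguments arrive as a JSON string
    if call["name"] == "run_tests":
        return f"ran tests in {args['path']}: 0 failures"
    return "unknown tool"

# A tool call as the model might emit it; the result would be sent back
# to the model as a tool-role message for the next turn.
print(handle_tool_call({"name": "run_tests", "arguments": '{"path": "tests/"}'}))
```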
Autonomous Coding Assistance:
• Code generation from natural language descriptions
• Automated testing and test case generation
• Documentation generation and code commenting
• Debugging and error diagnosis with suggested fixes
• 69.9% Aider performance—strong autonomous coding assistance
Cost-Effective Agent Deployment:
• Ultra-efficient: 3B activated parameters performing like 30-60B models
• 10-20x parameter efficiency advantage reduces infrastructure costs
• Highly cost-effective at $0.50/$1.20 per 1M input/output tokens for production agent workloads
• Scales from prototyping to production without cost explosion
• Ideal for startups and enterprises deploying coding agents at scale
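A back-of-envelope cost estimate using the listed prices ($0.50 per 1M input tokens, $1.20 per 1M output tokens); the episode sizes in the example are illustrative, not measurements.

```python
# Listed serverless prices per 1M tokens.
INPUT_PER_M = 0.50
OUTPUT_PER_M = 1.20

def run_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one run at the listed per-million-token prices."""
    return input_tokens / 1e6 * INPUT_PER_M + output_tokens / 1e6 * OUTPUT_PER_M

# e.g. one agent episode: 200K tokens of context in, 20K tokens of patches out
print(f"${run_cost(200_000, 20_000):.3f}")  # $0.124
```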
Development Workflow Automation:
• End-to-end feature implementation from specification to working code
• Automated code migration and refactoring across files
• Batch processing of code changes across repositories
• CI/CD pipeline integration for automated code generation
• Technical debt reduction through automated refactoring
- Type: Code Chat LLM
- Main use cases: Chat, Coding Agents
- Speed: Very High
- Intelligence: High
- Deployment: Serverless, Monthly Reserved
- Endpoint:
- Parameters: 79.7B
- Context length: 262K
- Input price: $0.50 / 1M tokens
- Output price: $1.20 / 1M tokens
- Input modalities: Text
- Output modalities: Text
- Released: February 1, 2026
- Last updated: February 2, 2026
- Quantization level: FP8
- External link:
- Category: Code