Qwen3.6-Plus
Multimodal agentic coding model with 1M context and visual reasoning
About model
Qwen3.6-Plus is Qwen's multimodal agentic model, built on a hybrid architecture that combines efficient linear attention with sparse MoE routing and a 1M-token context window. It delivers strong agentic coding performance, scoring 78.8% on SWE-Bench Verified and 61.6 on Terminal-Bench 2.0, alongside 90.4% on GPQA Diamond for reasoning and 86.0% on MMMU for multimodal understanding. The model supports visual coding (from UI screenshots to production code), video understanding, and a thinking mode with preserve_thinking for maintaining reasoning context across multi-turn agent sessions.
- 90.40% GPQA Diamond (scientific and technical reasoning)
- 78.80% SWE-Bench Verified (repository-level autonomous coding)
- 61.6 Terminal-Bench 2.0 (terminal automation and agentic coding)
- Agentic Coding: 78.8% SWE-Bench Verified and 61.6 Terminal-Bench 2.0 with strong frontend development for 3D scenes, games, and interactive web applications
- Visual Coding & Multimodal: Generate production code from UI screenshots and design mockups, with 91.2 OmniDocBench for document understanding and 86.0% MMMU for multimodal reasoning
- Reasoning with Thinking Mode: 90.4% GPQA Diamond with preserve_thinking for maintaining full reasoning context across multi-turn agentic sessions
- 1M Context: Process entire codebases, long documents, and complex multi-turn workflows on a hybrid linear attention + sparse MoE architecture
| Model | AIME 2025 | GPQA Diamond | HLE | LiveCodeBench | MATH500 | SWE-bench Verified |
|---|---|---|---|---|---|---|
| Qwen3.6-Plus | | 90.40% | | 87.10% | | 78.80% |
API usage
Endpoint: Qwen/Qwen3.6-Plus
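A minimal request sketch, assuming the standard OpenAI-compatible chat completions payload shape and an API key in the `TOGETHER_API_KEY` environment variable; the helper names below are illustrative, not part of any SDK:

```python
import json
import os

# Illustrative helper: builds an OpenAI-style chat completions payload
# for Qwen/Qwen3.6-Plus. Sending it would POST to the Together AI
# chat completions endpoint with the headers below.
def build_chat_request(prompt: str) -> dict:
    return {
        "model": "Qwen/Qwen3.6-Plus",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 1024,
    }

def auth_headers() -> dict:
    # Authenticate with your Together AI API key in the request headers.
    return {
        "Authorization": f"Bearer {os.environ.get('TOGETHER_API_KEY', '')}",
        "Content-Type": "application/json",
    }

payload = build_chat_request("Summarize this repository's build steps.")
print(json.dumps(payload, indent=2))
```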
Model card
Architecture Overview:
• Hybrid architecture combining efficient linear attention with sparse mixture-of-experts routing
• 1M token context window by default
• Multimodal: text, image, and video input with visual coding and visual agent capabilities
• Thinking mode with preserve_thinking option for maintaining full reasoning context across multi-turn agentic tasks
• Reduces redundant reasoning and overall token consumption in agent scenarios when preserve_thinking is enabled
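One way the preserve_thinking behavior above can be sketched: the agent loop carries each turn's reasoning trace forward in the transcript instead of discarding it, so later turns need not re-derive earlier conclusions. The field names used here (`preserve_thinking`, `reasoning`) are illustrative assumptions about the payload shape, not a confirmed API schema:

```python
# Sketch of a multi-turn agent loop that keeps reasoning context.
# "preserve_thinking" and "reasoning" are assumed field names,
# used for illustration only.
def next_request(history: list, user_msg: str) -> dict:
    return {
        "model": "Qwen/Qwen3.6-Plus",
        "preserve_thinking": True,  # assumed flag
        "messages": history + [{"role": "user", "content": user_msg}],
    }

def record_turn(history: list, answer: str, reasoning: str) -> list:
    # The reasoning trace stays in the transcript alongside the answer.
    return history + [
        {"role": "assistant", "content": answer, "reasoning": reasoning}
    ]

history: list = []
history = record_turn(history, "The bug is in parser.py", "Traced the stack...")
req = next_request(history, "Now write a failing test for it.")
print(len(req["messages"]))  # 2: one assistant turn plus the new user turn
```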
Training Methodology:
• Built on the Qwen3.5-Plus foundation with targeted improvements for agentic coding, frontend development, and multimodal perception
• Deep integration of reasoning, memory, and execution capabilities for agentic workflows
• Optimized for complex frontend development including 3D scenes, games, and interactive web applications
• Enhanced multimodal training for document understanding, chart parsing, UI understanding, and video reasoning
Performance Characteristics:
• 90.4% GPQA Diamond for scientific reasoning
• 87.1% LiveCodeBench v6 for code generation
• 78.8% SWE-Bench Verified for autonomous software engineering
• 61.6 Terminal-Bench 2.0 for terminal automation and agentic coding
• 91.2 OmniDocBench for document understanding
• 86.0% MMMU for multimodal understanding
• Strong multilingual performance across coding and reasoning benchmarks
Prompting
Together AI API Access:
• Access Qwen3.6-Plus via Together AI APIs using the endpoint Qwen/Qwen3.6-Plus
• Authenticate using your Together AI API key in request headers
• Supports thinking mode with optional preserve_thinking for maintaining reasoning context across agent turns
• Multimodal input: text, image, and video
• $0.50 input / $3.00 output per million tokens
• Available on Together AI serverless infrastructure
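At the listed prices, a rough per-request cost follows directly from token counts; for example, a 200K-token codebase prompt with a 4K-token reply:

```python
# Cost estimate at the listed prices: $0.50 per million input tokens,
# $3.00 per million output tokens.
INPUT_PRICE = 0.50 / 1_000_000
OUTPUT_PRICE = 3.00 / 1_000_000

def request_cost(input_tokens: int, output_tokens: int) -> float:
    return input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE

# A 200K-token codebase prompt with a 4K-token reply:
print(f"${request_cost(200_000, 4_000):.4f}")  # → $0.1120
```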
Applications & use cases
Agentic Coding:
• Repository-level problem solving with 78.8% SWE-Bench Verified
• Terminal automation and complex task execution with 61.6 Terminal-Bench 2.0
• Frontend web development: 3D scenes, games, interactive applications with strong visual fidelity
• Compatible with coding agent frameworks for production development workflows
Visual Coding & Multimodal Agents:
• Generate frontend code from UI screenshots, product prototypes, and design mockups
• Visual reasoning for document understanding (91.2 OmniDocBench), chart parsing, and UI analysis
• Video understanding with temporal reasoning across frames
• Perception-to-action loop for GUI agent scenarios
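The screenshot-to-code flow above can be sketched with the common OpenAI-style multimodal message format, where an image URL and a text instruction are passed as separate content parts; the exact content-part schema accepted for this model is assumed here:

```python
# Sketch of a visual-coding request: a UI screenshot plus an instruction
# to generate frontend code. The content-part schema (type/image_url)
# follows the common OpenAI-style convention and is an assumption here.
def screenshot_to_code_request(image_url: str, instruction: str) -> dict:
    return {
        "model": "Qwen/Qwen3.6-Plus",
        "messages": [{
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": image_url}},
                {"type": "text", "text": instruction},
            ],
        }],
    }

req = screenshot_to_code_request(
    "https://example.com/mockup.png",
    "Generate a responsive HTML/CSS implementation of this mockup.",
)
print(req["messages"][0]["content"][0]["type"])  # → image_url
```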
Reasoning & Long Context:
• 90.4% GPQA Diamond for scientific and technical reasoning
• Thinking mode with preserve_thinking for consistent decision-making across long agentic sessions
• 1M token context for processing entire codebases, long documents, and complex multi-turn workflows
• Multilingual reasoning and tool-calling across diverse domains
- Type: Reasoning, Vision, Video, Chat
- Main use cases: Reasoning
- Features: Function Calling, JSON Mode
- Deployment: Serverless
- Endpoint: Qwen/Qwen3.6-Plus
- Context length: 1M
- Input price: $0.50 / 1M tokens
- Output price: $3.00 / 1M tokens
- Input modalities: Text, Image, Video
- Output modalities: Text
- Released: April 1, 2026
- Category: Chat