Reasoning
Vision
Video
Chat

Qwen3.6-Plus

Multimodal agentic coding model with 1M context and visual reasoning

About model

Qwen3.6-Plus is Qwen's multimodal agentic model, built on a hybrid architecture that combines efficient linear attention with sparse MoE routing and a 1M-token context window. It delivers strong agentic coding performance, scoring 78.8% on SWE-Bench Verified and 61.6 on Terminal-Bench 2.0, alongside 90.4% on GPQA Diamond for reasoning and 86.0% on MMMU for multimodal understanding. The model supports visual coding from UI screenshots to production code, video understanding, and a thinking mode with preserve_thinking for maintaining reasoning context across multi-turn agent sessions.

GPQA Diamond

90.4%

Scientific and technical reasoning

SWE-Bench Verified

78.8%

Repository-level autonomous coding

Terminal-Bench 2.0

61.6

Terminal automation and agentic coding

Model key capabilities
  • Agentic Coding: 78.8% SWE-Bench Verified and 61.6 Terminal-Bench 2.0 with strong frontend development for 3D scenes, games, and interactive web applications
  • Visual Coding & Multimodal: Generate production code from UI screenshots and design mockups, with 91.2 OmniDocBench for document understanding and 86.0% MMMU for multimodal reasoning
  • Reasoning with Thinking Mode: 90.4% GPQA Diamond with preserve_thinking for maintaining full reasoning context across multi-turn agentic sessions
  • 1M Context: Process entire codebases, long documents, and complex multi-turn workflows on a hybrid linear attention + sparse MoE architecture
Performance benchmarks

| Model | AIME 2025 | GPQA Diamond | HLE | LiveCodeBench | MATH500 | SWE-Bench Verified |
|---|---|---|---|---|---|---|
| Qwen3.6-Plus | — | 90.4% | — | 87.1% | — | 78.8% |

Competitor closed-source models

| Model | AIME 2025 | GPQA Diamond | HLE | LiveCodeBench | MATH500 | SWE-Bench Verified |
|---|---|---|---|---|---|---|
| Claude Opus 4.6 | — | 90.5% | 34.2% | — | — | 78.7% |
| OpenAI o3 | — | 83.3% | 24.9% | — | 99.2% | 62.3% |
| OpenAI o1 | — | 76.8% | — | — | 96.4% | 48.9% |
| GPT-4o | — | 49.2% | 2.7% | 32.3% | 89.3% | 31.0% |

  • API usage

    • cURL
    • Python
    • TypeScript

    Endpoint:

    Qwen/Qwen3.6-Plus

    curl -X POST "https://api.together.xyz/v1/chat/completions" \
      -H "Authorization: Bearer $TOGETHER_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "Qwen/Qwen3.6-Plus",
        "messages": [
          {
            "role": "user",
            "content": "What are some fun things to do in New York?"
          }
        ]
    }'
    
    from together import Together
    
    client = Together()
    
    response = client.chat.completions.create(
      model="Qwen/Qwen3.6-Plus",
      messages=[
        {
          "role": "user",
          "content": "What are some fun things to do in New York?"
        }
      ]
    )
    print(response.choices[0].message.content)
    
    import Together from 'together-ai';
    const together = new Together();
    
    const completion = await together.chat.completions.create({
      model: 'Qwen/Qwen3.6-Plus',
      messages: [
        {
          role: 'user',
          content: 'What are some fun things to do in New York?'
         }
      ],
    });
    
    console.log(completion.choices[0].message.content);
    
  • Model card

    Architecture Overview:
    • Hybrid architecture combining efficient linear attention with sparse mixture-of-experts routing
    • 1M token context window by default
    • Multimodal: text, image, and video input with visual coding and visual agent capabilities
    • Thinking mode with preserve_thinking option for maintaining full reasoning context across multi-turn agentic tasks
    • Reduces redundant reasoning and overall token consumption in agent scenarios when preserve_thinking is enabled

    Training Methodology:
    • Built on the Qwen3.5-Plus foundation with targeted improvements for agentic coding, frontend development, and multimodal perception
    • Deep integration of reasoning, memory, and execution capabilities for agentic workflows
    • Optimized for complex frontend development including 3D scenes, games, and interactive web applications
    • Enhanced multimodal training for document understanding, chart parsing, UI understanding, and video reasoning

    Performance Characteristics:
    • 90.4% GPQA Diamond for scientific reasoning
    • 87.1% LiveCodeBench v6 for code generation
    • 78.8% SWE-Bench Verified for autonomous software engineering
    • 61.6 Terminal-Bench 2.0 for terminal automation and agentic coding
    • 91.2 OmniDocBench for document understanding
    • 86.0% MMMU for multimodal understanding
    • Strong multilingual performance across coding and reasoning benchmarks

  • Prompting

    Together AI API Access:
    • Access Qwen3.6-Plus via Together AI APIs using the endpoint Qwen/Qwen3.6-Plus
    • Authenticate using your Together AI API key in request headers
    • Supports thinking mode with optional preserve_thinking for maintaining reasoning context across agent turns
    • Multimodal input: text, image, and video
    • $0.50 input / $3.00 output per million tokens
    • Available on Together AI serverless infrastructure
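
    The preserve_thinking option mentioned above can be sketched as a request builder. This is a minimal illustration, not the documented wire format: the parameter name comes from this page, but where it sits in the request body is an assumption.

    ```python
    # Sketch: enabling thinking mode with preserve_thinking for a multi-turn
    # agent session. Parameter placement (top level of the request body) is
    # an assumption; only the name itself comes from the model documentation.

    def build_request(messages, preserve_thinking=True):
        """Build an OpenAI-compatible chat payload for Qwen3.6-Plus."""
        return {
            "model": "Qwen/Qwen3.6-Plus",
            "messages": messages,
            # Keep prior turns' reasoning in context so the agent does not
            # re-derive earlier decisions on every turn.
            "preserve_thinking": preserve_thinking,
        }

    history = [{"role": "user", "content": "Refactor utils.py and remove dead code."}]
    payload = build_request(history)
    ```

    With the Python SDK shown earlier, such a payload maps onto client.chat.completions.create(**payload); a non-standard field like preserve_thinking may need to go through whatever extra-parameter mechanism the SDK provides.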

  • Applications & use cases

    Agentic Coding:
    • Repository-level problem solving with 78.8% SWE-Bench Verified
    • Terminal automation and complex task execution with 61.6 Terminal-Bench 2.0
    • Frontend web development: 3D scenes, games, interactive applications with strong visual fidelity
    • Compatible with coding agent frameworks for production development workflows
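
    One turn of such an agent loop can be sketched with OpenAI-compatible function calling (Function Calling is listed under Features on this page). The run_shell tool and its schema are illustrative, not part of the API:

    ```python
    # Sketch of one agentic-coding step using OpenAI-compatible function
    # calling. The run_shell tool below is a hypothetical example tool.

    RUN_SHELL_TOOL = {
        "type": "function",
        "function": {
            "name": "run_shell",
            "description": "Run a shell command in the repository and return stdout.",
            "parameters": {
                "type": "object",
                "properties": {"command": {"type": "string"}},
                "required": ["command"],
            },
        },
    }

    def agent_step(messages):
        """Build the request for one tool-using turn of a coding agent."""
        return {
            "model": "Qwen/Qwen3.6-Plus",
            "messages": messages,
            "tools": [RUN_SHELL_TOOL],
            "tool_choice": "auto",
        }

    req = agent_step([{"role": "user", "content": "Run the test suite and fix any failure."}])
    ```

    In a real loop, the agent framework would execute any returned tool call, append the result as a tool message, and call agent_step again until the model stops requesting tools.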

    Visual Coding & Multimodal Agents:
    • Generate frontend code from UI screenshots, product prototypes, and design mockups
    • Visual reasoning for document understanding (91.2 OmniDocBench), chart parsing, and UI analysis
    • Video understanding with temporal reasoning across frames
    • Perception-to-action loop for GUI agent scenarios
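
    A screenshot-to-code request can be sketched with the OpenAI-compatible multimodal message format (text plus image_url content blocks); the instruction text and data-URL packaging here are illustrative:

    ```python
    import base64

    # Sketch: pack a UI screenshot and an instruction into one user message
    # using OpenAI-compatible multimodal content blocks.

    def screenshot_to_code_message(png_bytes: bytes):
        """Build a user message asking the model to turn a mockup into code."""
        b64 = base64.b64encode(png_bytes).decode("ascii")
        return {
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Generate a responsive HTML/CSS page matching this mockup."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }

    msg = screenshot_to_code_message(b"\x89PNG...")  # real PNG bytes in practice
    ```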

    Reasoning & Long Context:
    • 90.4% GPQA Diamond for scientific and technical reasoning
    • Thinking mode with preserve_thinking for consistent decision-making across long agentic sessions
    • 1M token context for processing entire codebases, long documents, and complex multi-turn workflows
    • Multilingual reasoning and tool-calling across diverse domains
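
    Before packing an entire codebase into the 1M-token window, a rough pre-flight estimate helps. The 4-characters-per-token ratio below is a common rule of thumb, not the model's actual tokenizer, and the headroom value is illustrative:

    ```python
    # Rough pre-flight check before sending a codebase to the 1M-token
    # window. Leaves headroom for the response and any thinking tokens.

    CONTEXT_LIMIT = 1_000_000
    RESPONSE_HEADROOM = 64_000  # illustrative reserve for output + reasoning

    def fits_in_context(files: dict[str, str]) -> bool:
        """Estimate whether the given {path: source} map fits the window."""
        est_tokens = sum(len(src) for src in files.values()) // 4
        return est_tokens + RESPONSE_HEADROOM <= CONTEXT_LIMIT

    print(fits_in_context({"main.py": "print('hi')\n" * 1000}))  # prints True
    ```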

Model details
  • Model provider
    Qwen
  • Type
    Reasoning
    Vision
    Video
    Chat
  • Main use cases
    Reasoning
  • Features
    Function Calling
    JSON Mode
  • Deployment
    Serverless
  • Context length
    1M
  • Input price

    $0.50 / 1M tokens

  • Output price

    $3.00 / 1M tokens

  • Input modalities
    Text
    Image
    Video
  • Output modalities
    Text
  • Released
    April 1, 2026
  • Category
    Chat