Models / Qwen
Chat
LLM

Qwen3-Next-80B-A3B-Thinking

Next-generation reasoning model with extreme efficiency

About model

Advanced Reasoning Engine:
Qwen3-Next Thinking pairs the same highly sparse MoE architecture as the base Qwen3-Next with specialized post-training for complex reasoning tasks. It supports only thinking mode (the chat template includes the thinking tag automatically), delivers exceptional analytical performance with more than 10x higher throughput on long contexts, and may generate longer thinking content than its predecessors.
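Because the chat template injects the opening thinking tag automatically, the raw completion typically carries only the closing `</think>` marker. A minimal sketch, assuming that output shape, for separating the reasoning trace from the final answer:

```python
def split_thinking(raw_output: str) -> tuple[str, str]:
    """Split a thinking-mode completion into (reasoning, answer).

    Assumes the chat template already injected the opening tag,
    so the completion contains only the closing </think> marker.
    """
    marker = "</think>"
    if marker in raw_output:
        reasoning, answer = raw_output.split(marker, 1)
        return reasoning.strip(), answer.strip()
    # No marker found: treat everything as the final answer.
    return "", raw_output.strip()

raw = "First consider the constraints...\n</think>\nThe answer is 42."
reasoning, answer = split_thinking(raw)
```

This keeps the thinking trace available for inspection (e.g. reasoning transparency in the professional use cases below) while passing only the final answer downstream.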

  • Model card

    Architecture Overview:
    • 48 layers with 2048 hidden dimension and hybrid layout pattern
    • 512 total experts with 10 activated and 1 shared expert per MoE layer
    • Multi-token prediction mechanism for faster analytical processing
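    Illustrative arithmetic only, using the numbers from this card: with 10 routed plus 1 shared expert active out of 512 per MoE layer, only a small slice of the network fires per token, which is how an 80B-parameter model activates roughly 3B parameters:

    ```python
    TOTAL_EXPERTS = 512
    ACTIVE_EXPERTS = 10 + 1   # 10 routed + 1 shared expert per MoE layer
    TOTAL_PARAMS = 80e9       # 80B total parameters
    ACTIVE_PARAMS = 3e9       # ~3B activated per token

    # Share of experts and of weights actually used for each token.
    expert_fraction = ACTIVE_EXPERTS / TOTAL_EXPERTS
    param_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

    print(f"{expert_fraction:.1%} of experts, {param_fraction:.1%} of parameters active")
    ```

    The parameter fraction (about 3.75%) is larger than the expert fraction (about 2.1%) because attention layers, embeddings, and the shared expert are always active.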

    Thinking Mode Optimization:
    • Supports only thinking mode with automatic tag inclusion
    • Pre-trained on 15T tokens, with specialized post-training for complex reasoning chains
    • May generate longer thinking content for comprehensive analysis

    Performance Characteristics:
    • 262K native context length, extensible to 1M tokens with YaRN scaling
    • More than 10x higher throughput on contexts over 32K tokens
    • SGLang and vLLM deployment support with Multi-Token Prediction
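    A sketch of the YaRN scaling implied by the numbers above: a factor of 4 over the 262,144-token native window reaches roughly 1M tokens. The `rope_scaling` keys below follow the common Hugging Face config convention and are an assumption here; verify them against the official model card before deploying:

    ```python
    # Hypothetical rope_scaling entry for YaRN context extension
    # (key names assumed from the Hugging Face config convention).
    rope_scaling = {
        "rope_type": "yarn",
        "factor": 4.0,
        "original_max_position_embeddings": 262_144,  # native context length
    }

    extended = int(rope_scaling["factor"]
                   * rope_scaling["original_max_position_embeddings"])
    # 262,144 tokens x 4 = 1,048,576 tokens (~1M)
    ```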

  • Applications & use cases

    Research & Analysis:
    • Scientific research and hypothesis generation with detailed reasoning
    • Complex data analysis and pattern recognition
    • Academic writing and literature review with analytical depth

    Problem Solving:
    • Engineering design challenges requiring multi-step analysis
    • Strategic business planning and decision-making support
    • Mathematical problem solving and proof generation

    Professional Applications:
    • Legal case analysis and argument construction
    • Medical diagnosis assistance with reasoning transparency
    • Financial modeling and risk assessment with detailed rationale

Model details
  • Model provider
    Qwen
  • Type
    Chat
    LLM
  • Main use cases
    Chat
    Reasoning
  • Deployment
    On-Demand Dedicated
  • Parameters
    80B
  • Activated parameters
    3B
  • Context length
    256K
  • Input price

    $0.15 / 1M tokens

  • Output price

    $1.50 / 1M tokens

  • Input modalities
    Text
  • Output modalities
    Text
  • Released
    September 9, 2025
  • Last updated
    February 24, 2026
  • Category
    Chat
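
Putting the two listed prices together, a back-of-envelope cost helper (rates copied from this page; a sketch for estimation, not billing logic):

```python
INPUT_PRICE_PER_M = 0.15   # USD per 1M input tokens (from this page)
OUTPUT_PRICE_PER_M = 1.50  # USD per 1M output tokens (from this page)

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Rough USD cost of one request at the listed rates."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# Example: a long-context request with a lengthy thinking trace.
cost = estimate_cost(input_tokens=200_000, output_tokens=20_000)
```

Note that thinking tokens are part of the output, so longer thinking traces raise cost at the higher output rate.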