Qwen3 235B A22B Thinking 2507 FP8

235B-parameter MoE thinking model, 256K context, 22B activated experts, state-of-the-art reasoning performance among open-source models.

Try now

read docs

About model

Qwen3 235B A22B Thinking 2507 is a causal language model that excels in complex reasoning tasks, including logical reasoning, mathematics, and science, achieving state-of-the-art results. It features enhanced 256K long-context understanding and improved performance on academic benchmarks. Suitable for users requiring advanced thinking capabilities.

Quickstart guides

Building a RAG Workflow

Performance benchmarks

Model	GPQA Diamond	HLE	LiveCodeBench	MATH500	SWE-bench verified
Qwen3 235B A22B Thinking 2507 FP8	80.1%
Related open-source models
Competitor closed-source models
Claude Opus 4.6	90.5%	34.2%			78.7%
OpenAI o3	83.3%	24.9%		99.2%	62.3%
OpenAI o1	76.8%			96.4%	48.9%
GPT-4o	49.2%	2.7%	32.3%	89.3%	31.0%

Model card
Architecture Overview:
• Mixture-of-Experts transformer with 235B total parameters and 22B activated
• 94 layers with grouped query attention (64 for Q and 4 for KV)
• 128 experts with 8 activated experts per token for efficient inference
• Native 262,144 token context window for extensive document processing

Training Methodology:
• Advanced pretraining & post-training pipeline with thinking capability enhancement
• Specialized training for logical reasoning, mathematics, science, and coding tasks
• Constitutional AI training for alignment with human preferences
• Optimized for complex multi-step reasoning with increased thinking length

Performance Characteristics:
• State-of-the-art results among open-source thinking models on academic benchmarks
• Exceptional performance on AIME25, HMMT25, and LiveCodeBench evaluations
• Enhanced 256K long-context understanding capabilities
• Optimized inference with MoE architecture for computational efficiency
‍
Applications & use cases
Advanced Reasoning Applications:
• Complex mathematical problem solving & academic research
• Scientific analysis requiring multi-step logical reasoning
• Advanced coding challenges & algorithm development
• Academic benchmarking & competitive programming

Enterprise & Professional Use:
• High-complexity business analysis & strategic planning
• Technical documentation with detailed reasoning chains
• Expert-level consultation systems requiring deep thinking
• Research & development applications in specialized domains

Educational & Research:
• Graduate-level tutoring in STEM subjects
• Research paper analysis & academic writing assistance
• Complex problem decomposition for educational purposes
• Multilingual academic support across 119+ languages

Developer Integration:
• Tool calling capabilities with Qwen-Agent framework
• OpenAI-compatible API endpoints via SGLang or vLLM
• Integration with local deployment tools (Ollama, LMStudio, llama.cpp)
• Custom reasoning workflows for specialized applications
‍

Related models

Model specifications

Model data

Model provider
Qwen
Type
Chat
Reasoning
Main use cases
Chat
Reasoning
Function Calling
Features
Function Calling
JSON Mode
Deployment
On-Demand Dedicated
Monthly Reserved
Parameters
235B
Activated parameters
22B
Context length
256K
Input price
$0.65 / 1M tokens
Output price
$3.00 / 1M tokens
Input modalities
Text
Output modalities
Text

Released
July 24, 2025
Quantization level
FP8
External link
Provider docs
Category
Chat

Quickstart docs

Deploy model

Qwen3 235B A22B Thinking 2507 FP8

About model

Model card

Applications & use cases