Qwen3 235B A22B FP8 Throughput
Hybrid instruct + reasoning model (235B-parameter MoE with 22B active parameters) optimized for high-throughput, cost-efficient inference and distillation.
About the model
Qwen3-235B-A22B-FP8 Throughput delivers groundbreaking advancements in reasoning, instruction-following, and multilingual support, with seamless switching between thinking and non-thinking modes. It excels in creative writing, role-playing, and complex agent-based tasks, supporting 100+ languages and dialects. Ideal for developers and researchers seeking optimal performance across various scenarios.
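The thinking/non-thinking switch described above can be toggled per request. A minimal sketch, assuming an OpenAI-compatible chat endpoint and Qwen3's documented soft-switch tags (`/think`, `/no_think`); the model slug and token budgets here are illustrative placeholders, not the platform's actual values:

```python
# Sketch: toggling Qwen3's thinking vs. non-thinking mode per request.
# Qwen3 honors soft-switch tags ("/think", "/no_think") appended to the
# user turn; the model slug below is a placeholder.

def build_chat_request(prompt: str, thinking: bool) -> dict:
    """Build an OpenAI-compatible chat payload with the mode tag appended."""
    tag = "/think" if thinking else "/no_think"
    return {
        "model": "Qwen3-235B-A22B-FP8-Throughput",  # placeholder slug
        "messages": [{"role": "user", "content": f"{prompt} {tag}"}],
        # Reasoning traces need more output headroom within the 40K context.
        "max_tokens": 4096 if thinking else 1024,
    }

payload = build_chat_request("Prove that sqrt(2) is irrational.", thinking=True)
print(payload["messages"][0]["content"])
```

Sending the same payload with `thinking=False` suppresses the reasoning trace, which is usually the better trade-off for latency-sensitive chat traffic.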
| Model | AIME 2025 | GPQA Diamond |
|---|---|---|
| Qwen3 235B A22B FP8 Throughput | 70.7% | 65.9% |
- Type: Chat, Reasoning
- Main use cases: Chat, Reasoning, Medium General Purpose, Function Calling
- Features: Function Calling
- Deployment: On-Demand Dedicated, Monthly Reserved
- Parameters: 235.1B
- Context length: 40K tokens
- Input price: $0.20 / 1M tokens
- Output price: $0.60 / 1M tokens
- Input modalities: Text
- Output modalities: Text
- Released: April 28, 2025
- Last updated: February 5, 2026
- Quantization level: FP8
- Category: Chat
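The listed prices make per-request cost easy to estimate. A minimal sketch using the rates above ($0.20 per 1M input tokens, $0.60 per 1M output tokens); the function name is illustrative:

```python
# Sketch: estimating inference cost from the listed per-token prices.

INPUT_PRICE_PER_M = 0.20   # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 0.60  # USD per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the listed rates."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# e.g. a 10K-token prompt with a 2K-token completion:
print(f"${estimate_cost(10_000, 2_000):.4f}")  # → $0.0032
```

Output tokens cost 3x input tokens here, so long reasoning traces dominate the bill; capping `max_tokens` in non-thinking mode is the main cost lever.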