
Qwen3 235B A22B FP8 Throughput

Hybrid instruct + reasoning model (235B-parameter MoE, 22B active) optimized for high-throughput, cost-efficient inference and distillation.

About model

Qwen3-235B-A22B-FP8 Throughput delivers groundbreaking advancements in reasoning, instruction-following, and multilingual support, with seamless switching between thinking and non-thinking modes. It excels in creative writing, role-playing, and complex agent-based tasks, supporting 100+ languages and dialects. Ideal for developers and researchers seeking optimal performance across various scenarios.
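The switch between thinking and non-thinking modes can be driven per request. A minimal sketch of building such requests, using Qwen3's documented `/think` and `/no_think` soft switches and its recommended per-mode sampling temperatures; the model ID is a placeholder and no real endpoint is called:

```python
# Sketch: building OpenAI-style chat payloads that toggle Qwen3's
# thinking mode. /think and /no_think are Qwen3's documented prompt
# soft switches; the model ID below is a placeholder.

def build_request(prompt: str, thinking: bool) -> dict:
    """Return a chat-completions payload with the mode switch appended."""
    switch = "/think" if thinking else "/no_think"
    return {
        "model": "qwen3-235b-a22b-fp8",  # placeholder model ID
        "messages": [{"role": "user", "content": f"{prompt} {switch}"}],
        # Qwen3's recommended sampling differs per mode:
        "temperature": 0.6 if thinking else 0.7,
    }

reasoning_req = build_request("Prove that sqrt(2) is irrational.", thinking=True)
chat_req = build_request("Summarize this paragraph.", thinking=False)
```

Sending the payload is left to whichever OpenAI-compatible client the platform exposes.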

Performance benchmarks

[Benchmark table: Qwen3-235B-A22B vs. related open-source models and competitor closed-source models (Claude Opus 4.6, OpenAI o3, OpenAI o1, GPT-4o) on AIME 2025, GPQA Diamond, HLE, LiveCodeBench, MATH500, and SWE-bench Verified. Per-cell scores were lost in extraction and cannot be reliably attributed to individual benchmarks.]

    • Model provider
      Qwen
    • Type
      Chat
      Reasoning
    • Main use cases
      Chat
      Reasoning
      Medium General Purpose
      Function Calling
    • Features
      Function Calling
    • Deployment
      On-Demand Dedicated
      Monthly Reserved
    • Parameters
      235.1B
    • Context length
      40k
    • Input price
      $0.20 / 1M tokens
    • Output price
      $0.60 / 1M tokens
    • Input modalities
      Text
    • Output modalities
      Text
    • Released
      April 28, 2025
    • Last updated
      February 5, 2026
    • Quantization level
      FP8
    • Category
      Chat
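At the listed rates, per-request cost is a linear function of token counts. A quick sketch using the input and output prices above (the token counts are illustrative):

```python
# Estimate request cost from the listed per-million-token prices.
INPUT_PRICE = 0.20 / 1_000_000   # $ per input token
OUTPUT_PRICE = 0.60 / 1_000_000  # $ per output token

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the listed rates."""
    return input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE

# e.g. a 30K-token prompt with a 2K-token reply:
cost = request_cost(30_000, 2_000)  # $0.0072
```

Note that thinking-mode responses typically emit many more output tokens, so the $0.60/1M output rate dominates for reasoning-heavy workloads.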