GLM-4.6

Advanced agentic AI with superior coding and reasoning capabilities

Try Now

read docs

About model

GLM-4.6 is the latest flagship model from Z.ai's GLM series, delivering state-of-the-art agentic and coding capabilities that rival Claude Sonnet 4. With 357B parameters in a Mixture-of-Experts architecture, an expanded 200K context window, and 30% improved token efficiency, GLM-4.6 represents the top-performing model developed in China.

Win Rate vs Claude Sonnet 4

48.6%

Real-world coding tasks (CC-Bench)

Context Window

200K

Extended from 128K for complex agentic tasks

More Token Efficient

30%

Compared to GLM-4.5 for equivalent tasks

Model key capabilities

Advanced Agentic Reasoning: Competitive with Claude Sonnet 4 across 8 authoritative benchmarks (AIME, GPQA, LCB, HLE, SWE-bench)
Enhanced Tool Use: Native function calling with autonomous planning and cross-tool collaboration
Refined Writing & Translation: Human-aligned content creation and optimized multilingual capabilities

Quickstart guides

RAG

Building a RAG Workflow

Agents

Agent Workflows

Apps

Next.js Chat Quickstart

Performance benchmarks

Model	AIME 2025	GPQA Diamond	HLE	LiveCodeBench	MATH500	SWE-bench verified
GLM-4.6	93.9%	81.0%	17.2%	82.8%		68.0%
Related open-source models
Competitor closed-source models
Claude Opus 4.6		90.5%	34.2%			78.7%
OpenAI o3		83.3%	24.9%		99.2%	62.3%
OpenAI o1		76.8%			96.4%	48.9%
GPT-4o		49.2%	2.7%	32.3%	89.3%	31.0%

Model card
Architecture Overview:
• Mixture-of-Experts (MoE) architecture with 357B total parameters optimized for efficient inference
• Extended context window from 128K to 200K tokens enabling complex agentic task handling
• Advanced attention mechanisms supporting multi-turn conversations and long-form content generation
• Optimized token efficiency achieving 30% reduction in consumption compared to GLM-4.5

Training Methodology:
• Trained on diverse multilingual datasets with emphasis on code, reasoning, and conversational data
• Enhanced alignment training for human preference matching in writing style and readability
• Specialized training for tool-use capabilities and agentic behavior
• Reinforcement learning from human feedback (RLHF) for improved instruction following

Performance Characteristics:
• Competitive performance with Claude Sonnet 4 across 8 authoritative benchmarks (AIME 25, GPQA, LCB v6, HLE, SWE-Bench Verified)
• 48.6% win rate against Claude Sonnet 4 in real-world coding tasks (CC-Bench evaluation)
• Superior aesthetics and logical layout in frontend code generation
• Enhanced translation quality for minor languages (French, Russian, Japanese, Korean)
• Top-performing model developed in China with state-of-the-art domestic capabilities
‍
Prompting
Conversation Format:
• Multi-turn conversation support with full context retention across 200K tokens
• System message configuration for role definition and behavior customization
• Streaming and non-streaming response modes available
• Thinking mode with tool-use capabilities during inference

Advanced Techniques:
• Recommended temperature: 1.0 for general tasks
• Code-related tasks: top_p=0.95, top_k=40 for optimal results
• Tool-integrated reasoning with native function calling support
• Search-based agent capabilities with specialized toolcall formatting
• Maximum output tokens: 128K for extended generation tasks

Optimization Strategies:
• 15% more token-efficient than GLM-4.5 for equivalent task completion
• Native support for autonomous planning and tool invocation in agentic workflows
• Enhanced task decomposition and cross-tool collaboration capabilities
• Dynamic adjustment support for complex development and office automation workflows
‍
Applications & use cases
AI Coding & Development:
• Superior performance in Python, JavaScript, and Java with aesthetically advanced frontend code generation
• Real-world coding excellence demonstrated across 74 CC-Bench evaluation tasks
• Native integration with popular coding assistants and agent frameworks
• Enhanced debugging, testing, and algorithm implementation capabilities

Agentic Applications:
• Complex multi-step task execution with autonomous planning and tool invocation
• Search-based agents with enhanced user intent understanding and result integration
• Office automation including PowerPoint creation with aesthetically advanced layouts
• Deep Research scenarios with comprehensive information synthesis

Smart Office & Automation:
• High-quality presentation generation with clear logical structures
• Document creation maintaining content integrity and expression accuracy
• Cross-tool collaboration for complex development and office workflows
• Ideal for AI presentation tools and office automation systems

Translation & Multilingual Content:
• Optimized translation for French, Russian, Japanese, Korean and informal contexts
• Semantic coherence and stylistic consistency in lengthy passages
• Superior style adaptation and localized expression for global enterprises
• Suitable for social media, e-commerce content, and cross-border services

Content Creation & Virtual Characters:
• Diverse content production including novels, scripts, and copywriting
• Natural expression through contextual expansion and emotional regulation
• Consistent tone and behavior across multi-turn conversations
• Ideal for virtual humans, social AI, and brand personification operations
‍

Related models

Model specifications

Model data

Model provider
ZAI
Type
Chat
Reasoning
LLM
Main use cases
Chat
Fine tuning
Supported
Parameters
357B
Context length
198K
Input price
$0.60 / 1M tokens
Output price
$2.20 / 1M tokens
Input modalities
Text
Output modalities
Text

Released
September 29, 2025
Last updated
January 28, 2026
Quantization level
BF16
External link
Provider docs
Category
Chat

Quickstart docs

Deploy model

GLM-4.6

About model

Model card

Prompting

Applications & use cases