GLM-4.7

Advanced agentic coding and reasoning with instant serverless access on Together AI

Try now

read docs

About model

Frontier Agentic Coding:
GLM-4.7 is Z.AI's latest flagship model engineered for task-oriented development and complex agent workflows. Ranking #1 open-source on LMArena Code Arena and achieving 73.8% on SWE-bench Verified, it delivers enhanced agentic coding, superior frontend aesthetics, and stable multi-step reasoning through advanced interleaved thinking. Access instantly via Together AI serverless APIs for rapid prototyping, evaluation, and production deployment with 200K context and 128K max output.

‍

Quickstart guides

RAG

Building a RAG Workflow

Agents

Agent Workflows

Apps

Next.js Chat Quickstart

Performance benchmarks

Model	AIME 2025	GPQA Diamond	HLE	LiveCodeBench	MATH500	SWE-bench verified
GLM-4.7	95.7%	83.3%
Related open-source models
Competitor closed-source models
Claude Opus 4.6		90.5%	34.2%			78.7%
OpenAI o3		83.3%	24.9%		99.2%	62.3%
OpenAI o1		76.8%			96.4%	48.9%
GPT-4o		49.2%	2.7%	32.3%	89.3%	31.0%

Model card
Core Coding Capabilities:
• #1 open-source model on LMArena Code Arena (outperforming GPT-5.2)
• 73.8% on SWE-bench Verified (+5.8% over GLM-4.6)
• 66.7% on SWE-bench Multilingual (+12.9% improvement)
• 84.9% on LiveCodeBench-v6 (surpassing Claude Sonnet 4.5 at 64%)
• 41% on Terminal Bench 2.0 (+16.5% improvement)

Advanced Reasoning:
• AIME 2025: 95.7% (open-source SOTA)
• GPQA-Diamond: 85.7%
• HLE with Tools: 42.8% (+12.4% over GLM-4.6)
• HMMT Feb 2025: 97.1%
• MMLU-Pro: 84.3%

Agent & Tool Capabilities:
• τ²-Bench: 87.4% (approaching Claude Sonnet 4.5 at 87.2%)
• BrowseComp: 52% (67.5% with context management)
• Significantly improved tool-calling and web browsing performance

Interleaved Thinking System:
• Interleaved Thinking: Thinks before every response and tool call for improved instruction following
• Preserved Thinking: Automatically retains thinking blocks across multi-turn conversations, reducing information loss
• Turn-level Thinking: Per-turn control over reasoning—disable for lightweight requests, enable for complex tasks
• Optimal for long-horizon, complex agentic workflows

Vibe Coding:
• Enhanced UI quality with cleaner, more modern webpages
• Better-looking slides with accurate layout and sizing
• Improved frontend aesthetic generation for low-code platforms and rapid prototyping

Architecture:
• 200K context window with 128K maximum output tokens
• MIT licensed, open weights
• Supports vLLM, SGLang, and Transformers inference frameworks
‍
Applications & use cases
Agentic Coding & Development:
• End-to-end software engineering from requirement comprehension to executable code
• Multilingual agentic coding across multiple programming languages
• Terminal-based development tasks with stable multi-step execution

Frontend & UI Generation:
• Production-quality web UI generation with enhanced aesthetics
• Modern webpage creation with clean layouts and accurate sizing
• Slide and poster generation with professional design consistency
• Low-code platforms and AI frontend generation tools

Complex Agent Workflows:
• Long-horizon tasks requiring preserved reasoning across turns
• Web browsing and information retrieval with context management
• Tool-calling and function execution for enterprise automation
• Real-world interaction scenarios (τ²-Bench: retail, telecom, airline)

Enterprise Applications:
• Development support and solution discussions with context-aware responses
• Decision-making assistance with advanced reasoning capabilities
• Mathematical problem-solving and scientific reasoning
• Creative writing, role-play, and conversational AI with improved quality

Research & Prototyping:
• Rapid prototyping with superior UI aesthetics and accurate implementation
• Complex demos and proof-of-concept development
• Educational tools and interactive learning applications with 200K context support
‍

Related models

Model specifications

Model data

Model provider
ZAI
Type
Chat
Reasoning
LLM
Main use cases
Chat
Fine tuning
Supported
Deployment
Monthly Reserved
On-Demand Dedicated
Context length
202K
Input price
$0.45 / 1M tokens
Output price
$2.00 / 1M tokens
Input modalities
Text
Output modalities
Text

Released
December 21, 2025
Last updated
January 8, 2026
Quantization level
FP8
External link
Provider docs
Category
Chat

Quickstart docs

Deploy model

GLM-4.7

About model

Model card

Applications & use cases