GLM-5

Best-in-class open-source model for systems engineering and long-horizon agents

About model

GLM-5 is an open-source foundation model built for complex systems engineering and long-horizon agent workflows. It delivers production-grade results on large-scale programming tasks, with performance comparable to top closed-source models, and is designed for expert developers who think and build at the system level. On multi-stage tasks with many steps, GLM-5 autonomously decomposes system-level requirements with an architect-level approach and keeps context coherent across automated workflows that run for hours.

Total parameters: 744B (40B activated). MoE architecture for complex systems engineering.

SWE-Bench Verified: 77.8%. Best-in-class open-source coding.

Vending Bench 2 (open-source): #1. Long-horizon planning and resource management.

Model key capabilities
  • Agentic Long-Horizon Planning: Autonomously decomposes system-level requirements with an architect-level approach, maintaining context coherence across workflows that run for hours
  • Deep Debugging & Self-Correction: Analyzes logs, identifies root causes, and iteratively fixes compilation or runtime failures until the system runs end-to-end
  • Backend Architecture Excellence: In-depth reasoning for backend architecture design, complex algorithm implementation, and difficult bug resolution
  • Opus-Level Intelligence: Benchmarked against Claude Opus 4.6 on code logic density and systems engineering capability, while offering open-source flexibility and cost efficiency

Performance benchmarks

Columns: Model, AIME 2025, GPQA Diamond, HLE, LiveCodeBench, MATH500, SWE-bench Verified. Scores are listed in column order; cells left blank in the source table are omitted.

GLM-5: 86.0%, 77.8% (SWE-bench Verified)

Competitor closed-source models:

Claude Opus 4.6: 90.5%, 34.2%, 78.7%
OpenAI o3: 83.3%, 24.9%, 99.2%, 62.3%
OpenAI o1: 76.8%, 96.4%, 48.9%
GPT-4o: 49.2%, 2.7%, 32.3%, 89.3%, 31.0%


  • API usage

    Endpoint:

    zai-org/GLM-5

    cURL:

    curl -X POST "https://api.together.xyz/v1/chat/completions" \
      -H "Authorization: Bearer $TOGETHER_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "zai-org/GLM-5",
        "messages": [
          {
            "role": "user",
            "content": "What are some fun things to do in New York?"
          }
        ]
      }'

    Python:

    from together import Together

    # The client reads the API key from the TOGETHER_API_KEY environment variable.
    client = Together()

    response = client.chat.completions.create(
      model="zai-org/GLM-5",
      messages=[
        {
          "role": "user",
          "content": "What are some fun things to do in New York?"
        }
      ]
    )
    print(response.choices[0].message.content)

    TypeScript:

    import Together from 'together-ai';

    // The client reads the API key from the TOGETHER_API_KEY environment variable.
    const together = new Together();

    const completion = await together.chat.completions.create({
      model: 'zai-org/GLM-5',
      messages: [
        {
          role: 'user',
          content: 'What are some fun things to do in New York?'
        }
      ],
    });
    
    console.log(completion.choices[0].message.content);
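
This page lists Function Calling among the model's features. The sketch below shows one way a tool definition might be attached to the same chat completions request. It assumes the endpoint accepts the OpenAI-style `tools` parameter and returns tool-call arguments as a JSON string; the `get_weather` tool, its schema, and the helper names are hypothetical and exist only for illustration.

```python
import json

MODEL = "zai-org/GLM-5"

# Hypothetical tool schema, used only for illustration.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def build_request(user_prompt: str) -> dict:
    """Assemble keyword arguments for client.chat.completions.create()."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": user_prompt}],
        "tools": [WEATHER_TOOL],
    }


def parse_tool_arguments(raw_arguments: str) -> dict:
    """Decode tool-call arguments, assumed to arrive as a JSON string."""
    return json.loads(raw_arguments)


def send(request: dict):
    """Send the request; needs the `together` package and TOGETHER_API_KEY."""
    # Imported lazily so the helpers above can be used without the package.
    from together import Together

    return Together().chat.completions.create(**request)
```

With an API key set, `send(build_request("What is the weather in New York?"))` would issue the call. If the model chooses to call the tool, the arguments it returns can be passed through `parse_tool_arguments` before invoking your own function.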
    
  • Model card

    Architecture Overview:
    • Open-source foundation model built for complex systems engineering and long-horizon agent workflows
    • Designed for expert developers who think and build at the system level
    • FP8 quantization for efficient inference
    • Performance aligned with top closed-source models like Claude Opus 4.6
    • Purpose-built for multi-stage, long-step complex tasks requiring deep reasoning
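
As a rough back-of-the-envelope check on what these figures imply for deployment, the sketch below estimates the size of the weights alone from the 744B total and 40B activated parameter counts given on this page. It assumes 1 byte per parameter for FP8 and 0.5 bytes for FP4 (the page mentions FP8 here and FP4 as the listed quantization level, so both are shown) and ignores KV cache, activations, scaling factors, and any layers kept in higher precision.

```python
TOTAL_PARAMS = 744e9   # total parameters, from this page
ACTIVE_PARAMS = 40e9   # parameters activated per token, from this page

# Assumed storage cost per parameter; real checkpoints may differ.
BYTES_PER_PARAM = {"FP8": 1.0, "FP4": 0.5}


def weight_footprint_gb(num_params: float, fmt: str) -> float:
    """Approximate weight size in decimal gigabytes for a given format."""
    return num_params * BYTES_PER_PARAM[fmt] / 1e9


def active_fraction() -> float:
    """Share of all parameters that are activated for each token."""
    return ACTIVE_PARAMS / TOTAL_PARAMS


for fmt in BYTES_PER_PARAM:
    print(f"{fmt}: ~{weight_footprint_gb(TOTAL_PARAMS, fmt):.0f} GB of weights")
print(f"Active per token: {active_fraction():.1%} of all parameters")
```

The small active fraction is what lets a mixture-of-experts model of this size keep per-token compute closer to that of a much smaller dense model, even though all weights still have to be stored.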

    Training Methodology:
    • Trained on large-scale programming tasks and system-level engineering workflows
    • Emphasis on deep system construction and long-range agent execution
    • Optimized for backend architecture design and complex algorithm implementation
    • Self-reflection and error correction training for iterative debugging
    • Architect-level decomposition of system requirements

    Performance Characteristics:
    • Agentic Long-Horizon Planning: Autonomously decomposes system-level requirements with an architect-level approach
    • Context Coherence: Maintains goal alignment across automated workflows that run for hours
    • Backend Refactoring: In-depth reasoning for backend architecture design and complex algorithm implementation
    • Deep Debugging: Analyzes logs, identifies root causes, and iteratively fixes compilation and runtime failures
    • Self-Correction: Robust error correction mechanisms that help ensure end-to-end system execution
    • Opus-Level Intelligence: Code logic density and systems engineering capability benchmarked against Claude Opus 4.6
    • Open-Source Flexibility: Production deployment options with strong cost efficiency

  • Applications & use cases

    Large-Scale Systems Engineering:
    • Complex backend architecture design and refactoring
    • Multi-service system construction with deep integration requirements
    • Large codebase modernization and migration projects
    • Distributed systems design and implementation
    • Microservices architecture planning and execution
    • System-level performance optimization and scaling

    Long-Horizon Agent Workflows:
    • Automated workflows running for hours with maintained goal alignment
    • Multi-stage deployment pipelines with autonomous error recovery
    • Complex CI/CD orchestration with self-correction mechanisms
    • End-to-end system automation requiring deep reasoning
    • Infrastructure-as-code generation and management

    Deep Debugging & Error Resolution:
    • Root cause analysis of complex compilation failures
    • Runtime error diagnosis across distributed systems
    • Log analysis and issue identification in large-scale applications
    • Iterative debugging with self-reflection mechanisms
    • Production incident resolution requiring system-level understanding
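
The iterative debugging described above is usually driven by a harness around the model rather than by the model alone. The following is a minimal sketch of such a loop, not a description of how GLM-5 or any specific agent framework implements it. Both callables are hypothetical: `run_build` stands in for whatever compiles or tests the project, and `propose_and_apply_fix` stands in for a step that sends the failure log to the model and applies its suggested patch.

```python
from typing import Callable, Optional, Tuple


def fix_until_passing(
    run_build: Callable[[], Tuple[bool, str]],
    propose_and_apply_fix: Callable[[str], None],
    max_attempts: int = 5,
) -> Optional[int]:
    """Return the attempt number on which the build passed, or None."""
    for attempt in range(1, max_attempts + 1):
        passed, log = run_build()
        if passed:
            return attempt
        if attempt < max_attempts:
            # Hand the failure log to the model-backed fixer, then retry.
            propose_and_apply_fix(log)
    return None
```

Capping the number of attempts keeps a long-running workflow from looping forever on a failure the model cannot resolve; a production harness would likely also record each log and patch for later review.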

    Backend Architecture & Algorithms:
    • Complex algorithm implementation and optimization
    • Database schema design and query optimization
    • API design and backend service architecture
    • Performance-critical code generation and refactoring
    • System bottleneck identification and resolution

    Expert Developer Workflows:
    • Architect-level system decomposition and planning
    • Technical debt reduction in enterprise codebases
    • Legacy system modernization with maintained functionality
    • Code review and architectural consultation
    • System design documentation and technical specifications

    Open-Source Alternative Deployment:
    • Teams requiring Opus-level intelligence with open-source flexibility
    • Cost-efficient production deployment for systems engineering tasks
    • On-premises or private cloud deployment with full model control
    • Custom fine-tuning for domain-specific systems engineering
    • Research and development requiring model transparency

Model details
  • Model provider
    ZAI
  • Type
    Chat
    Reasoning
    LLM
  • Main use cases
    Chat
    Function Calling
  • Features
    Function Calling
    JSON Mode
  • Speed
    High
  • Intelligence
    Very High
  • Deployment
    Serverless
    Monthly Reserved
  • Parameters
    744B
  • Context length
    202K
  • Input price

    $1.00 / 1M tokens

  • Output price

    $3.20 / 1M tokens

  • Input modalities
    Text
  • Output modalities
    Text
  • Released
    February 10, 2026
  • Last updated
    February 12, 2026
  • Quantization level
    FP4
  • Category
    Chat
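
To turn the listed prices into a budget figure, the sketch below estimates the cost of a workload from its token counts. It assumes plain per-token billing at the input and output prices shown above ($1.00 and $3.20 per 1M tokens), with no discounts or reserved-capacity pricing; the function name is hypothetical.

```python
INPUT_PRICE_PER_M = 1.00   # USD per 1M input tokens, from the listing above
OUTPUT_PRICE_PER_M = 3.20  # USD per 1M output tokens, from the listing above


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in USD for a given number of input and output tokens."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return round(total / 1_000_000, 6)


# Example: an agent run that sends 150K prompt tokens and generates 20K tokens.
print(f"${estimate_cost_usd(150_000, 20_000):.3f}")
```

Because output tokens cost 3.2 times as much as input tokens, long generated traces tend to dominate the bill for agent workloads even when prompts are large.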