Models / ZAIGLM / / GLM-4.5-Air API
GLM-4.5-Air API
106B‑parameter efficient MoE model, 128K‑token context, hybrid reasoning modes, optimized for superior efficiency while maintaining competitive performance.

This model is not currently supported on Together AI.
Visit our Models page to view all the latest models.
GLM-4.5-Air delivers competitive AI performance with 106B parameters and 12B activation, offering the same 128K context and hybrid reasoning capabilities as GLM-4.5 but optimized for efficiency. Perfect for cost-conscious deployments requiring sophisticated AI capabilities.
GLM-4.5-Air API Usage
Endpoint
How to use GLM-4.5-Air
Model details
Architecture Overview:
• Compact Mixture-of-Experts design with 106B total parameters and 12B active parameters
• 128K token context window matching full GLM-4.5 capabilities
• Optimized MoE routing with reduced width and increased depth for efficiency
• Grouped-Query Attention with Multi-Token Prediction layer support
Training Methodology:
• Shared training pipeline with GLM-4.5 using 15T general + 7T code & reasoning tokens
• Specialized post-training for efficiency-performance balance
• Reinforcement learning optimization for agentic task performance
• FP8 and BF16 mixed precision training for accelerated inference
Performance Characteristics:
• Ranked 6th overall with 59.8 score demonstrating competitive efficiency
• Strong agentic performance with 69.4 on τ-bench and 76.4 on BFCL-v3
• Solid coding capabilities with 57.6% on SWE-bench Verified
• Optimal efficiency on performance-scale trade-off boundary
Prompting GLM-4.5-Air
Applications & Use Cases
Enterprise Applications:
• Cost-effective conversational AI for high-volume deployments
• Efficient intelligent agents for standard automation tasks
• Resource-conscious development environments and coding assistance
• Scalable customer support and virtual assistant implementations
Development & Technical:
• Lightweight coding assistance and software development support
• Efficient reasoning for educational and training applications
• Streamlined tool integration for standard agentic workflows
• Multi-language processing for global accessibility requirements
Business Solutions:
• SME and startup-friendly AI integration with competitive performance
• Batch processing and automated content generation at scale
• Mobile and edge deployment scenarios requiring efficiency
• Proof-of-concept and prototyping for AI-powered applications