Kimi K2 Thinking API
State-of-the-art thinking agent with deep reasoning and tool orchestration

This model is not currently supported on Together AI.
Visit our Models page to view all the latest models.
Kimi K2 Thinking API Usage
How to use Kimi K2 Thinking
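Although the model is not yet available on Together AI, Kimi K2 Thinking is served elsewhere behind OpenAI-compatible chat completions endpoints. The sketch below shows what a minimal call might look like; the base URL and model identifier are assumptions for illustration, not confirmed values.

from openai import OpenAI

# Minimal sketch of calling Kimi K2 Thinking through an OpenAI-compatible
# chat completions endpoint. base_url and model are assumptions; substitute
# your provider's actual values.
client = OpenAI(
    base_url="https://api.moonshot.ai/v1",  # assumed endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="kimi-k2-thinking",  # assumed model identifier
    messages=[
        {"role": "user", "content": "Outline a research plan for comparing INT4 and FP8 inference."}
    ],
)

print(response.choices[0].message.content)

Thinking-style models usually also return their intermediate reasoning in a separate response field; check the provider's documentation for the exact response shape.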
Model details
Architecture Overview:
• Mixture-of-Experts (MoE) architecture with 1T total parameters and 32B activated parameters
• 61 layers in total (including 1 dense layer); each MoE layer has 384 experts, of which 8 are selected per token
• Multi-head Latent Attention (MLA) with an attention hidden dimension of 7168
• Native INT4 quantization applied to MoE components through Quantization-Aware Training (QAT)
• 256K context window enabling complex long-horizon agentic tasks
• 160K vocabulary size with SwiGLU activation function (these hyperparameters are collected in the config sketch below)
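For quick reference, the hyperparameters above can be gathered into a single configuration sketch. The field names below are illustrative, not Moonshot AI's actual config schema; the values are transcribed from the list.

from dataclasses import dataclass

# Illustrative summary of Kimi K2 Thinking's published hyperparameters.
# Field names are hypothetical; values come from the architecture list above.
@dataclass
class KimiK2ThinkingConfig:
    total_params: str = "1T"           # total MoE parameters
    active_params: str = "32B"         # parameters activated per token
    num_layers: int = 61               # including 1 dense layer
    num_experts: int = 384             # experts per MoE layer
    experts_per_token: int = 8         # top-k routing
    attention_hidden_dim: int = 7168   # Multi-head Latent Attention (MLA)
    context_window: str = "256K"       # tokens
    vocab_size: str = "160K"
    activation: str = "SwiGLU"
    moe_weight_dtype: str = "INT4"     # native quantization via QAT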
Training Methodology:
• End-to-end trained to interleave chain-of-thought reasoning with function calls (see the driver-loop sketch after this list)
• Quantization-Aware Training (QAT) employed in post-training stage for lossless INT4 inference
• Specialized training for stable long-horizon agency across 200-300 consecutive tool invocations
• Advanced reasoning depth scaling through multi-step test-time computation
• Tool orchestration training enabling autonomous research, coding, and writing workflows
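The interleaving described above is easiest to picture as a driver loop: the model either requests a tool call or emits a final answer, and the runtime feeds tool results back until the task resolves. The sketch below is a generic loop of this kind, not Moonshot AI's actual harness; call_model and run_tool are hypothetical stand-ins for a real chat client and tool executor.

import json

def call_model(messages):
    # Hypothetical stand-in for a chat completions call; returns the
    # assistant message as a dict that may contain "tool_calls".
    raise NotImplementedError("wire up your provider's client here")

def run_tool(name, args):
    # Hypothetical tool executor (web search, Python interpreter, ...).
    raise NotImplementedError

def run_agent(task, max_steps=300):
    # K2 Thinking is trained to stay coherent across 200-300 such steps.
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        reply = call_model(messages)
        messages.append(reply)
        if not reply.get("tool_calls"):      # no tool request: final answer
            return reply["content"]
        for call in reply["tool_calls"]:     # execute each requested tool
            result = run_tool(call["function"]["name"],
                              json.loads(call["function"]["arguments"]))
            messages.append({
                "role": "tool",
                "tool_call_id": call["id"],
                "content": json.dumps(result),
            })
    raise RuntimeError("step budget exhausted")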
Performance Characteristics:
• State-of-the-art 44.9% on Humanity's Last Exam (HLE) with tools across 100+ expert subjects
• Leading agentic search performance: 60.2% BrowseComp, 62.3% BrowseComp-ZH, 56.3% Seal-0
• Elite mathematical reasoning: 99.1% AIME 2025 (w/ python), 95.1% HMMT 2025 (w/ python), 78.6% IMO-AnswerBench
• Strong coding capabilities: 71.3% SWE-Bench Verified, 61.1% SWE-Bench Multilingual, 83.1% LiveCodeBench v6
• 2x generation speed improvement through native INT4 quantization, without performance degradation (illustrated numerically after this list)
• Maintains coherent goal-directed behavior across hundreds of steps, surpassing prior models that degrade after 30-50 steps
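The INT4 figures above refer to weight quantization learned during training (QAT) rather than post-hoc rounding. As a purely numerical illustration of what symmetric INT4 weight quantization does (a generic sketch, not Moonshot AI's kernels or QAT recipe):

import numpy as np

def quantize_int4(w):
    # Per-output-channel symmetric quantization: map max |w| to 7,
    # the top of the signed 4-bit range [-8, 7].
    scale = np.abs(w).max(axis=1, keepdims=True) / 7.0
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)  # 4-bit codes in int8 storage
    return q, scale

def dequantize_int4(q, scale):
    return q.astype(np.float32) * scale

w = np.random.randn(4, 8).astype(np.float32)
q, scale = quantize_int4(w)
w_hat = dequantize_int4(q, scale)
print("max abs reconstruction error:", np.abs(w - w_hat).max())

QAT's contribution is to expose this rounding during training so the weights adapt to it, which is what allows the quantized model to match full-precision quality.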
Applications & Use Cases
Agentic Reasoning & Problem Solving:
• Expert-level reasoning across 100+ subjects achieving 44.9% on Humanity's Last Exam with tools
• PhD-level mathematical problem solving through 23+ interleaved reasoning and tool calls
• Elite competition mathematics: 99.1% AIME 2025, 95.1% HMMT 2025 with Python tools
• Dynamic hypothesis generation, evidence verification, and coherent answer construction
Agentic Search & Web Reasoning:
• State-of-the-art 60.2% BrowseComp performance, significantly outperforming the 29.2% human baseline
• Continuous browsing, searching, and reasoning over hard-to-find real-world web information
• 200-300 sequential tool calls for deep research workflows without human intervention (a sample tool declaration follows this list)
• Goal-directed web-based reasoning with adaptive hypothesis refinement
• Financial search: 47.4% FinSearchComp-T3, 87.0% Frames benchmark
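Agentic search is typically exposed to the model as a declared tool that it can call inside the loop sketched earlier. Below is a hypothetical web-search tool declaration in the common JSON function-calling format; the name and fields are illustrative, not Moonshot AI's actual schema.

# Hypothetical web-search tool declaration in the common chat completions
# function-calling format; the name and parameters are illustrative.
search_tool = {
    "type": "function",
    "function": {
        "name": "web_search",
        "description": "Search the web and return the top results.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search query."},
                "top_k": {"type": "integer", "description": "Number of results to return."},
            },
            "required": ["query"],
        },
    },
}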
Agentic Coding & Software Development:
• Production-level coding: 71.3% SWE-Bench Verified, 61.1% SWE-Bench Multilingual, 41.9% Multi-SWE-bench
• Component-heavy frontend development: fully functional HTML, React, and responsive web applications from single prompts
• Multi-step development workflows with precision tool invocation and adaptive reasoning
• Terminal automation: 47.1% Terminal-Bench with simulated tools
• Competitive programming: 83.1% LiveCodeBench v6, 48.7% OJ-Bench (C++)
Creative & Practical Writing:
• Creative writing with vivid imagery, emotional depth, and thematic resonance
• Fiction, cultural reviews, and science fiction written with natural fluency and a confident command of style
• Academic and research writing with rigorous logic, thoroughness, and substantive richness
• 73.8% on the Longform Writing benchmark, demonstrating instruction adherence and breadth of perspective
• Personal and emotional responses with empathy, nuance, and actionable guidance
Long-Horizon Autonomous Workflows:
• Research automation executing hundreds of coherent reasoning steps
• Office automation and document generation workflows
• Multi-step coding projects from ideation to functional products
• Complex problem decomposition into clear, actionable subtasks
• Stable agency surpassing models that degrade after 30-50 steps
