Kimi K2 Instruct
State-of-the-art mixture-of-experts agentic intelligence model with 1T parameters, 128K context, and native tool use
About model
Kimi K2 Instruct is a post-trained model for general-purpose chat and agentic experiences, excelling in tool use, reasoning, and autonomous problem-solving. It is designed for drop-in use, providing reflex-grade responses without long thinking. Suitable for researchers and builders seeking a strong foundation for custom solutions.
API usage
Endpoint:
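The endpoint is OpenAI-compatible chat completions. A minimal request sketch follows; the payload builder is illustrative (not part of any SDK), and the base URL and auth header you send it to depend on your provider:

```python
# Build a chat-completions request body for Kimi K2 Instruct.
# POST this JSON to your provider's /v1/chat/completions endpoint
# with an "Authorization: Bearer <API key>" header.
import json

MODEL_ID = "moonshotai/Kimi-K2-Instruct"

def build_chat_request(user_message, temperature=0.6):
    """Assemble an OpenAI-style chat-completions request body."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": user_message}],
        "temperature": temperature,  # recommended default for this model
    }

payload = build_chat_request("Summarize the MoE architecture in one sentence.")
print(json.dumps(payload, indent=2))
```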
How to use model
Get started with this model in 10 lines of code! The model ID is
moonshotai/Kimi-K2-Instruct, and pricing is $1.00 per 1M input tokens and $3.00 per 1M output tokens.
Model card
Architecture Overview:
- 1T-parameter MoE with 32B activated parameters
- Hybrid MoE sparsity for compute efficiency
- 128K token context for deep document comprehension
- Agentic design with native tool usage & CLI integration
Training Methodology:
- Pre-trained on 15.5T tokens using the MuonClip optimizer for stability
- Zero-instability training at large scale
Performance Characteristics:
- SOTA on LiveCodeBench v6, AIME 2025, MMLU-Redux, and SWE-bench (agentic)
Prompting
- Use natural language instructions or tool commands
- Temperature ≈ 0.6: the recommended default for Kimi K2 Instruct; higher values tend to produce verbose output.
- Kimi K2 autonomously invokes tools to fulfill tasks: pass JSON-schema tool definitions in tools=[…] and set tool_choice="auto"; Kimi decides when and what to call.
- Supports multi-turn dialogues & chained workflows: because the model is agentic, give a high-level objective ("Analyse this CSV and write a report") and let it orchestrate the sub-tasks.
- Chunk very long contexts: 128K is generous, but response speed drops on inputs above ~100K tokens; supply a short executive brief in the final user message to focus the model.
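The tool-calling bullet above can be sketched as a request body. The weather tool here is a hypothetical example for illustration, not part of the model's API:

```python
# Declare a JSON-schema tool and let the model decide when to invoke it
# via tool_choice="auto". The get_weather tool is hypothetical.
import json

weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

request_body = {
    "model": "moonshotai/Kimi-K2-Instruct",
    "messages": [{"role": "user", "content": "Do I need an umbrella in Oslo today?"}],
    "tools": [weather_tool],
    "tool_choice": "auto",  # the model picks when/what to call
    "temperature": 0.6,
}
print(json.dumps(request_body, indent=2))
```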
Applications & use cases
Kimi K2 shines in scenarios requiring autonomous problem-solving, particularly coding and tool use:
- Agentic Workflows: Automate multi-step tasks like booking flights, research, or data analysis using tools/APIs
- Coding & Debugging: Solve software engineering tasks (e.g., SWE-bench), generate patches, or debug code
- Research & Report Generation: Summarize technical documents, analyze trends, or draft reports using long-context capabilities
- STEM Problem-Solving: Tackle advanced math (AIME, MATH), logic puzzles (ZebraLogic), or scientific reasoning
- Tool Integration: Build AI agents that interact with APIs (e.g., weather data, databases).
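The agentic pattern behind these use cases is a simple loop: send a request, execute any tool calls the model returns, append the results, and repeat until the model answers directly. A minimal sketch, where model_call() is a stub standing in for a real API round trip:

```python
# Agentic tool loop. model_call() simulates one tool call followed by a
# final answer; a real implementation would POST messages to the API.
import json

def get_weather(city):
    # Hypothetical local tool implementation.
    return {"city": city, "condition": "rain"}

TOOLS = {"get_weather": get_weather}

def model_call(messages):
    """Stub for an API round trip: first turn requests a tool call,
    second turn (after a tool result is present) answers directly."""
    if not any(m["role"] == "tool" for m in messages):
        return {"role": "assistant", "tool_calls": [
            {"id": "call_1",
             "function": {"name": "get_weather",
                          "arguments": json.dumps({"city": "Oslo"})}}]}
    return {"role": "assistant", "content": "Yes, bring an umbrella."}

messages = [{"role": "user", "content": "Do I need an umbrella in Oslo?"}]
while True:
    reply = model_call(messages)
    messages.append(reply)
    if not reply.get("tool_calls"):
        break  # no tool calls: the model has produced its final answer
    for call in reply["tool_calls"]:
        fn = TOOLS[call["function"]["name"]]
        result = fn(**json.loads(call["function"]["arguments"]))
        messages.append({"role": "tool", "tool_call_id": call["id"],
                         "content": json.dumps(result)})

print(messages[-1]["content"])  # -> "Yes, bring an umbrella."
```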
- Model provider: Moonshot AI
- Type: LLM
- Main use cases: Chat, Coding Agents
- Features: JSON Mode
- Fine-tuning: Supported
- Deployment: Serverless, On-Demand Dedicated
- Endpoint:
- Parameters: 1 Trillion
- Activated parameters: 32B
- Context length: 128K tokens
- Input price: $1.00 / 1M tokens
- Output price: $3.00 / 1M tokens
- Input modalities: Text
- Output modalities: Text
- Released: July 10, 2025
- Last updated: July 13, 2025
- Quantization level: FP8
- External link:
- Category: Chat
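The listed prices ($1.00 per 1M input tokens, $3.00 per 1M output tokens) make per-request cost easy to estimate; the helper below is a back-of-envelope sketch:

```python
# Estimate request cost from the listed per-1M-token prices.
def estimate_cost_usd(input_tokens, output_tokens,
                      input_price=1.00, output_price=3.00):
    """Prices are USD per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

# e.g. a 100K-token context with a 2K-token reply:
cost = estimate_cost_usd(100_000, 2_000)
print(f"${cost:.4f}")  # (100_000 * $1 + 2_000 * $3) / 1M = $0.1060
```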