Kimi K2 Instruct

State-of-the-art mixture-of-experts agentic intelligence model with 1 T parameters, 128K context, and native tool use

About model

Kimi K2 Instruct is a post-trained model for general-purpose chat and agentic experiences, excelling in tool use, reasoning, and autonomous problem-solving. It is designed for drop-in use, providing reflex-grade responses without long thinking, and suits researchers and builders seeking a strong foundation for custom solutions.

Quickstart guides

Apps: How to Build a Lovable Clone with Kimi K2
Agents: Agent Workflows
RAG: Building a RAG Workflow

How to use model

Get started with this model in about 10 lines of code. The model ID is moonshotai/Kimi-K2-Instruct, and pricing is $1 per 1M input tokens and $3 per 1M output tokens.

    
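      from together import Together

      # Together() reads the API key from the TOGETHER_API_KEY environment variable.
      client = Together()
      resp = client.chat.completions.create(
          model="moonshotai/Kimi-K2-Instruct",
          messages=[{"role": "user", "content": "Code a hacker news clone"}],
          stream=True,
      )
      # Each streamed chunk carries a text delta; skip chunks that have no content.
      for tok in resp:
          if tok.choices and tok.choices[0].delta.content:
              print(tok.choices[0].delta.content, end="", flush=True)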
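
To get a rough sense of what a request costs at the rates quoted above, the arithmetic can be sketched as follows. This is a minimal sketch that treats the $1 and $3 rates as USD per 1 million tokens, which is an assumption to check against the current pricing page. The function name `estimate_cost` and the token counts are hypothetical, chosen only for illustration.

```python
# Assumption: prices are USD per 1,000,000 tokens.
INPUT_PRICE_PER_M = 1.00
OUTPUT_PRICE_PER_M = 3.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single request."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 20,000 prompt tokens and 2,000 generated tokens.
    print(f"${estimate_cost(20_000, 2_000):.4f}")
```

Because output tokens are priced at three times the input rate, long generations dominate the bill faster than long prompts do.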

Model card
Architecture Overview:
1 T-parameter MoE with 32 B activated parameters
Hybrid MoE sparsity for compute efficiency
128K token context for deep document comprehension
Agentic design with native tool usage & CLI integration
Training Methodology:
Pre-trained on 15.5 T tokens using MuonClip optimizer for stability
Zero-instability training at large scale
Performance Characteristics:
SOTA on LiveCodeBench v6, AIME 2025, MMLU-Redux, and SWE-bench (agentic)
Prompting
Use natural language instructions or tool commands
Temperature ≈ 0.6: Calibrated to Kimi‑K2‑Instruct’s RLHF alignment curve; higher values yield verbosity.
Kimi K2 autonomously invokes tools to fulfill tasks: Pass a JSON schema in tools=[…]; set tool_choice="auto". Kimi decides when/what to call.
Supports multi-turn dialogues & chained workflows: Because the model is “agentic”, give a high‑level objective (“Analyse this CSV and write a report”), letting it orchestrate sub‑tasks.
Chunk very long contexts: the 128K window is large, but response speed drops on inputs over 100K tokens; supply a short executive brief in the final user message to focus the model.
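
The temperature and long-context tips above can be combined into a small request builder. This is a minimal sketch, not a prescribed method: `chunk_text` and `build_request` are hypothetical helpers, the character limit is only a crude stand-in for a token budget, and placing the brief at the end of the user message is one possible reading of the "executive brief" advice.

```python
from typing import Dict, List

MODEL_ID = "moonshotai/Kimi-K2-Instruct"


def chunk_text(text: str, max_chars: int) -> List[str]:
    """Split text into chunks of at most max_chars, preferring paragraph breaks."""
    chunks: List[str] = []
    current = ""
    for para in text.split("\n\n"):
        # Hard-split any single paragraph that is longer than the limit.
        pieces = [para[i:i + max_chars] for i in range(0, len(para), max_chars)] or [""]
        for piece in pieces:
            candidate = f"{current}\n\n{piece}" if current else piece
            if len(candidate) <= max_chars:
                current = candidate
            else:
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks


def build_request(chunk: str, brief: str) -> Dict:
    """Build keyword arguments for client.chat.completions.create()."""
    return {
        "model": MODEL_ID,
        "temperature": 0.6,  # value recommended in the prompting notes
        "messages": [
            # The brief comes last so the instruction is the final thing the model reads.
            {"role": "user", "content": f"{chunk}\n\n{brief}"},
        ],
    }
```

Each chunk can then be sent with `client.chat.completions.create(**build_request(chunk, brief))`, using the same `Together()` client as in the quickstart.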
Applications & use cases
Kimi K2 shines in scenarios requiring autonomous problem-solving – specifically with coding & tool use:
Agentic Workflows: Automate multi-step tasks like booking flights, research, or data analysis using tools/APIs
Coding & Debugging: Solve software engineering tasks (e.g., SWE-bench), generate patches, or debug code
Research & Report Generation: Summarize technical documents, analyze trends, or draft reports using long-context capabilities
STEM Problem-Solving: Tackle advanced math (AIME, MATH), logic puzzles (ZebraLogic), or scientific reasoning
Tool Integration: Build AI agents that interact with APIs (e.g., weather data, databases).
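
A tool-integration round trip of the kind listed above might look like the following. This is a sketch under stated assumptions: `get_weather` is a hypothetical tool that returns canned data, `dispatch` and `ask` are illustrative helpers, and the message shapes follow the OpenAI-compatible tool-calling convention, which should be checked against the provider's function-calling documentation.

```python
import json

MODEL_ID = "moonshotai/Kimi-K2-Instruct"


def get_weather(city: str) -> dict:
    """Hypothetical tool: returns canned data instead of calling a real weather API."""
    return {"city": city, "forecast": "sunny", "temp_c": 21}


# JSON schema describing the tool, passed to the API as tools=[...].
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

REGISTRY = {"get_weather": get_weather}


def dispatch(name: str, arguments: str) -> str:
    """Run the named local tool with JSON-encoded arguments; return a JSON string."""
    if name not in REGISTRY:
        return json.dumps({"error": f"unknown tool: {name}"})
    return json.dumps(REGISTRY[name](**json.loads(arguments)))


def ask(prompt: str) -> str:
    """One tool round trip: let the model request tools, then return its final answer."""
    from together import Together  # imported lazily so the helpers above work offline

    client = Together()
    messages = [{"role": "user", "content": prompt}]
    first = client.chat.completions.create(
        model=MODEL_ID, messages=messages, tools=TOOLS,
        tool_choice="auto", temperature=0.6,
    )
    reply = first.choices[0].message
    if not reply.tool_calls:
        return reply.content  # the model answered directly
    # Echo the assistant's tool request back, then append one result per call.
    messages.append({
        "role": "assistant",
        "content": reply.content or "",
        "tool_calls": [
            {"id": c.id, "type": "function",
             "function": {"name": c.function.name, "arguments": c.function.arguments}}
            for c in reply.tool_calls
        ],
    })
    for c in reply.tool_calls:
        messages.append({
            "role": "tool",
            "tool_call_id": c.id,
            "content": dispatch(c.function.name, c.function.arguments),
        })
    final = client.chat.completions.create(
        model=MODEL_ID, messages=messages, tools=TOOLS, temperature=0.6,
    )
    return final.choices[0].message.content
```

Calling `ask("What is the weather in Paris?")` requires an API key; `dispatch` can be exercised on its own to test tool wiring without network access.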

Model specifications

Model data

Model provider: Moonshot AI
Type: LLM, Chat
Main use cases: Chat, Coding Agents
Features: JSON Mode
Fine tuning: Supported
Deployment: Monthly Reserved
Parameters: 1 Trillion
Activated parameters: 32B
Context length: 128K tokens
Input modalities: Text
Output modalities: Text
Released: July 10, 2025
Last updated: July 13, 2025
Quantization level: FP8
External link: Provider docs
Category: Chat
