
Kimi K2 Instruct

State-of-the-art mixture-of-experts agentic intelligence model with 1 T parameters, 128K context, and native tool use

About model

Kimi K2 Instruct is a post-trained model for general-purpose chat and agentic experiences, excelling in tool use, reasoning, and autonomous problem-solving. It is designed for drop-in use and gives reflex-grade responses without long thinking. It suits researchers and builders who want a strong foundation for custom solutions.

  • How to use model

    Get started with this model in 10 lines of code! The model ID is moonshotai/Kimi-K2-Instruct, and pricing is $1 per 1M input tokens and $3 per 1M output tokens.

        
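          from together import Together

          # Together() reads the API key from the TOGETHER_API_KEY environment variable.
          client = Together()
          resp = client.chat.completions.create(
              model="moonshotai/Kimi-K2-Instruct",
              messages=[{"role": "user", "content": "Code a hacker news clone"}],
              stream=True,  # yield the reply incrementally instead of all at once
          )
          for tok in resp:
              # delta.content can be None on some chunks; print "" rather than "None".
              print(tok.choices[0].delta.content or "", end="", flush=True)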
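
To make the pricing concrete, here is a rough cost estimate. It takes the $1 and $3 rates as USD per one million tokens; check your billing page if in doubt. `estimate_cost_usd` is a hypothetical helper written for this illustration, not part of the Together SDK.

```python
# Rates quoted above, taken here as USD per 1,000,000 tokens (assumption).
INPUT_USD_PER_M = 1.0
OUTPUT_USD_PER_M = 3.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Hypothetical helper: estimated cost of one request in USD."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 20,000-token prompt with a 2,000-token reply.
    print(f"${estimate_cost_usd(20_000, 2_000):.4f}")
```

Output tokens cost three times as much as input tokens, so long generations affect the bill more than long prompts do.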
        
    
  • Model card

    Architecture Overview:

    • 1 T-parameter MoE with 32 B activated parameters
    • Hybrid MoE sparsity for compute efficiency
    • 128K token context for deep document comprehension
    • Agentic design with native tool usage & CLI integration
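
The parameter counts above imply that only a small share of the weights is used for any one token. A quick back-of-the-envelope check, using only the two figures listed here:

```python
TOTAL_PARAMS = 1e12    # 1 T total parameters
ACTIVE_PARAMS = 32e9   # 32 B parameters activated per token

# Share of the model's weights that take part in processing a single token.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
print(f"{active_fraction:.1%} of parameters are active per token")
```

This is a simplification: it counts parameters only and says nothing about memory, since all experts still have to be stored even if few are active at a time.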

    Training Methodology:

    • Pre-trained on 15.5 T tokens using MuonClip optimizer for stability
    • Zero-instability training at large scale

    Performance Characteristics:

    • Reported state-of-the-art (SOTA) results on LiveCodeBench v6, AIME 2025, MMLU-Redux, and SWE-bench (agentic)
  • Prompting

    • Use natural language instructions or tool commands
    • Temperature ≈ 0.6: Calibrated to Kimi-K2-Instruct's RLHF alignment curve; higher values tend to make responses more verbose.
    • Kimi K2 autonomously invokes tools to fulfill tasks: Pass a JSON schema in tools=[…] and set tool_choice="auto". The model decides when to call a tool and which one.
    • Supports multi-turn dialogues & chained workflows: Because the model is "agentic", give it a high-level objective ("Analyse this CSV and write a report") and let it orchestrate the sub-tasks.
    • Chunk very long contexts: The 128K window is large, but responses slow down on inputs above roughly 100K tokens; supply a short executive brief in the final user message to focus the model.
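
The tool-use and temperature tips above can be sketched as follows. This is a minimal illustration, not a reference implementation: `get_weather`, `build_request`, `run_tool_call`, and `ask` are hypothetical names, the weather data is hard-coded, and the response shape (`message.tool_calls[*].function`) assumes the OpenAI-compatible format used by the Together Python SDK.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical local tool; returns canned data for illustration."""
    return json.dumps({"city": city, "forecast": "sunny"})


# JSON schema describing the tool, passed to the model via tools=[...].
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]


def build_request(user_prompt: str) -> dict:
    """Keyword arguments for client.chat.completions.create(...)."""
    return {
        "model": "moonshotai/Kimi-K2-Instruct",
        "messages": [{"role": "user", "content": user_prompt}],
        "tools": TOOLS,
        "tool_choice": "auto",  # let the model decide whether to call a tool
        "temperature": 0.6,     # value suggested in the prompting tips
    }


def run_tool_call(name: str, arguments_json: str) -> str:
    """Dispatch a tool call requested by the model to a local function."""
    registry = {"get_weather": get_weather}
    return registry[name](**json.loads(arguments_json))


def ask(user_prompt: str) -> None:
    """Send one request and run any tool calls (needs TOGETHER_API_KEY)."""
    from together import Together  # imported lazily so the helpers work offline

    client = Together()
    resp = client.chat.completions.create(**build_request(user_prompt))
    message = resp.choices[0].message
    if not message.tool_calls:
        print(message.content)
        return
    for call in message.tool_calls:
        print(run_tool_call(call.function.name, call.function.arguments))
```

Calling `ask("What is the weather in Paris?")` sends the request. In a full agent loop, each tool result would be appended to `messages` as a `tool` message and the model called again; that step is omitted here for brevity.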
  • Applications & use cases

    Kimi K2 shines in scenarios requiring autonomous problem-solving – specifically with coding & tool use:

    • Agentic Workflows: Automate multi-step tasks like booking flights, research, or data analysis using tools/APIs
    • Coding & Debugging: Solve software engineering tasks (e.g., SWE-bench), generate patches, or debug code
    • Research & Report Generation: Summarize technical documents, analyze trends, or draft reports using long-context capabilities
    • STEM Problem-Solving: Tackle advanced math (AIME, MATH), logic puzzles (ZebraLogic), or scientific reasoning
    • Tool Integration: Build AI agents that interact with APIs (e.g., weather data, databases)
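
For the research and report use case on long documents, one way to follow the chunking advice is to send one request per chunk, each ending with the same short brief. The sketch below only builds the message lists. `build_chunk_messages` is a hypothetical helper, and splitting by character count is a rough stand-in for token counting, chosen to keep the example dependency-free.

```python
def build_chunk_messages(document: str, brief: str, chunk_chars: int = 8000) -> list:
    """Return one message list per chunk; each user message ends with the brief."""
    if chunk_chars <= 0:
        raise ValueError("chunk_chars must be positive")
    # Fixed-size character windows; a real pipeline might split on sections instead.
    chunks = [document[i:i + chunk_chars] for i in range(0, len(document), chunk_chars)]
    return [
        [{"role": "user", "content": f"Part {n} of {len(chunks)}:\n{chunk}\n\n{brief}"}]
        for n, chunk in enumerate(chunks, start=1)
    ]


if __name__ == "__main__":
    batches = build_chunk_messages("lorem ipsum " * 2000, "Summarize the key findings.")
    print(len(batches), "requests to send")
```

Each inner list can be passed as `messages=` to `client.chat.completions.create`; the per-chunk answers can then be combined in a final summarization request.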
  • Model provider
    Moonshot AI
  • Type
    LLM
    Chat
  • Main use cases
    Chat
    Coding Agents
  • Features
    JSON Mode
  • Fine tuning
    Supported
  • Deployment
    Monthly Reserved
  • Parameters
    1 Trillion
  • Activated parameters
    32B
  • Context length
    128K tokens
  • Input modalities
    Text
  • Output modalities
    Text
  • Released
    July 10, 2025
  • Last updated
    July 13, 2025
  • Quantization level
    FP8
  • Category
    Chat