Kimi K2 Instruct-0905

State-of-the-art mixture-of-experts agentic intelligence model with 1 T parameters, 256K context, and native tool use

About model

Kimi K2 Instruct-0905 is a state-of-the-art mixture-of-experts language model with 1 trillion total parameters, of which 32 billion are activated per token. It excels at agentic coding and frontend development, making it well suited to developers and coding tasks.

Performance benchmarks

Benchmarks: AIME 2025, GPQA Diamond, HLE, LiveCodeBench, MATH500, SWE-bench verified

  • Kimi K2 Instruct-0905: 69.2% (SWE-bench verified)

Competitor closed-source models

  • OpenAI o3: 83.3%, 24.9%, 99.2%, 62.3%
  • OpenAI o1: 76.8%, 96.4%, 48.9%
  • GPT-4o: 49.2%, 2.7%, 32.3%, 89.3%, 31.0%
  • Gemini 2.0 Flash: 64.1%, 34.5%, 93.0%

  • API usage

    Endpoint: moonshotai/Kimi-K2-Instruct-0905

    cURL:

    curl -X POST "https://api.together.xyz/v1/chat/completions" \
      -H "Authorization: Bearer $TOGETHER_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "moonshotai/Kimi-K2-Instruct-0905",
        "messages": [
          {
            "role": "user",
            "content": "What are some fun things to do in New York?"
          }
        ]
      }'

    Python:

    from together import Together

    client = Together()

    response = client.chat.completions.create(
      model="moonshotai/Kimi-K2-Instruct-0905",
      messages=[
        {
          "role": "user",
          "content": "What are some fun things to do in New York?"
        }
      ]
    )
    print(response.choices[0].message.content)

    TypeScript:

    import Together from 'together-ai';
    const together = new Together();

    const completion = await together.chat.completions.create({
      model: 'moonshotai/Kimi-K2-Instruct-0905',
      messages: [
        {
          role: 'user',
          content: 'What are some fun things to do in New York?'
        }
      ],
    });

    console.log(completion.choices[0].message.content);
    
  • How to use model

    Get started with this model in about 10 lines of code. The model ID is moonshotai/Kimi-K2-Instruct-0905, and pricing is $1.00 per 1M input tokens and $3.00 per 1M output tokens.

        
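          from together import Together

          client = Together()  # reads TOGETHER_API_KEY from the environment
          resp = client.chat.completions.create(
              model="moonshotai/Kimi-K2-Instruct-0905",
              messages=[{"role": "user", "content": "Code a hacker news clone"}],
              stream=True,
          )
          # Print tokens as they arrive; skip chunks that carry no text.
          for tok in resp:
              if tok.choices and tok.choices[0].delta.content:
                  print(tok.choices[0].delta.content, end="", flush=True)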
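
    With the listed prices of $1.00 per 1M input tokens and $3.00 per 1M output tokens, the cost of a request can be estimated from its token counts. The sketch below is only an illustration: the function name and the example token counts are assumptions, and actual billing may differ.

```python
# Listed prices for moonshotai/Kimi-K2-Instruct-0905, in USD per 1M tokens.
INPUT_USD_PER_MILLION = 1.00
OUTPUT_USD_PER_MILLION = 3.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts."""
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


# Hypothetical request: a 200K-token document summarized in 4K tokens.
print(f"${estimate_cost_usd(200_000, 4_000):.3f}")
```

    Because output tokens cost three times as much as input tokens, long generations dominate the bill sooner than long prompts do.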
        
    
  • Model card

    Architecture Overview:

    • 1 T-parameter MoE with 32 B activated parameters
    • Hybrid MoE sparsity for compute efficiency
    • 256K token context for deep document comprehension
    • Agentic design with native tool usage & CLI integration
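
    The parameter counts above imply how sparse the model is. The sketch below works through that arithmetic and, taking the FP8 quantization level listed for this deployment to mean one byte per weight, gives a rough size for the weights alone. The figures are rounded card values and the function names are illustrative; the estimate ignores the KV cache, activations, and any tensors kept at higher precision.

```python
TOTAL_PARAMS = 1.0e12      # 1 T total parameters (rounded card value)
ACTIVE_PARAMS = 32e9       # 32 B parameters activated per token
FP8_BYTES_PER_PARAM = 1    # FP8 stores each weight in one byte


def active_fraction(total: float = TOTAL_PARAMS, active: float = ACTIVE_PARAMS) -> float:
    """Share of the parameters used for any single token."""
    return active / total


def weight_footprint_tb(total: float = TOTAL_PARAMS,
                        bytes_per_param: float = FP8_BYTES_PER_PARAM) -> float:
    """Approximate size of the weights in terabytes (10**12 bytes)."""
    return total * bytes_per_param / 1e12


print(f"active share per token: {active_fraction():.1%}")
print(f"approx. FP8 weight size: {weight_footprint_tb():.1f} TB")
```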

    Training Methodology:

    • Pre-trained on 15.5 T tokens using MuonClip optimizer for stability
    • Zero-instability training at large scale

    Performance Characteristics:

    • SOTA on LiveCodeBench v6, AIME 2025, MMLU-Redux, and SWE-bench (agentic)
  • Prompting

    • Use natural language instructions or tool commands
    • Temperature ≈ 0.6: the recommended setting, calibrated to Kimi-K2-Instruct’s RLHF alignment; higher values tend to produce more verbose output.
    • Kimi K2 autonomously invokes tools to fulfill tasks: pass a JSON schema in tools=[…] and set tool_choice="auto". The model decides when to call a tool and which one to call.
    • Supports multi-turn dialogues & chained workflows: because the model is agentic, you can give it a high-level objective (“Analyse this CSV and write a report”) and let it orchestrate the sub-tasks.
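
    The tool-calling settings above can be sketched as a request payload plus a small dispatcher. This is a minimal illustration under stated assumptions: `get_weather` is a hypothetical stub, and `build_request` and `dispatch` are names invented for this example. The schema layout follows the OpenAI-compatible function-calling format.

```python
import json

MODEL_ID = "moonshotai/Kimi-K2-Instruct-0905"


def get_weather(city: str) -> str:
    """Hypothetical tool: a stub standing in for a real weather lookup."""
    return json.dumps({"city": city, "forecast": "sunny"})


# JSON schema describing the tool; this is what goes in tools=[...].
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}


def build_request(user_prompt: str) -> dict:
    """Assemble keyword arguments for client.chat.completions.create()."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": user_prompt}],
        "tools": [WEATHER_TOOL],
        "tool_choice": "auto",  # the model decides whether to call a tool
        "temperature": 0.6,     # recommended setting from the list above
    }


def dispatch(name: str, arguments_json: str) -> str:
    """Run the tool the model asked for, given its JSON-encoded arguments."""
    handlers = {"get_weather": get_weather}
    return handlers[name](**json.loads(arguments_json))
```

    With the together package installed, `Together().chat.completions.create(**build_request("What is the weather in Paris?"))` would send the request. Any calls the model requests should then appear under `response.choices[0].message.tool_calls`, each with a `function.name` and a JSON string in `function.arguments` to hand to `dispatch`.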
  • Applications & use cases

    Kimi K2 shines in scenarios requiring autonomous problem-solving – specifically with coding & tool use:

    • Agentic Workflows: Automate multi-step tasks like booking flights, research, or data analysis using tools/APIs
    • Coding & Debugging: Solve software engineering tasks (e.g., SWE-bench), generate patches, or debug code
    • Research & Report Generation: Summarize technical documents, analyze trends, or draft reports using long-context capabilities
    • STEM Problem-Solving: Tackle advanced math (AIME, MATH), logic puzzles (ZebraLogic), or scientific reasoning
    • Tool Integration: Build AI agents that interact with APIs (e.g., weather data, databases).
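
    The agentic workflows listed above usually come down to a loop: call the model, run any tools it requests, append the results, and repeat until it answers in plain text. The sketch below is one way to write that loop, not a reference implementation. `run_agent` and `registry` are names invented here, `client` is assumed to behave like `together.Together()`, and the message shapes assume the OpenAI-compatible tool-calling format.

```python
import json

MODEL_ID = "moonshotai/Kimi-K2-Instruct-0905"


def run_agent(client, messages, tools, registry, max_steps=5):
    """Loop until the model replies without requesting a tool.

    `registry` maps tool names to Python callables that return strings.
    Returns the final text, or None if the step budget runs out.
    """
    for _ in range(max_steps):
        response = client.chat.completions.create(
            model=MODEL_ID,
            messages=messages,
            tools=tools,
            tool_choice="auto",
            temperature=0.6,
        )
        message = response.choices[0].message
        if not message.tool_calls:
            return message.content  # final answer, no tools requested

        # Record the assistant turn so each tool result has a call to attach to.
        messages.append({
            "role": "assistant",
            "content": message.content or "",
            "tool_calls": [
                {
                    "id": call.id,
                    "type": "function",
                    "function": {
                        "name": call.function.name,
                        "arguments": call.function.arguments,
                    },
                }
                for call in message.tool_calls
            ],
        })
        # Execute each requested tool and feed its output back to the model.
        for call in message.tool_calls:
            handler = registry[call.function.name]
            result = handler(**json.loads(call.function.arguments))
            messages.append(
                {"role": "tool", "tool_call_id": call.id, "content": result}
            )
    return None
```

    The `max_steps` cap is a design choice that keeps a misbehaving tool chain from looping indefinitely. A production agent would also want error handling around unknown tool names and malformed arguments.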
  • Model provider
    Moonshot AI
  • Type
    LLM
  • Main use cases
    Chat
    Function Calling
  • Features
    Function Calling
  • Speed
    Medium
  • Intelligence
    Very High
  • Deployment
    Serverless
  • Parameters
    1.0T
  • Context length
    262K
  • Input price

    $1.00 / 1M tokens

  • Output price

    $3.00 / 1M tokens

  • Input modalities
    Text
  • Output modalities
    Text
  • Released
    September 2, 2025
  • Last updated
    September 4, 2025
  • Quantization level
    FP8
  • Category
    Chat