Kimi K2.7 Code

Coding-focused agentic model with improved long-horizon task completion and thinking efficiency

About model

Kimi K2.7 Code is Moonshot AI's coding-focused agentic model. It is built on the Kimi K2.6 architecture, delivers substantial improvements on real-world long-horizon coding tasks, and uses approximately 30% fewer thinking tokens than K2.6. It shares K2.6's 1T-parameter (32B activated) MoE architecture, MoonViT vision encoder, 256K context window, and native INT4 quantization. Thinking and preserve_thinking are always enabled, which keeps reasoning consistent across multi-turn agentic sessions. The model scores 62.0 on Kimi Code Bench V2, 81.1% on MCP Mark Verified, and 76.0% on MCP Atlas.

  • Kimi Code Bench V2: 62.0 (coding benchmark across 10+ languages and production tech stacks)
  • Fewer Thinking Tokens: approximately 30% vs Kimi K2.6, with improved long-horizon task completion
  • MCP Mark Verified: 81.1% (tool use across Notion, GitHub, Filesystem, Postgres, and Playwright)

Model key capabilities
  • Long-Horizon Coding: Substantial improvements on real-world software engineering tasks across backend services, infrastructure, performance engineering, systems programming, security, and frontend development
  • Thinking Efficiency: Approximately 30% reduction in thinking-token usage vs Kimi K2.6, improving cost efficiency across long agentic coding sessions without sacrificing task completion quality
  • Agentic Tool Use: 81.1% MCP Mark Verified and 76.0% MCP Atlas across real MCP server environments including Notion, GitHub, Filesystem, Postgres, and Playwright
  • Multimodal Input: Text, image, and video input via MoonViT encoder with 256K context and preserve_thinking for consistent multi-turn agentic execution
  • API usage

    Endpoint:

    moonshotai/Kimi-K2.7-Code

    cURL:

    curl -X POST "https://api.together.xyz/v1/chat/completions" \
      -H "Authorization: Bearer $TOGETHER_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "moonshotai/Kimi-K2.7-Code",
        "messages": [
          {
            "role": "user",
            "content": "What are some fun things to do in New York?"
          }
        ]
      }'

    Python:

    from together import Together

    client = Together()

    response = client.chat.completions.create(
      model="moonshotai/Kimi-K2.7-Code",
      messages=[
        {
          "role": "user",
          "content": "What are some fun things to do in New York?"
        }
      ]
    )
    print(response.choices[0].message.content)

    TypeScript:

    import Together from 'together-ai';
    const together = new Together();

    const completion = await together.chat.completions.create({
      model: 'moonshotai/Kimi-K2.7-Code',
      messages: [
        {
          role: 'user',
          content: 'What are some fun things to do in New York?'
        }
      ],
    });

    console.log(completion.choices[0].message.content);
    
  • Model card

    Architecture Overview:
    • 1T total parameter MoE with 32B parameters activated per token
    • 384 experts with 8 selected per token plus 1 shared expert; Multi-head Latent Attention (MLA)
    • MoonViT vision encoder (400M parameters) for image and video input
    • 256K token context window
    • Native INT4 quantization
    • Thinking and preserve_thinking are always enabled; thinking mode cannot be disabled
    • Same base architecture as Kimi K2.5 and K2.6; improvements come from coding-focused post-training
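
As a rough illustration of what the architecture figures above imply, the following sketch derives the active-parameter share and an approximate weight footprint. It is back-of-envelope arithmetic only: it assumes every weight is stored at 4 bits (real checkpoints typically keep some tensors at higher precision) and ignores activations and KV cache. The function names are illustrative.

```python
# Back-of-envelope figures derived from the architecture numbers listed above.
TOTAL_PARAMS = 1_000_000_000_000   # 1T total parameters
ACTIVE_PARAMS = 32_000_000_000     # 32B parameters activated per token
ROUTED_EXPERTS = 384               # experts available to the router
EXPERTS_PER_TOKEN = 8              # routed experts selected per token
BITS_PER_WEIGHT = 4                # INT4; assumption: applied to every weight


def active_fraction(active: int, total: int) -> float:
    """Share of parameters that take part in one token's forward pass."""
    return active / total


def weight_storage_gb(total_params: int, bits: int = BITS_PER_WEIGHT) -> float:
    """Approximate weight storage in decimal gigabytes."""
    return total_params * bits / 8 / 1e9


if __name__ == "__main__":
    print(f"Active parameter share: {active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS):.1%}")
    print(f"Routed experts used per token: {EXPERTS_PER_TOKEN / ROUTED_EXPERTS:.1%}")
    print(f"Approximate INT4 weight storage: {weight_storage_gb(TOTAL_PARAMS):.0f} GB")
```

Under these assumptions, roughly 3% of the parameters are active for any one token, which is why a sparse MoE model costs far less compute per token than a dense model of the same total size.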

    Training Methodology:
    • Coding-focused post-training built on Kimi K2.6
    • Targets real-world long-horizon coding tasks: production incidents, open-source projects, and internal engineering use cases
    • Achieves approximately 30% reduction in thinking-token usage vs K2.6

    Performance Characteristics:
    • Coding: 62.0 Kimi Code Bench V2 (+11.1 over K2.6), 53.6 Program Bench, 35.1 MLS-Bench Lite
    • Agentic: 81.1% MCP Mark Verified (+8.3 over K2.6), 76.0% MCP Atlas (+6.6 over K2.6), 46.9 Kimi Claw 24/7 Bench
    • Recommended: temperature=1.0, top_p=0.95, 256K context

  • Prompting

    Together AI API Access:
    • Access Kimi K2.7 Code via Together AI APIs using the endpoint moonshotai/Kimi-K2.7-Code
    • Authenticate using your Together AI API key in request headers
    • Thinking and preserve_thinking are always enabled, so reasoning context is retained across all turns automatically
    • Recommended parameters: temperature=1.0, top_p=0.95
    • Available on Together AI serverless and dedicated infrastructure
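
The recommended sampling parameters above can be applied as in the following minimal sketch. The helper name `build_request` and the example prompt are illustrative and not part of the Together SDK; the helper only assembles keyword arguments for `client.chat.completions.create`, and the network call runs only when the file is executed directly with `TOGETHER_API_KEY` set.

```python
from typing import Any

MODEL_ID = "moonshotai/Kimi-K2.7-Code"


def build_request(prompt: str, temperature: float = 1.0, top_p: float = 0.95) -> dict[str, Any]:
    """Assemble keyword arguments for client.chat.completions.create().

    Defaults follow the recommended parameters: temperature=1.0, top_p=0.95.
    """
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
        "top_p": top_p,
    }


if __name__ == "__main__":
    from together import Together

    client = Together()  # picks up TOGETHER_API_KEY from the environment
    response = client.chat.completions.create(
        **build_request("Write a Python function that reverses a singly linked list.")
    )
    print(response.choices[0].message.content)
```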

  • Applications & use cases

    Long-Horizon Software Engineering:
    • End-to-end task completion across backend services, infrastructure, and performance engineering
    • Production incident resolution and open-source project contributions
    • Security engineering, systems programming, and ML/data engineering
    • Multi-file refactors and complex debugging across large codebases

    Agentic Tool Orchestration:
    • MCP tool use across Notion, GitHub, Filesystem, Postgres, and Playwright environments
    • Persistent multi-day coworking tasks spanning software engineering, ML research, and trading
    • Autonomous long-horizon execution with consistent reasoning via preserve_thinking
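
For tool use through function calling, a client typically declares its tools and then executes whatever calls the model requests. The sketch below shows one way to do that locally, assuming the OpenAI-compatible `tools` schema. The `read_file` tool, the in-memory workspace, and `dispatch_tool_call` are hypothetical names made up for illustration; they are not part of any MCP server or the Together SDK.

```python
import json
from typing import Any, Callable

# Hypothetical tool declaration in the OpenAI-compatible "tools" format.
READ_FILE_TOOL: dict[str, Any] = {
    "type": "function",
    "function": {
        "name": "read_file",
        "description": "Return the contents of a text file in the workspace.",
        "parameters": {
            "type": "object",
            "properties": {
                "path": {"type": "string", "description": "Relative file path"}
            },
            "required": ["path"],
        },
    },
}

# In-memory stand-in for a real filesystem, so the example has no side effects.
FAKE_WORKSPACE = {"README.md": "# demo project"}


def read_file(path: str) -> str:
    """Look up a file in the fake workspace."""
    return FAKE_WORKSPACE.get(path, f"error: {path} not found")


HANDLERS: dict[str, Callable[..., str]] = {"read_file": read_file}


def dispatch_tool_call(name: str, arguments_json: str) -> str:
    """Run the local handler for one tool call and return its result as text."""
    handler = HANDLERS.get(name)
    if handler is None:
        return f"error: unknown tool {name}"
    return handler(**json.loads(arguments_json))


if __name__ == "__main__":
    print(dispatch_tool_call("read_file", '{"path": "README.md"}'))
```

In a live session, `tools=[READ_FILE_TOOL]` would be passed alongside `model` and `messages`, and each requested call's function name and JSON argument string would be fed to `dispatch_tool_call`. This assumes the response follows the OpenAI-compatible `tool_calls` layout.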

    Multimodal Coding Workflows:
    • Image and video input for visual coding, UI generation from screenshots, and design-to-code workflows
    • Frontend development from visual references with 256K context for large projects
    • Cross-modal reasoning across code, images, and documentation in a single session
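
A screenshot-to-code request could be assembled along the following lines. This is a sketch that assumes the endpoint accepts OpenAI-compatible multimodal content parts (`text` and `image_url`) and base64 data URLs; the helper name `image_message` is illustrative.

```python
import base64
from typing import Any


def image_message(prompt: str, image_bytes: bytes, mime_type: str = "image/png") -> dict[str, Any]:
    """Build a user message that pairs a text prompt with an inline image.

    The image is embedded as a base64 data URL (assumed to be accepted by the API).
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": f"data:{mime_type};base64,{encoded}"}},
        ],
    }


if __name__ == "__main__":
    # Placeholder bytes; in practice, read a real screenshot from disk.
    message = image_message("Generate HTML and CSS that reproduce this layout.", b"abc")
    print(message["content"][1]["image_url"]["url"])
```

The resulting dictionary can be placed in the `messages` list of a chat completion request.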

Model details
  • Model provider
    Moonshot AI
  • Type
    Vision
    Chat
    Code
    LLM
  • Main use cases
    Coding Agents
  • Features
    Function Calling
    JSON Mode
  • Deployment
    Serverless
    On-Demand Dedicated
  • Context length
    256K
  • Input price
    $0.95 / 1M tokens
    $0.19 / 1M tokens (cached)
  • Output price
    $4.00 / 1M tokens
  • Input modalities
    Text
    Image
    Video
  • Output modalities
    Text
  • Released
    June 12, 2026
  • Last updated
    June 12, 2026
  • Category
    Chat
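
The input and output prices listed above can be turned into a rough per-request cost estimate. The sketch below assumes cached tokens are a subset of the input tokens and are billed at the cached rate instead of the standard input rate; the function name and the example token counts are illustrative, and actual billing may differ.

```python
# Prices in USD per 1M tokens, taken from the listing above.
INPUT_PRICE = 0.95
CACHED_INPUT_PRICE = 0.19
OUTPUT_PRICE = 4.00


def estimate_cost(input_tokens: int, output_tokens: int, cached_input_tokens: int = 0) -> float:
    """Estimate the USD cost of one request.

    cached_input_tokens is the portion of input_tokens served from cache
    (assumption: billed at the cached rate instead of the standard input rate).
    """
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    uncached = input_tokens - cached_input_tokens
    total = (
        uncached * INPUT_PRICE
        + cached_input_tokens * CACHED_INPUT_PRICE
        + output_tokens * OUTPUT_PRICE
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical agentic turn: 200K input tokens (150K cached), 8K output tokens.
    print(f"${estimate_cost(200_000, 8_000, cached_input_tokens=150_000):.4f}")
```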