Kimi K2.7 Code
Coding-focused agentic model with improved long-horizon task completion and thinking efficiency
About model
Kimi K2.7 Code is Moonshot AI's coding-focused agentic model built on the Kimi K2.6 architecture, delivering substantial improvements on real-world long-horizon coding tasks and reducing thinking-token usage by approximately 30% compared to K2.6. It shares the same 1T parameter (32B activated) MoE architecture with MoonViT vision encoder, 256K context window, and native INT4 quantization. Thinking and preserve_thinking are always enabled for consistent reasoning across multi-turn agentic sessions. The model scores 62.0 on Kimi Code Bench V2, 81.1% on MCP Mark Verified, and 76.0% on MCP Atlas.
62
Coding benchmark across 10+ languages and production tech stacks
30%
vs Kimi K2.6 with improved long-horizon task completion
81.10%
Tool use across Notion, GitHub, Filesystem, Postgres, and Playwright
- Long-Horizon Coding: Substantial improvements on real-world software engineering tasks across backend services, infrastructure, performance engineering, systems programming, security, and frontend development
- Thinking Efficiency: Approximately 30% reduction in thinking-token usage vs Kimi K2.6, improving cost efficiency across long agentic coding sessions without sacrificing task completion quality
- Agentic Tool Use: 81.1% MCP Mark Verified and 76.0% MCP Atlas across real MCP server environments including Notion, GitHub, Filesystem, Postgres, and Playwright
- Multimodal Input: Text, image, and video input via MoonViT encoder with 256K context and preserve_thinking for consistent multi-turn agentic execution
API usage
Endpoint:
Model card
Architecture Overview:
• 1T total parameter MoE with 32B parameters activated per token
• 384 experts with 8 selected per token plus 1 shared expert; Multi-head Latent Attention (MLA)
• MoonViT vision encoder (400M parameters) for image and video input
• 256K token context window
• Native INT4 quantization
• Thinking and preserve_thinking always enabled — thinking mode cannot be disabled
• Same base architecture as Kimi K2.5 and K2.6; improvements come from coding-focused post-training
Training Methodology:
• Coding-focused post-training built on Kimi K2.6
• Targets real-world long-horizon coding tasks: production incidents, open-source projects, and internal engineering use cases
• Achieves approximately 30% reduction in thinking-token usage vs K2.6
Performance Characteristics:
• Coding: 62.0 Kimi Code Bench V2 (+11.1 over K2.6), 53.6 Program Bench, 35.1 MLS-Bench Lite
• Agentic: 81.1% MCP Mark Verified (+8.3 over K2.6), 76.0% MCP Atlas (+6.6 over K2.6), 46.9 Kimi Claw 24/7 Bench
• Recommended: temperature=1.0, top_p=0.95, 256K context
Prompting
Together AI API Access:
• Access Kimi K2.7 Code via Together AI APIs using the endpoint moonshotai/Kimi-K2.7-Code
• Authenticate using your Together AI API key in request headers
• Thinking and preserve_thinking are always enabled — reasoning context is retained across all turns automatically
• Recommended parameters: temperature=1.0, top_p=0.95
• Available on Together AI serverless and dedicated infrastructure
Applications & use cases
Long-Horizon Software Engineering:
• End-to-end task completion across backend services, infrastructure, and performance engineering
• Production incident resolution and open-source project contributions
• Security engineering, systems programming, and ML/data engineering
• Multi-file refactors and complex debugging across large codebases
Agentic Tool Orchestration:
• MCP tool use across Notion, GitHub, Filesystem, Postgres, and Playwright environments
• Persistent multi-day coworking tasks spanning software engineering, ML research, and trading
• Autonomous long-horizon execution with consistent reasoning via preserve_thinking
Multimodal Coding Workflows:
• Image and video input for visual coding, UI generation from screenshots, and design-to-code workflows
• Frontend development from visual references with 256K context for large projects
• Cross-modal reasoning across code, images, and documentation in a single session
- Model providerMoonshot AI
- TypeVisionChatCodeLLM
- Main use casesCoding Agents
- FeaturesFunction CallingJSON Mode
- DeploymentServerlessOn-Demand Dedicated
- Endpoint
- Context length256K
- Input price
$0.95 / 1M tokens
$0.19 (cached)/1M
- Output price
$4.00 / 1M tokens
- Input modalitiesTextImageVideo
- Output modalitiesText
- ReleasedJune 12, 2026
- Last updatedJune 12, 2026
- CategoryChat