Kimi K2.5 API

State-of-the-art multimodal thinking agent with vision and Agent Swarm

This model is not currently supported on Together AI.

Visit our Models page to view all the latest models.

Introducing Kimi K2.5

Kimi K2 Thinking is Moonshot AI's most capable open-source thinking model, built as a thinking agent that reasons step-by-step while dynamically invoking tools. It sets new state-of-the-art records on Humanity's Last Exam (HLE), BrowseComp, and other benchmarks, and it scales multi-step reasoning depth while keeping tool use stable across 200–300 sequential calls. Native INT4 quantization provides a 2x inference speed-up.

• 50.2% on Humanity's Last Exam (w/ tools): Expert-level multimodal reasoning across 100+ subjects
• 15T tokens (mixed visual and text): Native multimodal pretraining at scale
• 2x inference speed-up: Native INT4 quantization with QAT

Key Capabilities

  • ✓ Native Multimodality: Pre-trained on vision-language tokens, excels in visual knowledge, cross-modal reasoning, and agentic tool use grounded in visual inputs
  • ✓ Coding with Vision: Generates code from visual specifications (UI designs, video workflows) and autonomously chains tools for visual data processing
  • ✓ Agent Swarm: Transitions from single-agent scaling to self-directed, coordinated swarm-like execution—decomposes complex tasks into parallel sub-tasks executed by dynamically instantiated, domain-specific agents
  • ✓ Production-Ready Efficiency: Native INT4 quantization delivers a 2x inference speed-up without accuracy loss, alongside a 256K context window

Kimi K2.5 API Usage

Endpoint

# Basic chat completion
curl -X POST "https://api.together.xyz/v1/chat/completions" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "moonshotai/Kimi-K2-5",
    "messages": [
      {
        "role": "user",
        "content": "What are some fun things to do in New York?"
      }
    ]
  }'
# Image understanding (text plus image input)
curl -X POST https://api.together.xyz/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -d '{
    "model": "moonshotai/Kimi-K2-5",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "Describe what you see in this image."},
        {"type": "image_url", "image_url": {"url": "https://huggingface.co/datasets/patrickvonplaten/random_img/resolve/main/yosemite.png"}}
      ]
    }],
    "max_tokens": 512
  }'
# Code generation
curl -X POST https://api.together.xyz/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -d '{
    "model": "moonshotai/Kimi-K2-5",
    "messages": [{
      "role": "user",
      "content": "Given two binary strings `a` and `b`, return their sum as a binary string"
    }]
  }'

from together import Together

client = Together()

response = client.chat.completions.create(
  model="moonshotai/Kimi-K2-5",
  messages=[
    {
      "role": "user",
      "content": "What are some fun things to do in New York?"
    }
  ]
)
print(response.choices[0].message.content)

from together import Together

client = Together()

response = client.chat.completions.create(
    model="moonshotai/Kimi-K2-5",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe what you see in this image."},
            {"type": "image_url", "image_url": {"url": "https://huggingface.co/datasets/patrickvonplaten/random_img/resolve/main/yosemite.png"}}
        ]
    }]
)
print(response.choices[0].message.content)
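
The image example above passes a public URL. For a local file, one common pattern with OpenAI-compatible chat APIs is to inline the image as a base64 data URL. The sketch below only builds the request payload and makes no network call. It assumes the endpoint accepts `data:` URLs in `image_url.url`, which this page does not confirm. The helper name `build_vision_message` and the placeholder bytes are illustrative, not part of any SDK.

```python
import base64
import json


def build_vision_message(prompt: str, image_bytes: bytes, mime_type: str = "image/png") -> dict:
    """Build one user message with a text part and an inline (base64) image part."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    # Assumption: the API accepts data URLs wherever it accepts an http(s) image URL.
    data_url = f"data:{mime_type};base64,{encoded}"
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": data_url}},
        ],
    }


# Placeholder bytes stand in for a real file read such as open("photo.png", "rb").read().
fake_image = b"\x89PNG\r\n\x1a\n"
message = build_vision_message("Describe what you see in this image.", fake_image)
payload = {"model": "moonshotai/Kimi-K2-5", "messages": [message], "max_tokens": 512}
print(json.dumps(payload, indent=2))
```

The resulting `message` dict has the same shape as the `messages` entry in the example above, so it could be passed to `client.chat.completions.create` in its place.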

from together import Together

client = Together()

response = client.chat.completions.create(
    model="moonshotai/Kimi-K2-5",
    messages=[
        {
            "role": "user",
            "content": "Given two binary strings `a` and `b`, return their sum as a binary string"
        }
    ],
)

print(response.choices[0].message.content)

import Together from 'together-ai';
const together = new Together();

const completion = await together.chat.completions.create({
  model: 'moonshotai/Kimi-K2-5',
  messages: [
    {
      role: 'user',
      content: 'What are some fun things to do in New York?'
    }
  ],
});

console.log(completion.choices[0].message.content);

import Together from "together-ai";

const together = new Together();
const imageUrl = "https://huggingface.co/datasets/patrickvonplaten/random_img/resolve/main/yosemite.png";

async function main() {
  const response = await together.chat.completions.create({
    model: "moonshotai/Kimi-K2-5",
    messages: [{
      role: "user",
      content: [
        { type: "text", text: "Describe what you see in this image." },
        { type: "image_url", image_url: { url: imageUrl } }
      ]
    }]
  });
  
  console.log(response.choices[0]?.message?.content);
}

main();

import Together from "together-ai";

const together = new Together();

async function main() {
  const response = await together.chat.completions.create({
    model: "moonshotai/Kimi-K2-5",
    messages: [{
      role: "user",
      content: "Given two binary strings `a` and `b`, return their sum as a binary string"
    }]
  });
  
  console.log(response.choices[0]?.message?.content);
}

main();

How to use Kimi K2.5

Model details

Architecture Overview:
• Mixture-of-Experts (MoE) architecture with 1T total parameters and 32B activated parameters
• 61 layers in total, including 1 dense layer; the MoE layers use 384 experts, with 8 selected per token
• Multi-head Latent Attention (MLA) mechanism with an attention hidden dimension of 7168
• Native vision encoder: MoonViT with 400M parameters for vision-language integration
• Native INT4 quantization applied to MoE components through Quantization-Aware Training (QAT)
• 256K context window enabling complex long-horizon multimodal agentic tasks
• 160K vocabulary size with SwiGLU activation function
• Unified architecture combining vision and text, instant and thinking modes, conversational and agentic paradigms
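
To make the sparsity figures above concrete, the following back-of-the-envelope sketch computes how much of the model is active for a single token. It uses only the numbers quoted in the list (1T total, 32B activated, 384 experts, 8 selected); the variable names are illustrative.

```python
TOTAL_PARAMS = 1_000_000_000_000   # 1T total parameters
ACTIVE_PARAMS = 32_000_000_000     # 32B activated parameters per token
NUM_EXPERTS = 384                  # experts available in each MoE layer
EXPERTS_PER_TOKEN = 8              # experts selected per token

# Share of all weights that take part in one forward pass for one token.
active_param_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
# Share of experts the router picks in each MoE layer.
expert_fraction = EXPERTS_PER_TOKEN / NUM_EXPERTS

print(f"Active parameters per token: {active_param_fraction:.1%}")
print(f"Experts selected per token:  {expert_fraction:.2%}")
```

The parameter share comes out somewhat higher than the expert share, presumably because non-expert weights such as attention, embeddings, and the dense layer run for every token regardless of routing.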

Training Methodology:
• Continual pretraining on approximately 15 trillion mixed visual and text tokens atop Kimi-K2-Base
• Native multimodal training—pre-trained on vision-language tokens for seamless cross-modal reasoning
• End-to-end trained to interleave chain-of-thought reasoning with function calls and visual grounding
• Quantization-Aware Training (QAT) employed for lossless INT4 inference with 2x speed
• Agent Swarm training—transitions from single-agent scaling to self-directed, coordinated swarm-like execution
• Specialized training for parallel task decomposition and domain-specific agent instantiation
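
The idea of interleaving reasoning with function calls can be sketched as a simple loop: the model emits a thought and either a tool request or a final answer, and each tool result is fed back into the conversation. The sketch below is a minimal illustration, not Moonshot AI's implementation. `fake_model`, the `add` tool, and the message shapes are all hypothetical stand-ins so the example runs without any API access.

```python
from typing import Callable

# Hypothetical tool registry; the tool name and behavior are illustrative only.
TOOLS: dict[str, Callable[[str], str]] = {
    "add": lambda args: str(sum(int(x) for x in args.split("+"))),
}


def fake_model(history: list[dict]) -> dict:
    """Stand-in for a model call: requests one tool call, then answers."""
    tool_results = [m for m in history if m["role"] == "tool"]
    if not tool_results:
        return {"thought": "I should compute the sum.", "tool": "add", "args": "2+3"}
    return {"thought": "I have the result.", "answer": f"The sum is {tool_results[-1]['content']}."}


def run_agent(question: str, max_steps: int = 10) -> str:
    """Alternate between model steps and tool executions until an answer appears."""
    history = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        step = fake_model(history)
        history.append({"role": "assistant", "content": step["thought"]})
        if "answer" in step:
            return step["answer"]
        # Execute the requested tool and feed its output back to the model.
        result = TOOLS[step["tool"]](step["args"])
        history.append({"role": "tool", "content": result})
    raise RuntimeError("step budget exhausted before the model produced an answer")


print(run_agent("What is 2 + 3?"))
```

The `max_steps` budget is a design choice for the sketch: long-horizon agents need some bound on sequential tool calls so that a stuck loop fails loudly instead of running forever.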

Key Capabilities:
• Native Multimodality: Excels in visual knowledge, cross-modal reasoning, and agentic tool use grounded in visual inputs
• Coding with Vision: Generates code from visual specifications (UI designs, video workflows) and autonomously chains tools for visual data processing
• Agent Swarm: Decomposes complex tasks into parallel sub-tasks executed by dynamically instantiated, domain-specific agents
• Vision benchmarks: 78.5% MMMU-Pro, 84.2% MathVision, 90.1% MathVista, 77.5% CharXiv reasoning

Performance Characteristics:
• State-of-the-art 50.2% on Humanity's Last Exam (HLE) with tools across 100+ expert subjects
• Advanced mathematical reasoning: 96.1% AIME 2025, 95.4% HMMT 2025, 81.8% IMO-AnswerBench, 87.4% GPQA-Diamond
• Strong coding capabilities: 76.8% SWE-Bench Verified, 73.0% SWE-Bench Multilingual, 85.0% LiveCodeBench v6
• Agentic search with swarm: 78.4% BrowseComp (swarm mode), 57.5% Seal-0
• Long-context excellence: 79.3% on AA-LCR (avg@3), 69.4% LongBench-v2 (128K context)
• 2x generation speed improvement through native INT4 quantization without performance degradation
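
For readers unfamiliar with INT4, the sketch below shows generic symmetric round-to-nearest quantization of a few weights to 4-bit integers and back. It is a teaching example only: it is not Quantization-Aware Training and not the scheme used for this model, and it assumes a single scale for the whole list, whereas practical schemes usually keep finer-grained scales.

```python
def quantize_int4(weights: list[float]) -> tuple[list[int], float]:
    """Map floats to integers in [-8, 7] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 7 if max_abs > 0 else 1.0
    # Round to the nearest level and clamp to the signed 4-bit range.
    quantized = [max(-8, min(7, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize_int4(quantized: list[int], scale: float) -> list[float]:
    """Recover approximate float weights from the 4-bit integers."""
    return [q * scale for q in quantized]


weights = [0.70, -0.30, 0.12, 0.0]
q, scale = quantize_int4(weights)
restored = dequantize_int4(q, scale)

print("quantized:", q)
print("restored: ", [round(r, 2) for r in restored])
print("max error:", round(max(abs(w - r) for w, r in zip(weights, restored)), 3))
```

Each weight is stored in 4 bits instead of 16, which is where the memory and bandwidth savings come from. The rounding error visible in `restored` is what training-time techniques such as QAT aim to make harmless.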

Prompting Kimi K2.5

Applications & Use Cases

Multimodal Agentic Reasoning:
• Expert-level reasoning across 100+ subjects achieving 50.2% on Humanity's Last Exam with tools
• Vision-grounded reasoning: 78.5% MMMU-Pro, 84.2% MathVision, 90.1% MathVista
• Cross-modal problem solving combining visual understanding with mathematical and logical reasoning
• Competition-level mathematical problem solving: 96.1% AIME 2025, 95.4% HMMT 2025
• Dynamic hypothesis generation from visual and textual inputs with evidence verification

Coding with Vision:
• Generate code from visual specifications: UI designs, mockups, and video workflows
• Autonomous tool chaining for visual data processing and analysis
• Production-level coding: 76.8% SWE-Bench Verified, 73.0% SWE-Bench Multilingual
• Frontend development from visual designs: fully functional HTML, React, and responsive web applications
• Video-to-code generation: analyze video workflows and generate implementation code
• Competitive programming: 85.0% LiveCodeBench v6, 53.6% OJ-Bench

Agent Swarm Orchestration:
• Self-directed task decomposition into parallel sub-tasks
• Dynamically instantiate domain-specific agents for coordinated execution
• Swarm mode performance: 78.4% BrowseComp, 19.4% WideSearch
• Complex research workflows with parallel information gathering and synthesis
• Multi-agent coding projects with specialized sub-agents for different components
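
The decompose-then-run-in-parallel pattern described above can be illustrated with plain `asyncio`. This is a hedged sketch of the general orchestration idea, not the Agent Swarm implementation: `decompose` and `run_sub_agent` are hypothetical placeholders, and a real system would have the model plan the sub-tasks and give each sub-agent its own tools.

```python
import asyncio


def decompose(task: str) -> list[tuple[str, str]]:
    """Hypothetical planner: a fixed split stands in for model-driven decomposition."""
    return [
        ("search-agent", f"find recent articles about {task}"),
        ("data-agent", f"collect benchmark tables about {task}"),
        ("code-agent", f"locate reference implementations of {task}"),
    ]


async def run_sub_agent(role: str, sub_task: str) -> str:
    """Hypothetical sub-agent: a real one would call a model with its own tools."""
    await asyncio.sleep(0.01)  # simulate independent work
    return f"[{role}] finished: {sub_task}"


async def run_swarm(task: str) -> list[str]:
    sub_tasks = decompose(task)
    # Launch every sub-agent concurrently and wait for all of them to finish.
    return await asyncio.gather(*(run_sub_agent(role, sub) for role, sub in sub_tasks))


results = asyncio.run(run_swarm("INT4 quantization"))
for line in results:
    print(line)
```

Because the three sub-tasks are independent, total wall-clock time is close to that of the slowest sub-agent rather than the sum of all three, which is the main appeal of parallel decomposition.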

Visual Understanding & Analysis:
• Native image and video understanding with 400M parameter MoonViT encoder
• Chart and graph reasoning: 77.5% CharXiv reasoning questions
• Document understanding and visual question answering
• Scientific visualization analysis and interpretation
• UI/UX design understanding for code generation

Agentic Search & Web Reasoning:
• Goal-directed web-based reasoning with visual content understanding
• Continuous browsing, searching, and reasoning over multimodal web information
• 78.4% BrowseComp in swarm mode with coordinated sub-agent exploration
• Visual content extraction and analysis from web sources

Long-Horizon Multimodal Workflows:
• Research automation across text and visual sources
• Video analysis workflows with tool-augmented reasoning
• Complex design-to-implementation pipelines
• Multi-step visual data processing and code generation
• 79.3% AA-LCR (avg@3), 69.4% LongBench-v2 with 128K context

Creative & Multimodal Content Generation:
• Image-grounded creative writing and storytelling
• Visual analysis and cultural commentary
• Technical documentation from visual specifications
• Educational content combining visual and textual explanations

Looking for production scale? Deploy on a dedicated endpoint

Deploy Kimi K2.5 on a dedicated endpoint with custom hardware configuration, as many instances as you need, and auto-scaling.
