Models / Deep CogitoCogito / / Cogito v2.1 671B API

Cogito v2.1 671B API

Advanced hybrid reasoning model with self-improving capabilities

Try Now

New

This model is not currently supported on Together AI.

Visit our Models page to view all the latest models.

Introducing Cogito v2.1 671B

Cogito v2.1 671B is Deep Cogito's flagship open-source hybrid reasoning model, built on Iterated Distillation and Amplification (IDA) that learns to think better through self-improvement. Outperforming all US open models and rivaling Claude 4 Opus and O3, Cogito v2.1 achieves frontier-level performance while using 60% shorter reasoning chains than competitors — delivering breakthrough efficiency with 4,894 average tokens per response (lowest among frontier models) at just $1.25 per million tokens.

89.47%

AIME 2025 (Competition Math)

Elite mathematical reasoning outperforming models 10x larger

60%

More Efficient Reasoning

Shorter chains than DeepSeek R1 with equal accuracy

4,894

Avg Tokens per Response

Lowest among all frontier models for massive cost savings

Key Capabilities

✓ Hybrid Reasoning Modes: Seamlessly switch between fast standard responses and deep step-by-step reasoning with visible thought chains for complex problem-solving

✓ Self-Improving Intelligence: IDA methodology distills reasoning discoveries back into parameters, developing machine intuition that anticipates outcomes rather than just searching longer

✓ State-of-the-Art Benchmarks: 98.57% MATH-500, 77.72% GPQA Diamond, 84.69% MMLU Pro, 86.24% Multilingual MMLU across 30+ languages

✓ Production-Ready Efficiency: 128K context window, OpenAI-compatible API, native tool calling support, and $1.25/1M tokens pricing on Together AI

Cogito v2.1 671B API Usage

Endpoint

deepcogito/cogito-v2-1-671b

curl -X POST "https://api.together.xyz/v1/chat/completions" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepcogito/cogito-v2-1-671b",
    "messages": [
      {
        "role": "user",
        "content": "What are some fun things to do in New York?"
      }
    ]
}'

curl -X POST "https://api.together.xyz/v1/images/generations" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepcogito/cogito-v2-1-671b",
    "prompt": "Draw an anime style version of this image.",
    "width": 1024,
    "height": 768,
    "steps": 28,
    "n": 1,
    "response_format": "url",
    "image_url": "https://huggingface.co/datasets/patrickvonplaten/random_img/resolve/main/yosemite.png"
  }'

curl -X POST https://api.together.xyz/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -d '{
    "model": "deepcogito/cogito-v2-1-671b",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "Describe what you see in this image."},
        {"type": "image_url", "image_url": {"url": "https://huggingface.co/datasets/patrickvonplaten/random_img/resolve/main/yosemite.png"}}
      ]
    }],
    "max_tokens": 512
  }'

curl -X POST https://api.together.xyz/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -d '{
    "model": "deepcogito/cogito-v2-1-671b",
    "messages": [{
      "role": "user",
      "content": "Given two binary strings `a` and `b`, return their sum as a binary string"
    }]
  }'

curl -X POST https://api.together.xyz/v1/rerank \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -d '{
    "model": "deepcogito/cogito-v2-1-671b",
    "query": "What animals can I find near Peru?",
    "documents": [
      "The giant panda (Ailuropoda melanoleuca), also known as the panda bear or simply panda, is a bear species endemic to China.",
      "The llama is a domesticated South American camelid, widely used as a meat and pack animal by Andean cultures since the pre-Columbian era.",
      "The wild Bactrian camel (Camelus ferus) is an endangered species of camel endemic to Northwest China and southwestern Mongolia.",
      "The guanaco is a camelid native to South America, closely related to the llama. Guanacos are one of two wild South American camelids; the other species is the vicuña, which lives at higher elevations."
    ],
    "top_n": 2
  }'

curl -X POST https://api.together.xyz/v1/embeddings \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Our solar system orbits the Milky Way galaxy at about 515,000 mph.",
    "model": "deepcogito/cogito-v2-1-671b"
  }'

curl -X POST https://api.together.xyz/v1/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -d '{
    "model": "meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8",
    "prompt": "A horse is a horse",
    "max_tokens": 32,
    "temperature": 0.1,
    "safety_model": "deepcogito/cogito-v2-1-671b"
  }'

curl --location 'https://api.together.ai/v1/audio/generations' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer $TOGETHER_API_KEY' \
  --output speech.mp3 \
  --data '{
    "input": "Today is a wonderful day to build something people love!",
    "voice": "helpful woman",
    "response_format": "mp3",
    "sample_rate": 44100,
    "stream": false,
    "model": "deepcogito/cogito-v2-1-671b"
  }'

curl -X POST "https://api.together.xyz/v1/audio/transcriptions" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -F "model=deepcogito/cogito-v2-1-671b" \
  -F "language=en" \
  -F "response_format=json" \
  -F "timestamp_granularities=segment"

curl --request POST \
  --url https://api.together.xyz/v2/videos \
  --header "Authorization: Bearer $TOGETHER_API_KEY" \
  --header "Content-Type: application/json" \
  --data '{
    "model": "deepcogito/cogito-v2-1-671b",
    "prompt": "some penguins building a snowman"
  }'

curl --request POST \
  --url https://api.together.xyz/v2/videos \
  --header "Authorization: Bearer $TOGETHER_API_KEY" \
  --header "Content-Type: application/json" \
  --data '{
    "model": "deepcogito/cogito-v2-1-671b",
    "frame_images": [{"input_image": "https://cdn.pixabay.com/photo/2020/05/20/08/27/cat-5195431_1280.jpg"}]
  }'

from together import Together

client = Together()

response = client.chat.completions.create(
  model="deepcogito/cogito-v2-1-671b",
  messages=[
    {
      "role": "user",
      "content": "What are some fun things to do in New York?"
    }
  ]
)
print(response.choices[0].message.content)

from together import Together

client = Together()

imageCompletion = client.images.generate(
    model="deepcogito/cogito-v2-1-671b",
    width=1024,
    height=768,
    steps=28,
    prompt="Draw an anime style version of this image.",
    image_url="https://huggingface.co/datasets/patrickvonplaten/random_img/resolve/main/yosemite.png",
)

print(imageCompletion.data[0].url)

from together import Together

client = Together()

response = client.chat.completions.create(
    model="deepcogito/cogito-v2-1-671b",
    messages=[{
    	"role": "user",
      "content": [
        {"type": "text", "text": "Describe what you see in this image."},
        {"type": "image_url", "image_url": {"url": "https://huggingface.co/datasets/patrickvonplaten/random_img/resolve/main/yosemite.png"}}
      ]
    }]
)
print(response.choices[0].message.content)

from together import Together

client = Together()
response = client.chat.completions.create(
  model="deepcogito/cogito-v2-1-671b",
  messages=[
  	{
	    "role": "user", 
      "content": "Given two binary strings `a` and `b`, return their sum as a binary string"
    }
 ],
)

print(response.choices[0].message.content)

from together import Together

client = Together()

query = "What animals can I find near Peru?"

documents = [
  "The giant panda (Ailuropoda melanoleuca), also known as the panda bear or simply panda, is a bear species endemic to China.",
  "The llama is a domesticated South American camelid, widely used as a meat and pack animal by Andean cultures since the pre-Columbian era.",
  "The wild Bactrian camel (Camelus ferus) is an endangered species of camel endemic to Northwest China and southwestern Mongolia.",
  "The guanaco is a camelid native to South America, closely related to the llama. Guanacos are one of two wild South American camelids; the other species is the vicuña, which lives at higher elevations.",
]

response = client.rerank.create(
  model="deepcogito/cogito-v2-1-671b",
  query=query,
  documents=documents,
  top_n=2
)

for result in response.results:
    print(f"Relevance Score: {result.relevance_score}")

from together import Together

client = Together()

response = client.embeddings.create(
  model = "deepcogito/cogito-v2-1-671b",
  input = "Our solar system orbits the Milky Way galaxy at about 515,000 mph"
)

from together import Together

client = Together()

response = client.completions.create(
  model="meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8",
  prompt="A horse is a horse",
  max_tokens=32,
  temperature=0.1,
  safety_model="deepcogito/cogito-v2-1-671b",
)

print(response.choices[0].text)

from together import Together

client = Together()

speech_file_path = "speech.mp3"

response = client.audio.speech.create(
  model="deepcogito/cogito-v2-1-671b",
  input="Today is a wonderful day to build something people love!",
  voice="helpful woman",
)
    
response.stream_to_file(speech_file_path)

from together import Together

client = Together()
response = client.audio.transcribe(
    model="deepcogito/cogito-v2-1-671b",
    language="en",
    response_format="json",
    timestamp_granularities="segment"
)
print(response.text)

from together import Together

client = Together()

# Create a video generation job
job = client.videos.create(
    prompt="A serene sunset over the ocean with gentle waves",
    model="deepcogito/cogito-v2-1-671b"
)

from together import Together

client = Together()

job = client.videos.create(
    model="deepcogito/cogito-v2-1-671b",
    frame_images=[
        {
            "input_image": "https://cdn.pixabay.com/photo/2020/05/20/08/27/cat-5195431_1280.jpg",
        }
    ]
)

import Together from 'together-ai';
const together = new Together();

const completion = await together.chat.completions.create({
  model: 'deepcogito/cogito-v2-1-671b',
  messages: [
    {
      role: 'user',
      content: 'What are some fun things to do in New York?'
     }
  ],
});

console.log(completion.choices[0].message.content);

import Together from "together-ai";

const together = new Together();

async function main() {
  const response = await together.images.create({
    model: "deepcogito/cogito-v2-1-671b",
    width: 1024,
    height: 1024,
    steps: 28,
    prompt: "Draw an anime style version of this image.",
    image_url: "https://huggingface.co/datasets/patrickvonplaten/random_img/resolve/main/yosemite.png",
  });

  console.log(response.data[0].url);
}

main();

import Together from "together-ai";

const together = new Together();
const imageUrl = "https://huggingface.co/datasets/patrickvonplaten/random_img/resolve/main/yosemite.png";

async function main() {
  const response = await together.chat.completions.create({
    model: "deepcogito/cogito-v2-1-671b",
    messages: [{
      role: "user",
      content: [
        { type: "text", text: "Describe what you see in this image." },
        { type: "image_url", image_url: { url: imageUrl } }
      ]
    }]
  });
  
  console.log(response.choices[0]?.message?.content);
}

main();

import Together from "together-ai";

const together = new Together();

async function main() {
  const response = await together.chat.completions.create({
    model: "deepcogito/cogito-v2-1-671b",
    messages: [{
      role: "user",
      content: "Given two binary strings `a` and `b`, return their sum as a binary string"
    }]
  });
  
  console.log(response.choices[0]?.message?.content);
}

main();

import Together from "together-ai";

const together = new Together();

const query = "What animals can I find near Peru?";
const documents = [
  "The giant panda (Ailuropoda melanoleuca), also known as the panda bear or simply panda, is a bear species endemic to China.",
  "The llama is a domesticated South American camelid, widely used as a meat and pack animal by Andean cultures since the pre-Columbian era.",
  "The wild Bactrian camel (Camelus ferus) is an endangered species of camel endemic to Northwest China and southwestern Mongolia.",
  "The guanaco is a camelid native to South America, closely related to the llama. Guanacos are one of two wild South American camelids; the other species is the vicuña, which lives at higher elevations."
];

async function main() {
  const response = await together.rerank.create({
    model: "deepcogito/cogito-v2-1-671b",
    query: query,
    documents: documents,
    top_n: 2
  });
  
  for (const result of response.results) {
    console.log(`Relevance Score: ${result.relevance_score}`);
  }
}

main();

import Together from "together-ai";

const together = new Together();

const response = await client.embeddings.create({
  model: 'deepcogito/cogito-v2-1-671b',
  input: 'Our solar system orbits the Milky Way galaxy at about 515,000 mph',
});

import Together from "together-ai";

const together = new Together();

async function main() {
  const response = await together.completions.create({
    model: "meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8",
    prompt: "A horse is a horse",
    max_tokens: 32,
    temperature: 0.1,
    safety_model: "deepcogito/cogito-v2-1-671b"
  });
  
  console.log(response.choices[0]?.text);
}

main();

import Together from 'together-ai';

const together = new Together();

async function generateAudio() {
   const res = await together.audio.create({
    input: 'Today is a wonderful day to build something people love!',
    voice: 'helpful woman',
    response_format: 'mp3',
    sample_rate: 44100,
    stream: false,
    model: 'deepcogito/cogito-v2-1-671b',
  });

  if (res.body) {
    console.log(res.body);
    const nodeStream = Readable.from(res.body as ReadableStream);
    const fileStream = createWriteStream('./speech.mp3');

    nodeStream.pipe(fileStream);
  }
}

generateAudio();

import Together from "together-ai";

const together = new Together();

const response = await together.audio.transcriptions.create(
  model: "deepcogito/cogito-v2-1-671b",
  language: "en",
  response_format: "json",
  timestamp_granularities: "segment"
});
console.log(response)

import Together from "together-ai";

const together = new Together();

async function main() {
  // Create a video generation job
  const job = await together.videos.create({
    prompt: "A serene sunset over the ocean with gentle waves",
    model: "deepcogito/cogito-v2-1-671b"
  });

import Together from "together-ai";

const together = new Together();

const job = await together.videos.create({
  model: "deepcogito/cogito-v2-1-671b",
  frame_images: [
    {
      input_image: "https://cdn.pixabay.com/photo/2020/05/20/08/27/cat-5195431_1280.jpg",
    }
  ]
});

How to use Cogito v2.1 671B

Model details

Architecture Overview:
• Cogito v2.1 671B employs a sophisticated Mixture-of-Experts (MoE) architecture with 671 billion total parameters, utilizing sparse routing mechanisms to activate only specialized expert subnetworks per token, enabling massive scale without proportional compute costs
• Features a 128K token context window optimized for long-form reasoning, technical documentation, and multi-turn conversations
• Implements a hybrid inference system supporting both standard mode (direct answers using internalized "intuition") and reasoning mode (step-by-step self-reflection with visible thought chains)
• Optimized for efficient serverless deployment on Together AI's infrastructure

Training Methodology - Iterated Distillation & Amplification (IDA):
• Revolutionary self-improvement approach where the model runs reasoning chains during training, then is trained on its own intermediate thoughts to develop stronger "machine intuition"
• Unlike traditional models that rely on extended inference-time reasoning, Cogito distills successful reasoning patterns directly into model parameters
• Training process explicitly rewards shorter, more efficient reasoning paths while discouraging unnecessary computational detours
• Trained on multilingual datasets spanning 30+ languages with emphasis on coding, STEM, instruction following, and general helpfulness
• Total training cost remarkably achieved at under $3.5 million for the entire Cogito family (3B to 671B), demonstrating unprecedented cost efficiency

Performance Characteristics:
• AIME 2025 (Competition Mathematics): 89.47% - outperforming models 10x larger
• MATH-500 benchmark: 98.57% accuracy
• GPQA Diamond (Scientific Reasoning): 77.72%
• SWE-Bench Verified (Coding): 42.00% solve rate
• MMLU Pro (Reasoning & Knowledge): 84.69%
• Multilingual MMLU: 86.24% across 30+ languages
• Average token efficiency: 4,894 tokens per response (lowest among frontier models)
• Performance competitive with DeepSeek v3, matching or exceeding latest 0528 model while using 60% shorter reasoning chains
• Approaches capabilities of closed models like Claude 4 Opus, O3, and GPT-5 across diverse benchmarks
• Demonstrates emergent multimodal reasoning capabilities, able to reason about images despite not being explicitly trained for visual tasks

‍

Prompting Cogito v2.1 671B

Applications & Use Cases

High-Performance Use Cases:
• Advanced Mathematical Problem Solving: Superior performance on competition mathematics (AIME 2025: 89.47%), calculus, optimization problems, and quantitative analysis
• Software Engineering & Code Generation: 42% solve rate on SWE-Bench demonstrates strong debugging, code review, and system design capabilities
• Scientific Research & STEM: 77.72% on GPQA Diamond showcases expertise in physics, chemistry, biology, and interdisciplinary scientific reasoning
• Multilingual Applications: 86.24% on Multilingual MMLU enables global deployment across 30+ languages with native-level comprehension
• Legal & Policy Analysis: Reasoning mode excels at applying precedents, analyzing case law, and providing nuanced legal interpretations

Enterprise Applications:
• Intelligent Document Processing: 128K context window handles entire technical documents, contracts, research papers in single context
• Customer Support Automation: Hybrid mode allows fast responses for simple queries, deep reasoning for complex troubleshooting
• Financial Analysis & Risk Assessment: Strong quantitative reasoning combined with efficient token usage for cost-effective at-scale deployment
• Educational Technology: Step-by-step reasoning mode ideal for tutoring, homework help, and adaptive learning systems
• Research Assistance: Frontier performance at $1.25/1M tokens makes large-scale research analysis economically viable

Developer & Research Applications:
• Rapid Prototyping: Together AI's serverless platform enables instant deployment without infrastructure setup
• Model Experimentation: Compare standard vs reasoning modes in real-time via playground interface
• Benchmark Development: Performance approaching closed frontier models enables reproducible research
• Scalable Research: Serverless infrastructure scales automatically for large-scale experiments

Cost-Sensitive Deployments:
• High-Volume Production: Lowest token usage (4,894 avg) among frontier models translates to 20-40% cost savings vs alternatives
• Serverless Efficiency: Pay-per-use pricing on Together AI eliminates infrastructure costs and management overhead
• Startup & SMB Applications: Frontier capabilities at accessible pricing ($1.25/1M tokens) democratizes advanced AI
• Auto-scaling: Together AI's serverless infrastructure automatically handles traffic spikes without manual intervention

Unique Capabilities:
• Emergent Image Reasoning: Despite no explicit visual training, demonstrates ability to reason about images when presented in context
• Efficiency-First Design: 60% shorter reasoning chains mean faster responses and lower costs without sacrificing accuracy
• Hybrid Intelligence: Seamlessly switch between fast intuition and deep deliberation based on query complexity

‍

Model Provider:

Deep Cogito

Type:

Chat

Variant:

Parameters:

671B (MoE)

Deployment:

✔ Serverless

✔ On-Demand Dedicated

✔ Monthly Reserved

Quantization

FP16

Context length:

128K

Resolution / Duration

Pricing:

Check pricing

Run in playground

Deploy model

Quickstart docs