MiniMax M2.7
Production-scale software engineering with long-horizon agentic execution and native Agent Teams
About the model
MiniMax M2.7 is the first model to meaningfully participate in its own development. An internal version autonomously ran 100+ optimization rounds — analyzing failure trajectories, modifying code, evaluating results, and deciding to keep or revert — achieving a 30% improvement on internal programming benchmarks.
On SWE-Pro, M2.7 scores 56.22%, matching GPT-5.3-Codex, and reaches 55.6% on VIBE-Pro (near Opus 4.6) for end-to-end project delivery across Web, Android, and iOS. On MLE Bench Lite, M2.7 achieved a 66.6% medal rate, second only to Opus-4.6 and GPT-5.4. Native Agent Teams enable stable multi-agent collaboration with role identity and autonomous decision-making across complex state machines, and the model reaches 97% skill compliance across 40+ complex skills on Together AI's production infrastructure.
- 56.22% (SWE-Pro): software engineering across multilingual, real-world codebases
- 66.6% (MLE Bench Lite medal rate): second only to Opus-4.6 and GPT-5.4 across 22 ML competitions
- 100+ optimization rounds: self-directed RL loop achieving a 30% improvement on internal benchmarks
- Software engineering: 56.22% SWE-Pro matching GPT-5.3-Codex; 76.5 SWE Multilingual; 55.6% VIBE-Pro near Opus 4.6 for end-to-end project delivery across Web, Android, and iOS
- Model self-evolution: Autonomously ran 100+ optimization rounds achieving 30% performance improvement; 66.6% MLE Bench Lite medal rate, second only to Opus-4.6 and GPT-5.4
- Native agent teams: Multi-agent collaboration with stable role identity and autonomous decision-making; 97% skill compliance across 40+ complex skills (each 2,000+ tokens)
- Professional work: ELO 1495 on GDPval-AA, highest among open-source models; high-fidelity multi-round editing for Word, Excel, and PPT
- Production-ready infrastructure: 99.9% SLA, serverless and dedicated infrastructure on the AI Native Cloud
API usage
Endpoint:
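Endpoints like this are typically OpenAI-compatible chat completions APIs. A minimal request sketch, assuming that payload shape; the endpoint URL and model identifier below are placeholders, not confirmed values from this page — verify both in the provider console:

```python
import json

# Hypothetical values -- check the provider console for the real
# endpoint URL and model identifier before sending any request.
ENDPOINT = "https://api.together.xyz/v1/chat/completions"  # assumed OpenAI-compatible
MODEL = "minimax/MiniMax-M2.7"  # hypothetical model ID

def build_request(prompt: str, max_tokens: int = 1024) -> dict:
    """Assemble an OpenAI-style chat completions payload."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = build_request("Summarize this stack trace and propose a fix.")
print(json.dumps(payload, indent=2))
```

POST this payload as JSON to the endpoint with your API key in the `Authorization: Bearer …` header.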
Model card
Training through self-optimization
- During development, M2.7 participated in its own training: updating its own memory, building complex skills for RL experiments, and improving its learning process based on results
- During training, an internal version autonomously optimized a programming scaffold over 100+ rounds — analyzing failure trajectories, modifying code, running evaluations, and deciding to keep or revert — achieving a 30% performance improvement
- MLE Bench Lite (22 ML competitions): 66.6% medal rate, second only to Opus-4.6 (75.7%) and GPT-5.4 (71.2%), tying with Gemini-3.1
Professional software engineering
- SWE-Pro: 56.22%, matching GPT-5.3-Codex across multiple programming languages
- VIBE-Pro: 55.6%, near Opus 4.6 — end-to-end project delivery across Web, Android, iOS, and simulation
- SWE Multilingual: 76.5 | Multi SWE Bench: 52.7 | Terminal Bench 2: 57.0% | NL2Repo: 39.8%
- Native agent teams with stable role identity and autonomous decision-making across complex state machines
- System-level reasoning: Correlates monitoring metrics, conducts trace analysis, verifies root causes in databases, makes SRE-level decisions — live production incident recovery reduced to under three minutes
Professional work
- GDPval-AA ELO: 1495 — highest among open-source models, surpassing GPT-5.3
- High-fidelity multi-round editing for Word, Excel, and PPT, producing editable deliverables
- Toolathon: 46.3% accuracy, global top tier
- MM Claw: 62.7%, close to Sonnet 4.6 | 97% skill compliance across 40+ complex skills (each exceeding 2,000 tokens)
Applications & use cases
Professional software engineering:
- SWE-Pro: 56.22%, matching GPT-5.3-Codex across multiple programming languages
- End-to-end project delivery: 55.6% VIBE-Pro, near Opus 4.6, across Web, Android, iOS, and simulation tasks
- System-level reasoning: correlates monitoring metrics, conducts trace analysis, verifies root causes in databases, and makes SRE-level decisions
- Real-world incident recovery reduced to under three minutes
- Terminal Bench 2: 57.0% | SWE Multilingual: 76.5 | NL2Repo: 39.8%
Long-horizon agentic execution:
- Sustains progress across hundreds of rounds and thousands of tool calls
- 66.6% medal rate on MLE Bench Lite (22 ML competitions), second only to Opus-4.6 and GPT-5.4
- Trained via recursive self-optimization: 100+ autonomous rounds of analyze → modify → evaluate → keep or revert during development
- 30% improvement achieved through that self-directed training loop
Native Agent Teams:
- Multi-agent collaboration with stable role identity and autonomous decision-making
- Adversarial reasoning, protocol adherence, and behavioral differentiation as native model capabilities
- 97% skill compliance across 40+ complex skills, each exceeding 2,000 tokens
- MM Claw: 62.7%, close to Sonnet 4.6
Professional work:
- GDPval-AA ELO: 1495, highest among open-source models, surpassing GPT-5.3
- High-fidelity multi-round editing for Word, Excel, and PPT producing editable deliverables
- Toolathon: 46.3% accuracy, global top tier
- Financial modeling: reads annual reports, cross-references research reports, builds revenue forecast models and PPT/Word deliverables autonomously
- Model provider: MiniMax AI
- Type: Reasoning, Code
- Main use cases: Chat, Coding Agents, Function Calling, Reasoning
- Features: Function Calling, JSON Mode, Prompt Caching
- Speed: Medium
- Intelligence: Very High
- Deployment: Monthly Reserved, Serverless
- Endpoint
- Parameters: 229B
- Context length: 228,700 tokens
- Input price: $0.30 / 1M tokens ($0.06 / 1M cached)
- Output price: $1.20 / 1M tokens
- Input modalities: Text
- Output modalities: Text
- Released: April 11, 2026
- Quantization level: FP4
- Category: Chat
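Given the listed prices, estimating per-request cost is simple arithmetic. A quick sketch using the rates from the pricing fields above; it assumes cached tokens replace (rather than add to) regular input tokens, which is a common but unverified billing convention:

```python
# Rates from the listing above, in USD per 1M tokens.
INPUT_PER_M = 0.30
CACHED_INPUT_PER_M = 0.06
OUTPUT_PER_M = 1.20

def request_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate request cost in USD; cached_tokens are the portion of
    input_tokens billed at the cheaper cached rate."""
    fresh = input_tokens - cached_tokens
    return (fresh * INPUT_PER_M
            + cached_tokens * CACHED_INPUT_PER_M
            + output_tokens * OUTPUT_PER_M) / 1_000_000

# e.g. 100k input tokens (half served from cache) + 20k output tokens
print(f"${request_cost(100_000, 20_000, cached_tokens=50_000):.4f}")  # prints $0.0420
```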