gpt-oss-120B

Advanced open reasoning model with enterprise-grade capabilities.

About model

Enterprise-Ready Open Reasoning:
gpt-oss-120B delivers sophisticated chain-of-thought reasoning in a fully open model. Built with community feedback and released under the Apache 2.0 license, this 120B-parameter model offers transparency, customization, and deployment flexibility for organizations that need full control over data security and privacy.

Quickstart guides

RAG

Building a RAG Workflow

Agents

Agent Workflows

Apps

Next.js Chat Quickstart

Performance benchmarks

Model	GPQA Diamond	HLE	LiveCodeBench	MATH500	SWE-bench verified
gpt-oss-120B					75.8%
Competitor closed-source models
Claude Opus 4.6	90.5%	34.2%			78.7%
OpenAI o3	83.3%	24.9%		99.2%	62.3%
OpenAI o1	76.8%			96.4%	48.9%
GPT-4o	49.2%	2.7%	32.3%	89.3%	31.0%

API usage

cURL
Python
TypeScript

Endpoint:

openai/gpt-oss-120b

curl -X POST "https://api.together.xyz/v1/chat/completions" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-oss-120b",
    "messages": [
      {
        "role": "user",
        "content": "What are some fun things to do in New York?"
      }
    ]
}'

from together import Together

client = Together()

response = client.chat.completions.create(
  model="openai/gpt-oss-120b",
  messages=[
    {
      "role": "user",
      "content": "What are some fun things to do in New York?"
    }
  ]
)
print(response.choices[0].message.content)

import Together from 'together-ai';
const together = new Together();

const completion = await together.chat.completions.create({
  model: 'openai/gpt-oss-120b',
  messages: [
    {
      role: 'user',
      content: 'What are some fun things to do in New York?'
    }
  ],
});

console.log(completion.choices[0].message.content);
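
For environments where installing the SDK is not an option, the cURL request above can be reproduced with the Python standard library alone. The following is a minimal sketch, not an official client: the helper names `build_chat_request` and `send` are hypothetical, while the URL, headers, model name, and response shape are taken from the examples above.

```python
import json
import urllib.request

API_URL = "https://api.together.xyz/v1/chat/completions"
MODEL = "openai/gpt-oss-120b"


def build_chat_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the same POST request as the cURL example."""
    body = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> str:
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(request, timeout=60) as resp:
        payload = json.load(resp)
    # Same path the SDK examples read: choices[0].message.content
    return payload["choices"][0]["message"]["content"]
```

Calling `send(build_chat_request("What are some fun things to do in New York?", os.environ["TOGETHER_API_KEY"]))` (after `import os`) should then return the reply text. Keeping request construction separate from sending makes the payload easy to inspect or log before any network call is made.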

Model card

Architecture Overview:
• Mixture-of-Experts (MoE) architecture with SwiGLU activations
• Attention layers alternate between full context and a 128-token sliding window
• A learned attention sink per head
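
The two attention details above can be illustrated with a small pure-Python sketch. It is not the model's implementation: the function names are hypothetical, the choice of which layers are windowed is an assumption (the page only says they alternate), and the sink is modeled in one common way, as an extra learned logit that joins the softmax and is then dropped.

```python
import math
from typing import List, Optional


def attention_mask(seq_len: int, window: Optional[int] = None) -> List[List[bool]]:
    """Causal mask: entry [q][k] is True when query q may attend to key k.

    window=None -> every earlier token is visible (full context).
    window=w    -> only the w most recent tokens, including q itself.
    """
    mask = []
    for q in range(seq_len):
        row = []
        for k in range(seq_len):
            visible = k <= q and (window is None or q - k < window)
            row.append(visible)
        mask.append(row)
    return mask


def layer_window(layer_index: int, window: int = 128) -> Optional[int]:
    """Pick the window for a layer so that layers alternate.

    ASSUMPTION: even layers are windowed and odd layers are full context.
    """
    return window if layer_index % 2 == 0 else None


def softmax_with_sink(scores: List[float], sink: float) -> List[float]:
    """Softmax over the scores plus one extra sink logit, which is then dropped.

    The returned weights sum to less than 1; the remainder went to the sink.
    """
    m = max(max(scores), sink)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    denom = sum(exps) + math.exp(sink - m)
    return [e / denom for e in exps]
```

With a window of 2, the last row of `attention_mask(5, window=2)` exposes only the final two positions, while `attention_mask(5)` exposes all five. The sink gives a head somewhere to put attention mass when no real token is relevant, which is why the returned weights need not sum to 1.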

Training Methodology:
• Comprehensive safety training and evaluation protocols
• Community feedback integration from global listening sessions
• Rigorous testing under Preparedness Framework
• Standard GPT-4o tokenizer with additional Harmony format tokens

Performance Characteristics:
• Native FP4 (MXFP4) quantization of the MoE weights for efficient inference
• 128K context window with RoPE positional encoding
• Chain-of-thought reasoning with adjustable effort levels
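
Adjustable reasoning effort is usually selected per request. The sketch below builds a chat request body with an effort setting. The field name `reasoning_effort` and the values `low`, `medium`, and `high` are assumptions based on the common OpenAI-compatible convention, so check the provider docs before relying on them; `chat_body` is a hypothetical helper.

```python
import json

MODEL = "openai/gpt-oss-120b"
# ASSUMPTION: accepted effort values; confirm against the provider docs.
EFFORT_LEVELS = ("low", "medium", "high")


def chat_body(prompt: str, effort: str = "medium") -> dict:
    """Return a chat-completions request body with a reasoning-effort setting."""
    if effort not in EFFORT_LEVELS:
        raise ValueError(f"effort must be one of {EFFORT_LEVELS}, got {effort!r}")
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": effort,  # ASSUMPTION: parameter name
    }


# Inspect the payload before sending it with any HTTP client or SDK.
print(json.dumps(chat_body("Prove that 17 is prime.", effort="high"), indent=2))
```

Higher effort generally means a longer chain of thought, and therefore more output tokens and latency, so a lower setting is a reasonable default for simple prompts.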

Applications & use cases

Enterprise Applications:
• Complex reasoning and analysis tasks
• Research and development support
• Technical documentation generation
• Strategic planning and decision support

Developer Use Cases:
• Code generation and review
• API development and integration
• System architecture design
• Technical troubleshooting and debugging

Industry Solutions:
• Healthcare: Clinical decision support and medical research
• Finance: Risk analysis and regulatory compliance
• Legal: Contract analysis and legal research
• Education: Curriculum development and tutoring systems

Deployment Scenarios:
• On-premises infrastructure for data sovereignty
• Private cloud deployments for security compliance
• Custom fine-tuning for domain-specific applications
• Integration with existing systems, including multi-modal pipelines (the model itself takes and produces text only)

Model specifications

Model data

Model provider
OpenAI
Type
Reasoning
Chat
Main use cases
Chat
Small & Fast
Medium General Purpose
Features
JSON Mode
Speed
High
Intelligence
High
Deployment
Serverless
On-Demand Dedicated
Monthly Reserved
Endpoint
openai/gpt-oss-120b
Parameters
120B
Context length
128K
Input price
$0.15 / 1M tokens
Output price
$0.60 / 1M tokens
Input modalities
Text
Output modalities
Text

Released
August 4, 2025
Last updated
August 18, 2025
External link
Provider docs
Category
Chat
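
The listed prices translate into per-request cost with simple arithmetic. The sketch below uses the input and output prices from the table above; the helper name `estimate_cost` and the example token counts are illustrative only.

```python
from decimal import Decimal

# Prices from the specification table, in USD per 1M tokens.
INPUT_PRICE_PER_M = Decimal("0.15")
OUTPUT_PRICE_PER_M = Decimal("0.60")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request from its token counts."""
    total = (
        Decimal(input_tokens) * INPUT_PRICE_PER_M
        + Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    )
    return total / TOKENS_PER_M


# Illustrative request: 2,000 prompt tokens and 500 completion tokens.
print(estimate_cost(2_000, 500))
```

For that example, the input and output sides each contribute $0.0003, for a total of $0.0006. `Decimal` is used instead of floats so that small per-request amounts add up exactly over many calls.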
