GPT Image 2

Flagship image generation and editing model with built-in reasoning and layout control

About model

GPT Image 2 is OpenAI's flagship image generation and editing model, released April 21, 2026 as the successor to GPT Image 1. It is the first OpenAI image model with built-in reasoning capabilities, and accepts both text prompts and up to 16 reference images per call for reference-guided generation and in-context editing. The model's headline improvements are strong prompt adherence, photorealistic rendering, and significantly improved text legibility — including readable embedded type in signs, labels, UI elements, and structured visual layouts. Multilingual text rendering covers Latin, Chinese, Japanese, Korean, Hindi, Bengali, and Arabic scripts at above 95% accuracy per OpenAI. Outputs span text-to-image generation and targeted image editing, across resolutions from 1K to 4K and a wide range of aspect ratios.

Text Rendering Accuracy

95%+

Across Latin, Chinese, Japanese, Korean, Hindi, Bengali, and Arabic scripts

Max Reference Images

16

Multi-modal context support, up to 100 MB per reference image

Output Resolution

1K to 4K

Native output spanning 1K, 2K, and 4K quality tiers

Model key capabilities

Text in Images: OpenAI reports above 95% text rendering accuracy across Latin, Chinese, Japanese, Korean, Hindi, Bengali, and Arabic scripts — a meaningful improvement over GPT Image 1, where embedded text was a persistent failure mode. Readable signs, labels, UI elements, and multi-word strings are first-class outputs.
Reference-Guided Generation: Accepts up to 16 reference images per call (up to 100 MB each), enabling style transfer, product comp consistency, and iterative editing workflows without fine-tuning — a direct API input rather than a separate workflow step.
Photorealism and Style Range: OpenAI describes improvements to reflections, materials, lighting, and photographic fidelity, alongside strong coverage of non-photographic styles including illustration, manga, pixel art, and structured layout-sensitive outputs such as posters, packaging, and product comps.
Structured Visual Outputs: Designed to produce layout-sensitive deliverables — posters, packaging, diagrams, infographics, magazine spreads, and product renderings — where spatial composition and text placement need to be precise, not approximate.
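
The reference-image limits above (at most 16 images per call, up to 100 MB each) can be checked on the client before a request is sent. The following is a minimal sketch, not part of any SDK: the `ReferenceImage` container and `validate_references` helper are hypothetical names, and the 100 MB limit is assumed to mean decimal megabytes.

```python
from dataclasses import dataclass

MAX_REFERENCE_IMAGES = 16          # per-call limit stated on this page
MAX_REFERENCE_BYTES = 100 * 10**6  # 100 MB, assuming decimal megabytes


@dataclass(frozen=True)
class ReferenceImage:
    """Hypothetical container: a reference image URL plus its size in bytes."""
    url: str
    size_bytes: int


def validate_references(images):
    """Return a list of problems; an empty list means the batch fits the limits."""
    problems = []
    if len(images) > MAX_REFERENCE_IMAGES:
        problems.append(
            f"{len(images)} reference images supplied; limit is {MAX_REFERENCE_IMAGES}"
        )
    for image in images:
        if image.size_bytes > MAX_REFERENCE_BYTES:
            problems.append(f"{image.url} exceeds 100 MB")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every oversized file in a single pass.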

API usage

Endpoint: openai/gpt-image-2

The same request is shown below in cURL, Python, and TypeScript.

curl -X POST "https://api.together.xyz/v1/images/generations" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-image-2",
    "prompt": "Draw an anime style version of this image.",
    "width": 1024,
    "height": 768,
    "steps": 28,
    "n": 1,
    "response_format": "url",
    "image_url": "https://huggingface.co/datasets/patrickvonplaten/random_img/resolve/main/yosemite.png"
  }'

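from together import Together

# The client reads TOGETHER_API_KEY from the environment.
client = Together()

# Generate an anime-style version of the reference image.
response = client.images.generate(
    model="openai/gpt-image-2",
    prompt="Draw an anime style version of this image.",
    width=1024,
    height=768,
    steps=28,
    image_url="https://huggingface.co/datasets/patrickvonplaten/random_img/resolve/main/yosemite.png",
)

# Each item in response.data carries the URL of one generated image.
print(response.data[0].url)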

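import Together from "together-ai";

// The client reads TOGETHER_API_KEY from the environment.
const together = new Together();

async function main() {
  // Same request as the cURL and Python examples (1024x768).
  const response = await together.images.create({
    model: "openai/gpt-image-2",
    prompt: "Draw an anime style version of this image.",
    width: 1024,
    height: 768,
    steps: 28,
    image_url: "https://huggingface.co/datasets/patrickvonplaten/random_img/resolve/main/yosemite.png",
  });

  console.log(response.data[0].url);
}

// Surface request failures instead of leaving an unhandled rejection.
main().catch(console.error);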
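
The examples above read only the first result (`data[0].url`). When `n` is greater than 1, a caller will likely want every URL. The sketch below assumes the JSON response has already been parsed into a Python dict and that it has the `data` list of items with a `url` field implied by the examples; the helper name `image_urls` is hypothetical.

```python
def image_urls(response):
    """Collect the `url` of every item in a parsed response's `data` list."""
    urls = []
    for item in response.get("data", []):
        url = item.get("url")
        if url:  # skip items that carry no URL
            urls.append(url)
    return urls
```

For example, `image_urls({"data": [{"url": "https://example.com/a.png"}]})` returns a one-element list, and a response with no `data` key yields an empty list rather than an error.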

Model card
Architecture Overview:
• Proprietary vision-generation transformer architecture with built-in semantic reasoning
• Supports multi-modal text and image inputs (up to 16 reference images per call)
• Native multi-resolution generation engine with outputs up to 4K
• Flexible aspect ratio support, including 1:1, 3:2, 2:3, 4:3, 3:4, 4:5, 5:4, 9:16, and 16:9
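
To turn one of the listed aspect ratios into `width` and `height` request values, something like the following could be used. This is a sketch under stated assumptions: the page names the 1K, 2K, and 4K tiers but not their pixel dimensions, so the long-edge values below (1024, 2048, 4096) are assumed, and the API may restrict which exact sizes it accepts. The `dimensions` helper is a hypothetical name.

```python
from fractions import Fraction

# Assumed long-edge pixel counts per tier; not stated on this page.
TIER_LONG_EDGE = {"1K": 1024, "2K": 2048, "4K": 4096}

# Aspect ratios listed in the model card.
SUPPORTED_RATIOS = {"1:1", "3:2", "2:3", "4:3", "3:4", "4:5", "5:4", "9:16", "16:9"}


def dimensions(ratio, tier="1K"):
    """Return (width, height) with the longer side equal to the tier's long edge."""
    if ratio not in SUPPORTED_RATIOS:
        raise ValueError(f"unsupported aspect ratio: {ratio}")
    w, h = (int(part) for part in ratio.split(":"))
    long_edge = TIER_LONG_EDGE[tier]
    if w >= h:
        # Landscape or square: width is the long edge.
        return long_edge, round(long_edge * Fraction(h, w))
    # Portrait: height is the long edge.
    return round(long_edge * Fraction(w, h)), long_edge
```

Under these assumptions, `dimensions("4:3")` gives `(1024, 768)`, the size used in the API examples on this page.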

Features & Interface:
• In-context editing with direct image-to-image edits, background replacement, and asset variations
• Native text compositing for precise, region-level control over lettering style
• Multi-reference input handles style weights, structure masks, and layout baselines

Performance Characteristics:
• Far fewer text fragmentation errors than earlier image models
• Strong spatial awareness for text alignment, boundaries, and multi-line layout grids
• Robust multilingual rendering, including scripts with intricate letterforms such as Hindi and Arabic
Prompting
Together AI API Access:
• Access GPT Image 2 through the Together AI API using the endpoint openai/gpt-image-2
• Authenticate by sending your Together AI API key in the Authorization request header
• Pass a text prompt, optionally with up to 16 reference image URLs
• Priced at $0.053 per generated image on serverless infrastructure
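
The per-image price makes batch costs easy to estimate before a job runs. The sketch below assumes every generated image is billed at the flat $0.053 rate listed on this page, with no other charges; `estimate_cost` is a hypothetical helper, and `Decimal` is used to avoid floating-point rounding in currency arithmetic.

```python
from decimal import Decimal

PRICE_PER_IMAGE = Decimal("0.053")  # USD, serverless price listed on this page


def estimate_cost(calls, images_per_call=1):
    """Estimated USD cost, assuming a flat per-image rate and nothing else billed."""
    if calls < 0 or images_per_call < 1:
        raise ValueError("calls must be >= 0 and images_per_call >= 1")
    return PRICE_PER_IMAGE * calls * images_per_call
```

Under that assumption, 1,000 single-image calls come to $53.00.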
Applications & use cases
Marketing & Graphic Design:
• Generate posters, ad banners, and promotional cards with cleanly typeset, legible copy
• Localize promotional assets into multiple scripts using the model's native multilingual text rendering
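
One way to approach the localization workflow is to build one request body per locale from a shared prompt template. This is an illustrative sketch only: the `localized_payloads` helper and the template format are hypothetical, and the payload fields simply mirror those used in the cURL example on this page.

```python
MODEL = "openai/gpt-image-2"


def localized_payloads(template, headlines, width=1024, height=768):
    """Build one request body per locale from a prompt template.

    `template` must contain a `{headline}` placeholder; `headlines` maps a
    locale label to the exact text that should appear in the image.
    """
    payloads = {}
    for locale, headline in headlines.items():
        payloads[locale] = {
            "model": MODEL,
            "prompt": template.format(headline=headline),
            "width": width,
            "height": height,
            "n": 1,
            "response_format": "url",
        }
    return payloads
```

Each resulting body can then be sent as the JSON payload of a separate request, so a failed locale can be retried without regenerating the others.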

E-commerce & Branding:
• Keep product presentation consistent by passing existing templates as reference images
• Create detailed mockups, complex packaging visuals, and catalog assets that match brand style guidelines

Editorial Production:
• Produce multi-element compositions, clean diagrams, charts, and text-embedded infographics
• Create stylistically consistent illustrations for storybooks, websites, and app backgrounds

Model specifications

Model data

Model provider: OpenAI
Type: Image, Reasoning
Main use cases: Image Generation
Resolution/Duration: 1K; 2K; 4K
Deployment: Serverless
Endpoint: openai/gpt-image-2
Price: $0.053 per image
Input modalities: Text, Image
Output modalities: Image
Released: April 20, 2026
Category: Image
