Gemini 3.1 Flash Image (Nano Banana 2)

Pro-quality image generation at Flash speed with search grounding

About model

Gemini 3.1 Flash Image is Google's image generation model combining Pro-level quality with Flash-level speed. It delivers 94% text rendering accuracy, supports up to 5 consistent characters per scene, and handles complex compositions with up to 14 distinct objects. The model leverages Gemini's real-world knowledge base and real-time web search to generate accurate renderings of specific subjects, infographics, and data visualizations, with native 4K output and creative styles including anime, concept art, illustration, and watercolor.

Text Rendering Accuracy

94%

Precise typography for signs, labels, UI mockups, and logos

Characters Per Scene

Up to 5

Consistent identity preservation across multi-character compositions

Native Output

4K

From 512px to 4096x4096 with flexible aspect ratios

Model key capabilities

• Precision Text Rendering: 94% accuracy for text in images including signs, labels, UI mockups, logos, and marketing materials
• Search-Grounded Generation: Leverages Gemini's knowledge base and real-time web search for factually accurate renderings of specific subjects, infographics, and data visualizations
• Multi-Character & Object Fidelity: Up to 5 consistent characters per scene and 14 distinct objects per composition with identity preservation
• Creative Versatility: Near-photographic quality plus anime, concept art, illustration, and watercolor styles with native 4K output and conversational editing

API usage

Endpoint:

google/flash-image-3.1

cURL:

curl -X POST "https://api.together.xyz/v1/images/generations" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/flash-image-3.1",
    "prompt": "Draw an anime style version of this image.",
    "width": 1024,
    "height": 768,
    "steps": 28,
    "n": 1,
    "response_format": "url",
    "image_url": "https://huggingface.co/datasets/patrickvonplaten/random_img/resolve/main/yosemite.png"
  }'

Python:

from together import Together

# The client reads TOGETHER_API_KEY from the environment.
client = Together()

response = client.images.generate(
    model="google/flash-image-3.1",
    prompt="Draw an anime style version of this image.",
    width=1024,
    height=768,
    steps=28,
    image_url="https://huggingface.co/datasets/patrickvonplaten/random_img/resolve/main/yosemite.png",
)

# Each entry in response.data describes one generated image.
print(response.data[0].url)

TypeScript:

import Together from "together-ai";

// The client reads TOGETHER_API_KEY from the environment.
const together = new Together();

async function main() {
  const response = await together.images.create({
    model: "google/flash-image-3.1",
    prompt: "Draw an anime style version of this image.",
    width: 1024,
    height: 768,
    steps: 28,
    image_url: "https://huggingface.co/datasets/patrickvonplaten/random_img/resolve/main/yosemite.png",
  });

  // Each entry in response.data describes one generated image.
  console.log(response.data[0].url);
}

main().catch(console.error);
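For environments where installing an SDK is not an option, the cURL request above can be reproduced with only the Python standard library. The sketch below is illustrative rather than official: the helper names build_request, extract_urls, and generate are hypothetical, and the response shape ({"data": [{"url": ...}]}) is an assumption inferred from the data[0].url access in the SDK examples.

```python
import json
import urllib.request

# URL taken from the cURL example on this page.
API_URL = "https://api.together.xyz/v1/images/generations"


def build_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the cURL example."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_urls(response_body: dict) -> list:
    """Collect image URLs from a decoded response body.

    Assumption: the body looks like {"data": [{"url": "..."}]}, mirroring
    response.data[0].url in the SDK examples. Entries without a URL are skipped.
    """
    return [item["url"] for item in response_body.get("data", []) if item.get("url")]


def generate(payload: dict, api_key: str) -> list:
    """Send the request and return the image URLs (performs a network call)."""
    with urllib.request.urlopen(build_request(payload, api_key)) as resp:
        return extract_urls(json.load(resp))
```

Calling generate({"model": "google/flash-image-3.1", "prompt": "A watercolor fox"}, api_key) would send the request; build_request and extract_urls can be exercised offline.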

Model card
Architecture Overview:
• Based on Gemini 3 Flash with image generation and conversational editing capabilities
• Combines Pro-level quality with Flash-level speed for rapid iteration
• Native 4K output (up to 4096x4096) across flexible aspect ratios from 512px
• Search grounding using Gemini's real-world knowledge base and real-time web search
• Reference image input for conversational editing workflows
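Because the API examples on this page take explicit width and height values, an aspect ratio has to be turned into pixel dimensions before a request is made. The following is a minimal sketch of that arithmetic. The function name dimensions_for is hypothetical, and rounding each side to a multiple of 16 is an assumption made for illustration, not a documented requirement of the model.

```python
MIN_SIDE = 512   # smallest size quoted on this page
MAX_SIDE = 4096  # largest side quoted on this page (4K)


def dimensions_for(aspect_w: int, aspect_h: int, long_side: int = 1024, multiple: int = 16):
    """Return (width, height) approximating aspect_w:aspect_h.

    The longer side equals `long_side`; the shorter side is scaled to match
    the ratio and rounded to the nearest `multiple` (an assumed granularity).
    """
    if aspect_w <= 0 or aspect_h <= 0:
        raise ValueError("aspect ratio terms must be positive")
    if not MIN_SIDE <= long_side <= MAX_SIDE:
        raise ValueError(f"long_side must be between {MIN_SIDE} and {MAX_SIDE}")

    ratio = aspect_w / aspect_h
    if ratio >= 1:
        width, height = long_side, long_side / ratio   # landscape or square
    else:
        width, height = long_side * ratio, long_side   # portrait

    def snap(value: float) -> int:
        return int(round(value / multiple) * multiple)

    return snap(width), snap(height)
```

For example, dimensions_for(4, 3) gives the 1024x768 size used in the cURL and Python samples.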

Training Methodology:
• Built on Gemini 3 Flash foundation with specialized image generation training
• Optimized for text rendering accuracy, character consistency, and multi-object composition
• Trained with Gemini's knowledge base for factually grounded image generation
• SynthID watermarking with C2PA Content Credentials for AI content provenance

Performance Characteristics:
• 94% text rendering accuracy for signs, labels, UI mockups, and logos
• Up to 5 consistent characters per scene with identity preservation
• Complex compositions with up to 14 distinct objects rendered accurately
• Near-photographic quality with natural lighting and depth of field
• Creative style support including anime, concept art, illustration, and watercolor

Prompting
Together AI API Access:
• Access Gemini 3.1 Flash Image via Together AI APIs using the endpoint google/flash-image-3.1
• Authenticate using your Together AI API key in request headers
• Control output dimensions with height/width parameters (default 1024x1024, up to 4K)
• Use reference_images for text+image conversational editing workflows
• Supports up to 4 outputs per request
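The limits listed above can be checked on the client before a request is sent. This is a sketch under stated assumptions: the ImageRequest class is hypothetical, it assumes that "up to 4 outputs" corresponds to the n field shown in the cURL example, and it assumes that the 512 to 4096 range applies to each side independently. It covers only fields that appear in the examples on this page.

```python
from dataclasses import asdict, dataclass

MODEL = "google/flash-image-3.1"
MIN_SIDE, MAX_SIDE = 512, 4096  # size range quoted on this page
MAX_OUTPUTS = 4                 # "up to 4 outputs per request"


@dataclass
class ImageRequest:
    """Hypothetical helper that validates a request before it is sent."""

    prompt: str
    width: int = 1024   # default size stated in the list above
    height: int = 1024
    n: int = 1          # assumed to be the number of outputs
    response_format: str = "url"
    model: str = MODEL

    def __post_init__(self):
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        for name in ("width", "height"):
            value = getattr(self, name)
            if not MIN_SIDE <= value <= MAX_SIDE:
                raise ValueError(f"{name} must be between {MIN_SIDE} and {MAX_SIDE}")
        if not 1 <= self.n <= MAX_OUTPUTS:
            raise ValueError(f"n must be between 1 and {MAX_OUTPUTS}")

    def to_payload(self) -> dict:
        """Return a dict suitable for use as the JSON request body."""
        return asdict(self)
```

ImageRequest(prompt="A poster with the headline SALE", n=4).to_payload() yields a body that can be passed as the JSON payload, while n=5 or width=8192 raises ValueError before any network call is made.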

Applications & use cases
Marketing & Design:
• Marketing mockups and brand assets with 94% text rendering accuracy
• Posters, greeting cards, and social media content with precise typography
• Product photography with natural lighting and photorealistic quality

Infographics & Data Visualization:
• Data-driven infographics grounded in Gemini's real-world knowledge
• Diagrams and visual explanations with accurate text labels
• Educational materials with factually grounded visual content

Creative & Editorial:
• Multi-character scenes with consistent identity across up to 5 characters
• Complex compositions with up to 14 distinct objects
• Creative styles including anime, concept art, illustration, and watercolor
• Conversational editing via reference images for iterative refinement

Model specifications

Model data

Model provider: Google
Type: Image
Main use cases: Image Generation
Resolution: 512px to 4096x4096
Deployment: Serverless
Endpoint: google/flash-image-3.1
Price: $0.05 / image
Input modalities: Text, Image
Output modalities: Image
Category: Image
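The listed price makes batch costs easy to estimate. The sketch below assumes billing is strictly per generated image, so a request that returns several outputs is charged for each one; that billing rule and the function name estimate_cost are assumptions, not statements from this page.

```python
from decimal import Decimal

# USD per image, from the price listed on this page.
PRICE_PER_IMAGE = Decimal("0.05")


def estimate_cost(requests: int, images_per_request: int = 1) -> Decimal:
    """Estimate the USD cost of a batch, assuming every output image is billed."""
    if requests < 0 or images_per_request < 1:
        raise ValueError("requests must be >= 0 and images_per_request >= 1")
    return PRICE_PER_IMAGE * requests * images_per_request
```

Under these assumptions, 250 requests with 4 outputs each come to estimate_cost(250, 4), which is 1,000 images at $0.05, or $50.00.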
