Gemini 3.1 Flash Image (Nano Banana 2)

Pro-quality image generation at Flash speed with search grounding

About model

Gemini 3.1 Flash Image is Google's image generation model that pairs Pro-level quality with Flash-level speed. It delivers 94% text rendering accuracy, keeps up to 5 characters consistent within a scene, and handles compositions with up to 14 distinct objects. The model draws on Gemini's real-world knowledge base and real-time web search to render specific subjects, infographics, and data visualizations accurately. It outputs natively at up to 4K and supports creative styles including anime, concept art, illustration, and watercolor.

  • Text Rendering Accuracy: 94%. Precise typography for signs, labels, UI mockups, and logos
  • Characters Per Scene: 5. Consistent identity preservation across multi-character compositions
  • Native Output: 4K. From 512px to 4096x4096 with flexible aspect ratios

Model key capabilities
  • Precision Text Rendering: 94% accuracy for text in images including signs, labels, UI mockups, logos, and marketing materials
  • Search-Grounded Generation: Leverages Gemini's knowledge base and real-time web search for factually accurate renderings of specific subjects, infographics, and data visualizations
  • Multi-Character & Object Fidelity: Up to 5 consistent characters per scene and 14 distinct objects per composition with identity preservation
  • Creative Versatility: Near-photographic quality plus anime, concept art, illustration, and watercolor styles with native 4K output and conversational editing
  • API usage

    Endpoint: google/flash-image-3.1

    cURL:

    # Requires TOGETHER_API_KEY to be set in the environment
    curl -X POST "https://api.together.xyz/v1/images/generations" \
      -H "Authorization: Bearer $TOGETHER_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "google/flash-image-3.1",
        "prompt": "Draw an anime style version of this image.",
        "width": 1024,
        "height": 768,
        "steps": 28,
        "n": 1,
        "response_format": "url",
        "image_url": "https://huggingface.co/datasets/patrickvonplaten/random_img/resolve/main/yosemite.png"
      }'

    Python:

    from together import Together

    # The client reads the API key from the TOGETHER_API_KEY environment variable
    client = Together()

    image_completion = client.images.generate(
        model="google/flash-image-3.1",
        width=1024,
        height=768,
        steps=28,
        prompt="Draw an anime style version of this image.",
        image_url="https://huggingface.co/datasets/patrickvonplaten/random_img/resolve/main/yosemite.png",
    )

    # Print the URL of the first generated image
    print(image_completion.data[0].url)

    TypeScript:

    import Together from "together-ai";

    // The client reads the API key from the TOGETHER_API_KEY environment variable
    const together = new Together();

    async function main() {
      const response = await together.images.create({
        model: "google/flash-image-3.1",
        width: 1024,
        height: 1024,
        steps: 28,
        prompt: "Draw an anime style version of this image.",
        image_url: "https://huggingface.co/datasets/patrickvonplaten/random_img/resolve/main/yosemite.png",
      });

      // Print the URL of the first generated image
      console.log(response.data[0].url);
    }

    main().catch(console.error);
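
The examples above print the URL of the generated image but do not fetch it. As a minimal sketch of that next step, the snippet below downloads the bytes behind a URL and writes them to disk using only the Python standard library. The helper name `save_image` and its `timeout` default are hypothetical and not part of the Together SDK, and the sketch assumes the returned URL can be fetched without extra authentication.

```python
import urllib.request
from pathlib import Path


def save_image(url: str, destination: str, timeout: float = 30.0) -> int:
    """Download the bytes behind `url`, write them to `destination`,
    and return the number of bytes written.

    Hypothetical helper: assumes the URL is directly fetchable.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        payload = response.read()
    Path(destination).write_bytes(payload)
    return len(payload)
```

With the Python example above, a call might look like `save_image(image_completion.data[0].url, "anime.png")`; the file name is illustrative, and the actual image format should be checked against the response.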
    
    
  • Model card

    Architecture Overview:
    • Based on Gemini 3 Flash with image generation and conversational editing capabilities
    • Combines Pro-level quality with Flash-level speed for rapid iteration
    • Native 4K output (up to 4096x4096) across flexible aspect ratios from 512px
    • Search grounding using Gemini's real-world knowledge base and real-time web search
    • Reference image input for conversational editing workflows

    Training Methodology:
    • Built on Gemini 3 Flash foundation with specialized image generation training
    • Optimized for text rendering accuracy, character consistency, and multi-object composition
    • Trained with Gemini's knowledge base for factually grounded image generation
    • SynthID watermarking with C2PA Content Credentials for AI content provenance

    Performance Characteristics:
    • 94% text rendering accuracy for signs, labels, UI mockups, and logos
    • Up to 5 consistent characters per scene with identity preservation
    • Complex compositions with up to 14 distinct objects rendered accurately
    • Near-photographic quality with natural lighting and depth of field
    • Creative style support including anime, concept art, illustration, and watercolor

  • Prompting

    Together AI API Access:
    • Access Gemini 3.1 Flash Image via Together AI APIs using the endpoint google/flash-image-3.1
    • Authenticate by sending your Together AI API key as a Bearer token in the Authorization header
    • Control output dimensions with the width and height parameters (default 1024x1024, up to 4096x4096)
    • Use reference_images for text+image conversational editing workflows
    • Request up to 4 outputs per call with the n parameter
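
The limits listed above can be checked on the client before a request is sent. The sketch below is one way to do that, not the SDK's own validation: `build_image_request` is a hypothetical helper, the 512 to 4096 bounds per side are read off the resolution range stated on this page, and the shape of `reference_images` (a list of image URLs) is an assumption. Note that the API usage examples on this page pass a single `image_url` instead.

```python
from __future__ import annotations

from typing import Any, Optional

MODEL = "google/flash-image-3.1"  # endpoint name given on this page
MIN_SIDE, MAX_SIDE = 512, 4096    # assumed per-side bounds, from the stated resolution range
MAX_OUTPUTS = 4                   # "up to 4 outputs per request"


def build_image_request(
    prompt: str,
    width: int = 1024,
    height: int = 1024,
    n: int = 1,
    reference_images: Optional[list[str]] = None,
) -> dict[str, Any]:
    """Validate the arguments and return a JSON-serializable request body."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    for name, side in (("width", width), ("height", height)):
        if not MIN_SIDE <= side <= MAX_SIDE:
            raise ValueError(
                f"{name} must be between {MIN_SIDE} and {MAX_SIDE}, got {side}"
            )
    if not 1 <= n <= MAX_OUTPUTS:
        raise ValueError(f"n must be between 1 and {MAX_OUTPUTS}, got {n}")

    body: dict[str, Any] = {
        "model": MODEL,
        "prompt": prompt,
        "width": width,
        "height": height,
        "n": n,
    }
    # Assumed shape: a list of image URLs used as editing references.
    if reference_images:
        body["reference_images"] = list(reference_images)
    return body
```

The returned dictionary could be serialized as the JSON body of a POST to the images endpoint, or unpacked into the SDK call if the SDK accepts the same keyword names.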

  • Applications & use cases

    Marketing & Design:
    • Marketing mockups and brand assets with 94% text rendering accuracy
    • Posters, greeting cards, and social media content with precise typography
    • Product photography with natural lighting and photorealistic quality

    Infographics & Data Visualization:
    • Data-driven infographics grounded in Gemini's real-world knowledge
    • Diagrams and visual explanations with accurate text labels
    • Educational materials with factually grounded visual content

    Creative & Editorial:
    • Multi-character scenes with consistent identity across up to 5 characters
    • Complex compositions with up to 14 distinct objects
    • Creative styles including anime, concept art, illustration, and watercolor
    • Conversational editing via reference images for iterative refinement

Model details
  • Model provider
    Google
  • Type
    Image
  • Main use cases
    Image Generation
  • Resolution
    512px to 4096x4096
  • Deployment
    Serverless
  • Price
    $0.05 / image
  • Input modalities
    Text
    Image
  • Output modalities
    Image
  • Category
    Image
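
For budgeting, the listed price can be turned into a rough cost estimate. The sketch below assumes every output image is billed at the flat $0.05 shown above, regardless of resolution or of whether a reference image is supplied; that billing rule is an assumption, and `estimate_cost` is a hypothetical helper.

```python
from decimal import Decimal

# USD per image, taken from the price listed on this page (may change).
PRICE_PER_IMAGE = Decimal("0.05")


def estimate_cost(requests: int, images_per_request: int = 1) -> Decimal:
    """Return the estimated cost in USD for a batch of requests.

    Assumes flat per-image billing at PRICE_PER_IMAGE.
    """
    if requests < 0:
        raise ValueError("requests must be non-negative")
    if images_per_request < 1:
        raise ValueError("images_per_request must be at least 1")
    return PRICE_PER_IMAGE * requests * images_per_request
```

Under these assumptions, 250 requests with 4 outputs each come to 1,000 images, or $50.00.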