Gemini 3.1 Flash Image (Nano Banana 2)
Pro-quality image generation at Flash speed with search grounding
About model
Gemini 3.1 Flash Image is Google's image generation model combining Pro-level quality with Flash-level speed. It delivers 94% text rendering accuracy, supports up to 5 consistent characters per scene, and handles complex compositions with up to 14 distinct objects. The model draws on Gemini's real-world knowledge base and real-time web search to render specific subjects, infographics, and data visualizations accurately. It produces native 4K output and supports creative styles including anime, concept art, illustration, and watercolor.
94% text rendering accuracy: Precise typography for signs, labels, UI mockups, and logos
5 consistent characters: Consistent identity preservation across multi-character compositions
4K output: From 512px to 4096x4096 with flexible aspect ratios
- Precision Text Rendering: 94% accuracy for text in images including signs, labels, UI mockups, logos, and marketing materials
- Search-Grounded Generation: Leverages Gemini's knowledge base and real-time web search for factually accurate renderings of specific subjects, infographics, and data visualizations
- Multi-Character & Object Fidelity: Up to 5 consistent characters per scene and 14 distinct objects per composition with identity preservation
- Creative Versatility: Near-photographic quality plus anime, concept art, illustration, and watercolor styles with native 4K output and conversational editing
API usage
Endpoint:
Model card
Architecture Overview:
• Based on Gemini 3 Flash with image generation and conversational editing capabilities
• Combines Pro-level quality with Flash-level speed for rapid iteration
• Native 4K output (up to 4096x4096), with flexible aspect ratios and sizes starting at 512px
• Search grounding using Gemini's real-world knowledge base and real-time web search
• Reference image input for conversational editing workflows
Training Methodology:
• Built on Gemini 3 Flash foundation with specialized image generation training
• Optimized for text rendering accuracy, character consistency, and multi-object composition
• Trained with Gemini's knowledge base for factually grounded image generation
• SynthID watermarking with C2PA Content Credentials for AI content provenance
Performance Characteristics:
• 94% text rendering accuracy for signs, labels, UI mockups, and logos
• Up to 5 consistent characters per scene with identity preservation
• Complex compositions with up to 14 distinct objects rendered accurately
• Near-photographic quality with natural lighting and depth of field
• Creative style support including anime, concept art, illustration, and watercolor
Prompting
Together AI API Access:
• Access Gemini 3.1 Flash Image via Together AI APIs using the endpoint google/flash-image-3.1
• Authenticate using your Together AI API key in request headers
• Control output dimensions with height/width parameters (default 1024x1024, up to 4K)
• Use reference_images for text+image conversational editing workflows
• Supports up to 4 outputs per request
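The access details above can be sketched as a small request builder. This is a minimal sketch, not an official client: the URL `https://api.together.xyz/v1/images/generations`, the `Bearer` authorization scheme, the `prompt` and `n` field names, and the list-of-URLs shape of `reference_images` are assumptions to verify against the Together AI API reference. Only the model name, the `width`/`height` parameters, the 1024x1024 default, the 4K ceiling, and the 4-output limit come from the text above. The helper name `build_request` is hypothetical.

```python
import json
import os
import urllib.request

# Assumed route for Together AI image generation; confirm in the API reference.
API_URL = "https://api.together.xyz/v1/images/generations"
MODEL = "google/flash-image-3.1"  # endpoint name stated above
MIN_SIDE, MAX_SIDE = 512, 4096    # resolution range stated on this page
MAX_OUTPUTS = 4                   # outputs per request stated above


def build_request(prompt, api_key, width=1024, height=1024, n=1,
                  reference_images=None):
    """Build (but do not send) an image generation request."""
    # Client-side checks that mirror the limits listed above.
    if not 1 <= n <= MAX_OUTPUTS:
        raise ValueError(f"n must be between 1 and {MAX_OUTPUTS}")
    for side in (width, height):
        if not MIN_SIDE <= side <= MAX_SIDE:
            raise ValueError(f"sides must be within {MIN_SIDE}..{MAX_SIDE} px")

    payload = {
        "model": MODEL,
        "prompt": prompt,   # assumed field name
        "width": width,
        "height": height,
        "n": n,             # assumed field name for the output count
    }
    if reference_images:
        # Assumed to be a list of image URLs; the exact format is not given here.
        payload["reference_images"] = list(reference_images)

    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    key = os.environ.get("TOGETHER_API_KEY")
    if key:  # only send when a key is configured
        req = build_request("A watercolor poster that reads 'Grand Opening'", key)
        with urllib.request.urlopen(req) as resp:
            print(json.load(resp))
```

Keeping request construction separate from sending makes the size and count limits easy to check locally before any billable call is made.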
Applications & use cases
Marketing & Design:
• Marketing mockups and brand assets with 94% text rendering accuracy
• Posters, greeting cards, and social media content with precise typography
• Product photography with natural lighting and photorealistic quality
Infographics & Data Visualization:
• Data-driven infographics grounded in Gemini's real-world knowledge
• Diagrams and visual explanations with accurate text labels
• Educational materials with factually grounded visual content
Creative & Editorial:
• Multi-character scenes with consistent identity across up to 5 characters
• Complex compositions with up to 14 distinct objects
• Creative styles including anime, concept art, illustration, and watercolor
• Conversational editing via reference images for iterative refinement
- Type: Image
- Main use cases: Image Generation
- Resolution/Duration: 512px to 4096x4096
- Deployment: Serverless
- Endpoint:
- Price: $0.05 / image
- Input modalities: Text, Image
- Output modalities: Image
- Category: Image
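For budgeting, the per-image price and the 4-outputs-per-request limit listed on this page can be combined into a rough estimate. This is a sketch under the assumption that billing is a flat $0.05 per generated image regardless of resolution, which this page does not confirm; the function name `estimate_batch` is hypothetical.

```python
import math
from decimal import Decimal

PRICE_PER_IMAGE = Decimal("0.05")  # USD, as listed on this page
MAX_OUTPUTS_PER_REQUEST = 4        # as listed on this page


def estimate_batch(num_images):
    """Return (requests needed, total cost in USD) for a batch of images."""
    if num_images < 0:
        raise ValueError("num_images must be non-negative")
    requests_needed = math.ceil(num_images / MAX_OUTPUTS_PER_REQUEST)
    # Assumes flat per-image pricing with no resolution surcharge.
    total_cost = PRICE_PER_IMAGE * num_images
    return requests_needed, total_cost


requests_needed, cost = estimate_batch(250)
print(f"{requests_needed} requests, ${cost}")
```

Using `Decimal` instead of floats keeps the currency arithmetic exact.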