NANO-BANANA-2

Nano Banana 2 delivers next-generation text-to-image generation, producing ultra high quality visuals with enhanced detail, realism, and prompt accuracy.

Avg Run Time: 50.000s

Model Slug: nano-banana-2-text-to-image

Release Date: February 26, 2026

Playground

Input

Prompt (required)
Number of Images
Aspect Ratio
Output Format
Resolution
Limit Generations

Default: $0.08 per image (1K base rate). Cost per execution: $0.0800
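
As a quick worked example of the pricing above, the sketch below multiplies the listed 1K base rate by the number of images. It assumes a flat per-image rate at 1K. The page does not state rates for 2K or 4K, so those are deliberately not modeled, and the function name is purely illustrative.

```python
from decimal import Decimal

# Listed default rate: $0.08 per image at the 1K base resolution.
BASE_RATE_1K = Decimal("0.08")


def estimate_cost_1k(num_images: int) -> Decimal:
    """Estimate the cost of one execution at 1K, assuming a flat per-image rate."""
    if num_images < 1:
        raise ValueError("num_images must be at least 1")
    return BASE_RATE_1K * num_images


print(estimate_cost_1k(1))  # 0.08
print(estimate_cost_1k(4))  # 0.32
```

Decimal is used instead of float so that small currency amounts add up without rounding drift.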

API & SDK

Create a Prediction

Send a POST request to create a new prediction. This will return a prediction ID that you'll use to check the result. The request should include your model inputs and API key.

Get Prediction Result

Poll the prediction endpoint with the prediction ID until the result is ready. The API uses long-polling, so you'll need to repeatedly check until you receive a success status.
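
The create-then-poll flow described above might look roughly like the following sketch. Only the model slug comes from this page. The base URL, the X-API-Key header name, the JSON field names ("model", "input", "id", "status"), and the status values are all assumptions made for illustration, so check the provider's API reference before relying on any of them.

```python
import json
import time
import urllib.request
from typing import Callable, Optional

# ASSUMPTIONS: the endpoint, header name, and JSON field names are placeholders.
API_BASE = "https://api.example.com/v1/prediction"  # hypothetical endpoint
MODEL_SLUG = "nano-banana-2-text-to-image"  # slug listed on this page


def build_payload(prompt: str, num_images: int = 1, aspect_ratio: str = "1:1") -> dict:
    """Assemble a request body; the key names are assumed, not documented here."""
    return {
        "model": MODEL_SLUG,
        "input": {
            "prompt": prompt,
            "num_images": num_images,
            "aspect_ratio": aspect_ratio,
        },
    }


def http_json(method: str, url: str, api_key: str, body: Optional[dict] = None) -> dict:
    """Send a JSON request with urllib and decode the JSON response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(url, data=data, method=method)
    req.add_header("Content-Type", "application/json")
    req.add_header("X-API-Key", api_key)  # assumed header name
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_for_result(fetch: Callable[[], dict], interval: float = 2.0,
                    timeout: float = 300.0) -> dict:
    """Call fetch() repeatedly until it reports success; raise on failure or timeout."""
    deadline = time.monotonic() + timeout
    while True:
        result = fetch()
        status = result.get("status")  # assumed field name and values
        if status == "success":
            return result
        if status in ("error", "failed"):
            raise RuntimeError(f"prediction failed: {result}")
        if time.monotonic() >= deadline:
            raise TimeoutError("prediction did not finish in time")
        time.sleep(interval)


def run(prompt: str, api_key: str) -> dict:
    """Create a prediction with POST, then poll its result with GET."""
    created = http_json("POST", API_BASE, api_key, build_payload(prompt))
    prediction_id = created["id"]  # assumed response field
    return wait_for_result(
        lambda: http_json("GET", f"{API_BASE}/{prediction_id}", api_key)
    )
```

Passing the fetch step in as a callable keeps the polling loop independent of the HTTP client, which makes the retry and timeout behavior easy to exercise without network access.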

Readme

Table of Contents

Overview

Technical Specifications

Key Considerations

Tips & Tricks

Capabilities

What Can I Use It For?

Things to Be Aware Of

Limitations

Overview

Nano Banana 2 (Gemini 3.1 Flash Image) is Google's latest text-to-image generation model, combining professional-grade image quality with high speed and efficiency. It addresses the usual tradeoff between visual fidelity and generation latency by delivering Pro-level capabilities at Flash-tier speed. The model brings advanced world knowledge, precise text rendering, and subject consistency to rapid image generation workflows, making high-quality visual creation accessible to developers, designers, and creative teams at scale.

Technical Specifications

Resolution Support: 1K, 2K, and 4K output resolutions
Aspect Ratios: Native support for 1:1, 3:2, 2:3, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9
Input Format: Text prompts with optional image references for grounding and consistency
Output Format: PNG images with SynthID watermark
Subject Consistency: Maintains visual consistency for up to 5 characters and 14 objects in a single workflow
Processing Speed: Flash-tier latency optimized for rapid iteration and high-volume generation
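
The specifications above can be turned into a small pre-flight check before a request is sent. This is a minimal sketch: the allowed values are copied from the list above, while the function name and the exact string spelling the API expects for each value (for example "2K") are assumptions.

```python
from typing import List

# Values copied from the Technical Specifications list.
RESOLUTIONS = ("1K", "2K", "4K")
ASPECT_RATIOS = ("1:1", "3:2", "2:3", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9")
MAX_CHARACTERS = 5   # subject consistency limit for characters
MAX_OBJECTS = 14     # subject consistency limit for objects


def validate_request(resolution: str, aspect_ratio: str,
                     characters: int = 0, objects: int = 0) -> List[str]:
    """Return a list of problems; an empty list means the request fits the listed specs."""
    problems = []
    if resolution not in RESOLUTIONS:
        problems.append(f"unsupported resolution: {resolution}")
    if aspect_ratio not in ASPECT_RATIOS:
        problems.append(f"unsupported aspect ratio: {aspect_ratio}")
    if characters > MAX_CHARACTERS:
        problems.append(f"too many characters: {characters} > {MAX_CHARACTERS}")
    if objects > MAX_OBJECTS:
        problems.append(f"too many objects: {objects} > {MAX_OBJECTS}")
    return problems


print(validate_request("2K", "16:9"))                    # []
print(len(validate_request("8K", "7:5", characters=6)))  # 3
```

Returning every problem at once, rather than raising on the first one, lets a caller show all issues to the user in a single pass.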

Key Considerations

Nano Banana 2 excels in scenarios requiring both speed and quality, making it ideal for production workflows where iteration matters. The model performs best when you need accurate text rendering, consistent character depiction, or real-world grounding through web search integration. Consider using this model for marketing assets, social media content, and rapid prototyping where generation speed directly impacts productivity. The Flash-tier architecture means lower latency and lower cost than Pro models, though for extremely specialized high-fidelity tasks requiring maximum reasoning, Nano Banana Pro remains available as an alternative.

Tips & Tricks

To get the best quality from Nano Banana 2, structure prompts with specific visual details and style references. The model responds well to descriptive language about lighting, texture, and composition. For text-heavy designs, explicitly specify font style and placement in your prompt. When maintaining character consistency across multiple images, reuse the same character description or use the subject consistency feature with clear visual anchors. Use the configurable thinking levels as well: High thinking mode suits complex, multi-layered prompts that require precise instruction following.

Example prompts:

"A photorealistic portrait of a woman with auburn hair in soft golden hour lighting, professional photography style"
"Marketing mockup for a tech startup with clean sans-serif typography, modern minimalist design, 16:9 aspect ratio"
"Storyboard panel showing two characters in a fantasy tavern, maintaining consistent character appearance across the scene"
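
The prompt-structure advice above can be sketched as a tiny helper that joins a subject with optional lighting, style, and text-placement details. The helper and its parameter names are hypothetical and not part of any SDK. Reusing one character description string across calls is one simple way to apply the consistency tip.

```python
def build_prompt(subject: str, lighting: str = "", style: str = "",
                 text_spec: str = "") -> str:
    """Join the subject with any non-empty detail fragments, separated by commas."""
    parts = [subject.strip()]
    for detail in (lighting, style, text_spec):
        if detail.strip():
            parts.append(detail.strip())
    return ", ".join(parts)


# Reuse one description so the character is worded identically in every prompt.
HERO = "a tall knight with a scarred left cheek and a green cloak"

scene_1 = build_prompt(f"Storyboard panel of {HERO} entering a fantasy tavern",
                       lighting="warm candlelight")
scene_2 = build_prompt(f"Storyboard panel of {HERO} seated at a table",
                       lighting="warm candlelight", style="ink and watercolor")

print(scene_1)
print(scene_2)
```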

Capabilities

Advanced Text Rendering: Generate accurate, legible text directly in images for marketing mockups, greeting cards, infographics, and UI designs
Real-World Grounding: Leverage Google Search integration to render specific subjects with enhanced accuracy based on real-time information and web images
Subject Consistency: Maintain visual coherence of multiple characters and objects across iterations for storyboarding and narrative building
Multi-Language Localization: Translate and localize text within images for international markets without regenerating entire compositions
High-Fidelity Visual Quality: Deliver vibrant lighting, rich textures, and sharp details at Flash-tier speed
Flexible Resolution and Aspect Ratio Control: Generate images from 512px to 4K with native support for the 10 aspect ratios listed under Technical Specifications
Precise Instruction Following: Adhere strictly to complex, multi-layered prompts with improved reasoning through configurable thinking levels

What Can I Use It For?

Marketing and Advertising: Create localized ad campaigns with accurate text rendering and visual consistency. Generate product mockups, social media assets, and promotional graphics at scale. Example: "Create a 16:9 banner for a summer sale campaign with bold typography, vibrant colors, and consistent brand imagery across multiple variations".

Content Creation and Storyboarding: Develop visual narratives for comics, animations, or video content while maintaining character consistency across scenes. The subject consistency feature ensures characters retain their appearance throughout multi-image workflows.

Data Visualization and Infographics: Transform notes and data into professional diagrams and infographics with precise text rendering. The model's world knowledge helps create contextually accurate visualizations for complex topics.

Developer Applications: Build dynamic UI generators, creative tools, and image-generation pipelines that require both speed and quality. The Nano Banana 2 text-to-image API supports rapid iteration with low latency, which suits interactive creative applications.

Things to Be Aware Of

While Nano Banana 2 delivers impressive text rendering, extremely complex typography or stylized fonts may occasionally require prompt refinement. Subject consistency works best when character descriptions are detailed and consistent across prompts; vague references may result in appearance variations. The model includes SynthID watermarking on all outputs, which is important for content attribution and detection. Processing speed varies with resolution and thinking level settings: 512px with minimal thinking gives the fastest results, while 4K with high thinking requires more processing time. Real-world grounding through web search may occasionally surface outdated or irrelevant references, so verify critical factual content in generated images.

Limitations

Nano Banana 2 cannot generate images of real, named individuals with photorealistic accuracy due to safety guidelines. The model may struggle with extremely niche or obscure subject matter where web search provides limited reference material. Complex hand anatomy, intricate mechanical details, and certain abstract concepts may not render with perfect accuracy. Subject consistency is limited to 5 characters and 14 objects per workflow, so larger compositions may experience consistency degradation. The model cannot edit existing images in the traditional sense; image-to-image workflows require regeneration rather than targeted modifications. Very long or convoluted prompts may exceed optimal reasoning capacity even with high thinking levels enabled.
