Eachlabs | AI Workflows for app builders
nano-banana-pro

NANO-BANANA

Nano Banana Pro generates high-quality images from text, with sharp details, smooth rendering, and accurate visual output.

Model Slug: nano-banana-pro

API & SDK

Create a Prediction

Send a POST request to create a new prediction. This will return a prediction ID that you'll use to check the result. The request should include your model inputs and API key.

Get Prediction Result

Poll the prediction endpoint with the prediction ID until the result is ready. The API uses long-polling, so you'll need to repeatedly check until you receive a success status.
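The two steps above can be sketched with the Python standard library. This is a minimal sketch under stated assumptions, not the official client: the base URL is a placeholder, and the `X-API-Key` header, the `model`/`input` payload fields, the `predictionID` response field, and the failure status names are all hypothetical. Only the POST-then-poll flow and the "success" status come from the description above; check the official API reference for the real values.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict

# Assumption: placeholder endpoint, replace with the URL from the API reference.
API_BASE = "https://api.example.com/v1/prediction"


def build_create_request(api_key: str, prompt: str,
                         base_url: str = API_BASE) -> urllib.request.Request:
    """Build the POST request that creates a prediction."""
    # Assumption: payload field names and the auth header are illustrative.
    payload = {"model": "nano-banana-pro", "input": {"prompt": prompt}}
    return urllib.request.Request(
        base_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def send_json(request: urllib.request.Request) -> Dict[str, Any]:
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_result(fetch_status: Callable[[], Dict[str, Any]],
                    interval_s: float = 2.0,
                    max_attempts: int = 60,
                    sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Call fetch_status repeatedly until the prediction reports success."""
    for _ in range(max_attempts):
        prediction = fetch_status()
        status = prediction.get("status")
        if status == "success":
            return prediction
        # Assumption: these failure status names are guesses.
        if status in ("error", "failed", "canceled"):
            raise RuntimeError(f"prediction ended with status {status!r}")
        sleep(interval_s)
    raise TimeoutError("prediction did not finish within the attempt limit")


def run(api_key: str, prompt: str) -> Dict[str, Any]:
    """Create a prediction, then poll for its result."""
    created = send_json(build_create_request(api_key, prompt))
    prediction_id = created["predictionID"]  # Assumption: response field name.
    status_request = urllib.request.Request(
        f"{API_BASE}/{prediction_id}", headers={"X-API-Key": api_key}
    )
    return wait_for_result(lambda: send_json(status_request))
```

Passing `fetch_status` and `sleep` in as callables keeps the polling loop independent of the network, so the retry and timeout behaviour can be exercised with canned responses before pointing `run` at a real endpoint.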

Readme

Table of Contents
Overview
Technical Specifications
Key Considerations
Tips & Tricks
Capabilities
What Can I Use It For?
Things to Be Aware Of
Limitations

Overview

Nano Banana Pro is an advanced AI image generation model developed by Google, leveraging the Gemini 3 Pro Image API. It is designed to produce high-quality, photorealistic images from text prompts with exceptional detail, smooth rendering, and accurate visual fidelity. The model is optimized for professional workflows, supporting complex instructions, multi-image fusion, and iterative refinement, making it suitable for both creative and commercial applications. Nano Banana Pro is notable for its integration of real-world knowledge through Google Search, enabling contextually grounded image generation and improved prompt understanding.

Key features include native 2K resolution generation with optional 4K upscaling, advanced text rendering for legible typography, and robust character identity preservation across edits. The model architecture combines a large language model (Gemini 3 Pro) for deep reasoning with a high-fidelity diffusion engine (GemPix 2), allowing for multi-stage planning, evaluation, and improvement of generated images. This fusion enables superior consistency, especially in multi-image scenarios and long editing sessions. Nano Banana Pro stands out for its ability to interpret technical photographic terms, maintain stylistic coherence, and deliver rapid, stable outputs even for complex prompts.

Technical Specifications

  • Architecture: Gemini 3 Pro (LLM) + GemPix 2 (diffusion engine)
  • Parameters: Not publicly disclosed, but based on Gemini 3 Pro architecture
  • Resolution: Native 2K (2048x2048), optional 4K upscaling
  • Input/output formats: text prompts and image files (common formats such as PNG and JPEG) as input; PNG or JPEG as output
  • Performance metrics: ~10 seconds per full-resolution image, high batch stability, improved consistency across multiple generations

Key Considerations

  • The model excels with clear, detailed prompts and benefits from iterative refinement for optimal results
  • Best practices include specifying technical terms (e.g., lens types, lighting cues) and using multi-turn conversations for complex edits
  • Quality vs speed: While generation is fast, higher resolutions and complex prompts may require more processing time
  • Prompt engineering tips: Use precise language, reference real-world objects or styles, and leverage Google Search grounding for contextually accurate outputs
  • Avoid overly abstract or ambiguous prompts, as these can lead to inconsistent or less accurate results

Tips & Tricks

  • For optimal results, structure prompts with specific details about composition, style, and technical requirements
  • Use iterative refinement by making small adjustments to prompts and reviewing outputs in multiple turns
  • To achieve consistent character identity, reference the same subject or style across prompts and edits
  • For professional use, leverage the model's ability to generate legible text in images by specifying font styles and placement
  • Experiment with multi-image fusion for complex scenes, ensuring prompts clearly describe the desired integration
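One way to follow the advice on structuring prompts is to assemble them from named parts so that composition, style, and technical details are never forgotten. The sketch below is illustrative only: the `PromptSpec` class and the "Composition:", "Style:", "Technical:", and "Text:" labels are a hypothetical convention, not a prompt syntax documented for the model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PromptSpec:
    """Hypothetical container for the parts of a structured prompt."""
    subject: str
    composition: str = ""
    style: str = ""
    technical: List[str] = field(default_factory=list)  # e.g. lens, lighting
    text_overlay: str = ""  # text to render in the image, with placement

    def render(self) -> str:
        """Join the non-empty parts into a single prompt string."""
        parts = [self.subject]
        if self.composition:
            parts.append(f"Composition: {self.composition}")
        if self.style:
            parts.append(f"Style: {self.style}")
        if self.technical:
            parts.append("Technical: " + ", ".join(self.technical))
        if self.text_overlay:
            parts.append(f"Text: {self.text_overlay}")
        # Strip stray whitespace and trailing periods before joining.
        cleaned = [p.strip().rstrip(".") for p in parts]
        return ". ".join(cleaned) + "."


spec = PromptSpec(
    subject="A ceramic mug on an oak table",
    composition="centered, shallow depth of field",
    style="product photography",
    technical=["85mm lens", "soft window light"],
)
prompt = spec.render()
```

For iterative refinement, changing one field at a time (for example only `technical`) and re-rendering makes it easier to see which adjustment caused a change in the output.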

Capabilities

  • Generates high-quality images with sharp details and smooth rendering
  • Supports native 2K resolution and optional 4K upscaling for print and professional use
  • Excels at multi-image fusion and character identity preservation
  • Advanced text rendering for legible typography in posters, UI mockups, and marketing assets
  • Superior prompt understanding, especially for technical photographic terms and brand-specific colors
  • Fast processing (about 10 seconds per full-resolution image, per the technical specifications) with stable outputs across batches
  • Contextual grounding using Google Search for fact-based image generation

What Can I Use It For?

  • Professional content creation for social media, marketing, and storytelling
  • Creative projects such as concept art, character design, and scene composition
  • Business use cases including logo design, infographic creation, and product visualization
  • Personal projects like photo editing, meme generation, and digital art
  • Industry-specific applications in advertising, education, and entertainment

Things to Be Aware Of

  • Experimental features like multi-stage planning and reasoning-aware upscaling may require careful prompt structuring
  • Some users report occasional inconsistencies in micro-textures or lighting gradients, especially at higher resolutions
  • Performance can vary based on prompt complexity, with more detailed prompts requiring longer processing times
  • Resource requirements are higher for 4K generation and multi-image fusion, which may impact workflow efficiency
  • Consistency is generally strong but can degrade over very long editing sessions or with highly abstract prompts
  • Positive feedback highlights the model's speed, accuracy, and ability to handle complex instructions
  • Common concerns include occasional text rendering issues for longer phrases and minor artifacts in upscaling

Limitations

  • Primary technical constraints include occasional artifacts in micro-textures and lighting gradients at higher resolutions
  • May not be optimal for highly abstract or ambiguous prompts, where outputs can be less consistent or accurate

Pricing

Pricing Type: Dynamic

1K resolution, 1 image: $0.15

Conditions

Sequence | Num Images | Resolution | Price
1        | 1          | 1K         | $0.15
2        | 2          | 1K         | $0.30
3        | 3          | 1K         | $0.45
4        | 4          | 1K         | $0.60
5        | 1          | 2K         | $0.15
6        | 2          | 2K         | $0.30
7        | 3          | 2K         | $0.45
8        | 4          | 2K         | $0.60
9        | 1          | 4K         | $0.30
10       | 2          | 4K         | $0.60
11       | 3          | 4K         | $0.90
12       | 4          | 4K         | $1.20
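
The twelve conditions above are consistent with a flat per-image rate: $0.15 per image at 1K or 2K and $0.30 per image at 4K, for 1 to 4 images. The sketch below encodes that reading to estimate a request's cost before sending it. The function names are hypothetical, and the linear rule is inferred from the listed rows rather than stated by the provider, so combinations outside the table are rejected instead of extrapolated.

```python
# Per-image rates in cents, inferred from the listed price conditions.
PRICE_PER_IMAGE_CENTS = {"1K": 15, "2K": 15, "4K": 30}
MAX_IMAGES = 4  # The table only lists 1 to 4 images per request.


def estimate_price_cents(num_images: int, resolution: str) -> int:
    """Return the estimated price in cents for one request."""
    if not 1 <= num_images <= MAX_IMAGES:
        raise ValueError(f"num_images must be between 1 and {MAX_IMAGES}")
    key = resolution.upper()
    if key not in PRICE_PER_IMAGE_CENTS:
        raise ValueError(f"no listed price for resolution {resolution!r}")
    return PRICE_PER_IMAGE_CENTS[key] * num_images


def format_usd(cents: int) -> str:
    """Format an integer number of cents as a dollar amount."""
    return f"${cents // 100}.{cents % 100:02d}"


print(format_usd(estimate_price_cents(3, "4K")))  # prints "$0.90"
```

Working in integer cents avoids floating-point rounding when totals are summed across many requests.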