Eachlabs | AI Workflows for app builders
nano-banana-pro-edit

Nano Banana Pro Edit performs refined image-to-image transformations, producing ultra-high-quality outputs guided by your prompt.

Avg Run Time: 85 s

Model Slug: nano-banana-pro-edit

API & SDK

Create a Prediction

Send a POST request to create a new prediction. This will return a prediction ID that you'll use to check the result. The request should include your model inputs and API key.
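As a minimal sketch of this step, the request could be assembled as follows. The endpoint URL is a hypothetical placeholder, and the header name (`X-API-Key`) and input field names (`prompt`, `image_urls`) are assumptions for illustration, not the documented API; only the model slug comes from this page.

```python
import json
import urllib.request

# Hypothetical placeholder endpoint; substitute the URL from the official API docs.
API_URL = "https://api.example.com/v1/prediction/"


def build_prediction_request(
    api_key: str, prompt: str, image_urls: list[str]
) -> urllib.request.Request:
    """Build (but do not send) a POST request that creates a prediction."""
    payload = {
        "model": "nano-banana-pro-edit",  # model slug from this page
        # The input field names below are assumptions for illustration.
        "input": {"prompt": prompt, "image_urls": image_urls},
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-API-Key": api_key,  # header name is an assumption
        },
        method="POST",
    )


def create_prediction(request: urllib.request.Request) -> dict:
    """Send the request and return the decoded JSON body."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Building the request separately from sending it makes the payload easy to inspect before any network call is made. The response is expected to carry the prediction ID, but the name of that field is not given here.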

Get Prediction Result

Poll the prediction endpoint with the prediction ID until the result is ready. The API uses long-polling, so repeat the request until the response reports a success status.
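The polling loop might look like the sketch below. It takes a caller-supplied `fetch` function (hypothetical) that performs one GET against the prediction endpoint and returns the decoded JSON. The `status` field name and the failure values `"error"` and `"failed"` are assumptions; only the success status is mentioned on this page.

```python
import time
from typing import Callable


def wait_for_result(
    fetch: Callable[[], dict],
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch() until it reports success, a failure, or the timeout passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        result = fetch()
        status = result.get("status")  # field name is an assumption
        if status == "success":
            return result
        if status in ("error", "failed"):  # failure values are assumptions
            raise RuntimeError(f"prediction failed: {result}")
        if time.monotonic() >= deadline:
            raise TimeoutError("prediction did not finish in time")
        sleep(interval_s)
```

With an average run time of about 85 seconds, a timeout of a few minutes leaves reasonable headroom. The `sleep` parameter exists so the loop can be exercised without real delays.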

Readme

Table of Contents
Overview
Technical Specifications
Key Considerations
Tips & Tricks
Capabilities
What Can I Use It For?
Things to Be Aware Of
Limitations

Overview

Nano Banana Pro Edit is an advanced image-to-image transformation model developed by Google DeepMind, leveraging the Gemini 3 Pro architecture. The model is designed to generate ultra high-quality, refined outputs based on user prompts, offering a level of creative control and precision that supports both professional and creative workflows. It excels at understanding complex instructions, enabling users to perform sophisticated edits such as adjusting camera angles, changing lighting conditions, and applying nuanced color grading.

Key features include the ability to render accurate, legible text in multiple languages, blend multiple images while maintaining subject consistency, and connect to real-world knowledge for context-rich visualizations. The model stands out for its integration of advanced reasoning capabilities, allowing it to interpret and execute complex, multi-step editing tasks that go beyond traditional image generation. Its unique strengths lie in its studio-quality output, fine-grained control over image attributes, and adaptability to a wide range of creative and professional applications.

Technical Specifications

  • Architecture: Gemini 3 Pro (Google DeepMind)
  • Parameters: Not publicly disclosed
  • Resolution: Supports up to 4K output; various aspect ratios including 16:9
  • Input/Output formats: Accepts image and text prompts; outputs high-resolution images (common formats include PNG and JPEG)
  • Performance metrics: Not explicitly published, but user feedback highlights high fidelity in text rendering, image consistency, and prompt adherence

Key Considerations

  • The model performs best with clear, detailed prompts that specify desired edits or transformations
  • For optimal results, use high-quality input images and provide explicit instructions regarding style, lighting, or composition
  • Overly complex or ambiguous prompts may lead to less predictable results or visual artifacts
  • There is a trade-off between output quality and generation speed, especially at higher resolutions
  • Prompt engineering is crucial; iterative refinement and specificity improve output consistency
  • Masked editing and major scene changes (e.g., day to night) may sometimes produce unnatural results or artifacts

Tips & Tricks

  • Use detailed, structured prompts to guide the model toward specific visual outcomes (e.g., "Change lighting from day to night with soft shadows and warm tones")
  • When editing text within images, specify font style, size, and language for best results
  • For blending multiple images, ensure subjects are clearly described and provide reference images if possible
  • Iteratively refine prompts based on initial outputs; small adjustments can significantly improve results
  • To maintain character or subject consistency across edits, reference previous outputs or provide consistent descriptors
  • For advanced edits, combine masked editing with explicit instructions about the area to modify and the desired effect
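
One way to keep prompts detailed and structured, as the tips above suggest, is to assemble them from named parts. The `EditPrompt` helper below is hypothetical and purely illustrative; the model accepts free-form text, and this layout is an assumption rather than a required format.

```python
from dataclasses import dataclass, field


@dataclass
class EditPrompt:
    """Hypothetical helper that assembles one explicit edit instruction."""

    edit: str                       # the main change to make
    lighting: str = ""              # e.g. "soft shadows and warm tones"
    style: str = ""                 # e.g. "cinematic color grading"
    text_details: str = ""          # font style, size, and language for rendered text
    keep_unchanged: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Join the non-empty parts into a single prompt string."""
        parts = [self.edit.strip().rstrip(".")]
        if self.lighting:
            parts.append(f"Lighting: {self.lighting}")
        if self.style:
            parts.append(f"Style: {self.style}")
        if self.text_details:
            parts.append(f"Text: {self.text_details}")
        if self.keep_unchanged:
            parts.append("Keep unchanged: " + ", ".join(self.keep_unchanged))
        return ". ".join(parts) + "."


prompt = EditPrompt(
    edit="Change lighting from day to night",
    lighting="soft shadows and warm tones",
    keep_unchanged=["subject pose", "background buildings"],
).render()
```

Keeping each attribute in its own field makes iterative refinement easier: one field can be adjusted between runs while the rest of the prompt stays fixed.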

Capabilities

  • Generates and edits images with studio-quality precision and control
  • Renders accurate, legible text in multiple languages, suitable for posters, infographics, and diagrams
  • Performs complex image transformations, including lighting changes, camera angle adjustments, and color grading
  • Blends multiple images into cohesive compositions while maintaining subject consistency
  • Leverages real-world knowledge for context-rich visualizations and data-driven graphics
  • Supports high-resolution outputs up to 4K and various aspect ratios
  • Excels at prompt adherence and nuanced creative tasks

What Can I Use It For?

  • Creating professional marketing materials, posters, and infographics with accurate embedded text
  • Designing educational content and visualizations that require precise data representation
  • Mocking up product prototypes and lifestyle scenes for commercial use
  • Generating creative artwork, surreal compositions, and concept art for personal or professional projects
  • Visualizing complex data or scenarios, such as workout routines or historical scenes, based on user-provided information
  • Supporting advanced creative workflows in design, advertising, and media production

Things to Be Aware Of

  • Some advanced features, such as masked editing or major lighting changes, may occasionally produce unnatural results or visual artifacts
  • The model may struggle with fine details, small faces, or perfect spelling in rendered text, especially in intricate scenes
  • Performance and output consistency can vary depending on prompt complexity and input image quality
  • High-resolution outputs require significant computational resources and may increase generation time
  • Users report strong satisfaction with the model’s reasoning abilities and creative control, particularly for professional-grade outputs
  • Common concerns include occasional inconsistencies in subject rendering and the need for prompt refinement to achieve optimal results
  • Extensive filtering and data labeling are used to minimize harmful content, but users should remain vigilant for edge cases

Limitations

  • May not consistently render fine details, small faces, or perfect text in complex images
  • Can produce visual artifacts or unnatural results with highly complex edits or ambiguous prompts
  • Resource-intensive at high resolutions, which can increase generation time and cost (4K output is priced at twice the 1K and 2K rate)

Pricing

Pricing Type: Dynamic

1K resolution, 1 image: $0.15

Conditions

Sequence  Num Images  Resolution  Price
1         1           1K          $0.15
2         2           1K          $0.30
3         3           1K          $0.45
4         4           1K          $0.60
5         1           2K          $0.15
6         2           2K          $0.30
7         3           2K          $0.45
8         4           2K          $0.60
9         1           4K          $0.30
10        2           4K          $0.60
11        3           4K          $0.90
12        4           4K          $1.20
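
The listed rows appear to follow a flat per-image rate: $0.15 per image at 1K and 2K, and $0.30 per image at 4K. That pattern is an inference from the table, not a stated pricing rule. Under that assumption, the cost of a request can be estimated with a small helper (the function name is hypothetical):

```python
# Per-image price in US cents, read off the pricing table on this page.
PRICE_PER_IMAGE_CENTS = {"1K": 15, "2K": 15, "4K": 30}


def price_usd(num_images: int, resolution: str) -> float:
    """Estimate the price of one request, assuming a flat per-image rate."""
    if not 1 <= num_images <= 4:
        raise ValueError("the pricing table only lists 1 to 4 images")
    if resolution not in PRICE_PER_IMAGE_CENTS:
        raise ValueError(f"unknown resolution: {resolution!r}")
    # Work in integer cents and convert once to avoid accumulating float error.
    return num_images * PRICE_PER_IMAGE_CENTS[resolution] / 100
```

For example, `price_usd(4, "4K")` matches the last row of the table at $1.20.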