NANO-BANANA
Nano Banana Pro Edit performs refined image-to-image transformations, producing ultra-high-quality outputs guided by your prompt.
Avg Run Time: 85 s
Model Slug: nano-banana-pro-edit

API & SDK
Create a Prediction
Send a POST request to create a new prediction. This will return a prediction ID that you'll use to check the result. The request should include your model inputs and API key.
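A minimal sketch of building such a request with Python's standard library. Only the model slug `nano-banana-pro-edit` comes from this page; the base URL, the `/predictions` path, the `x-api-key` header name, and the input field names (`prompt`, `image_urls`) are placeholders assumed for illustration, so substitute the values from your provider's API reference.

```python
import json
import urllib.request

# Hypothetical placeholder: replace with the base URL from the API reference.
API_BASE = "https://api.example.com/v1"
MODEL_SLUG = "nano-banana-pro-edit"  # taken from the model page


def build_create_request(api_key: str, prompt: str, image_urls: list) -> urllib.request.Request:
    """Build (but do not send) a POST request that creates a prediction."""
    payload = {
        "model": MODEL_SLUG,
        # Field names below are assumptions, not documented on this page.
        "input": {"prompt": prompt, "image_urls": image_urls},
    }
    return urllib.request.Request(
        url=f"{API_BASE}/predictions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )
```

The function only constructs the request; passing the result to `urllib.request.urlopen` would send it, and the response body would be expected to carry the prediction ID.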
Get Prediction Result
Poll the prediction endpoint with the prediction ID until the result is ready. The API uses long-polling, so you'll need to repeatedly check until you receive a success status.
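The polling step can be sketched as a small loop with a timeout. This is an illustration under assumptions: `fetch` stands for any function you write that performs the GET request and returns the decoded JSON body, and the `status` key with the values `"success"`, `"failed"`, `"error"`, and `"canceled"` is hypothetical; check the actual response schema before relying on it.

```python
import time
from typing import Any, Callable, Dict


def wait_for_result(
    fetch: Callable[[str], Dict[str, Any]],
    prediction_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch(prediction_id) until it reports a terminal status."""
    deadline = time.monotonic() + timeout_s
    while True:
        body = fetch(prediction_id)
        status = body.get("status")  # assumed field name
        if status == "success":
            return body
        if status in ("failed", "error", "canceled"):  # assumed failure values
            raise RuntimeError(f"prediction {prediction_id} ended with status {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"prediction {prediction_id} not ready after {timeout_s}s")
        sleep(interval_s)
```

With an average run time of about 85 seconds, a timeout of a few minutes and an interval of a couple of seconds is a reasonable starting point; both are parameters so they can be tuned.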
Readme
Overview
Nano Banana Pro Edit is an advanced image-to-image transformation model developed by Google DeepMind, built on the Gemini 3 Pro architecture. The model is designed to generate ultra-high-quality, refined outputs based on user prompts, offering a level of creative control and precision that supports both professional and creative workflows. It excels at understanding complex instructions, enabling users to perform sophisticated edits such as adjusting camera angles, changing lighting conditions, and applying nuanced color grading.
Key features include the ability to render accurate, legible text in multiple languages, blend multiple images while maintaining subject consistency, and connect to real-world knowledge for context-rich visualizations. The model stands out for its integration of advanced reasoning capabilities, allowing it to interpret and execute complex, multi-step editing tasks that go beyond traditional image generation. Its unique strengths lie in its studio-quality output, fine-grained control over image attributes, and adaptability to a wide range of creative and professional applications.
Technical Specifications
- Architecture: Gemini 3 Pro (Google DeepMind)
- Parameters: Not publicly disclosed
- Resolution: Supports up to 4K output; various aspect ratios including 16:9
- Input/Output formats: Accepts image and text prompts; outputs high-resolution images (common formats include PNG and JPEG)
- Performance metrics: Not explicitly published, but user feedback highlights high fidelity in text rendering, image consistency, and prompt adherence
Key Considerations
- The model performs best with clear, detailed prompts that specify desired edits or transformations
- For optimal results, use high-quality input images and provide explicit instructions regarding style, lighting, or composition
- Overly complex or ambiguous prompts may lead to less predictable results or visual artifacts
- There is a trade-off between output quality and generation speed, especially at higher resolutions
- Prompt engineering is crucial; iterative refinement and specificity improve output consistency
- Masked editing and major scene changes (e.g., day to night) may sometimes produce unnatural results or artifacts
Tips & Tricks
- Use detailed, structured prompts to guide the model toward specific visual outcomes (e.g., "Change lighting from day to night with soft shadows and warm tones")
- When editing text within images, specify font style, size, and language for best results
- For blending multiple images, ensure subjects are clearly described and provide reference images if possible
- Iteratively refine prompts based on initial outputs; small adjustments can significantly improve results
- To maintain character or subject consistency across edits, reference previous outputs or provide consistent descriptors
- For advanced edits, combine masked editing with explicit instructions about the area to modify and the desired effect
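One way to keep prompts structured and consistent across iterations is to assemble them from named parts. The helper below is a hypothetical convenience, not part of any SDK; the labels `Lighting`, `Style`, and `Keep unchanged` are an illustrative convention, since the model accepts free-form text.

```python
def build_edit_prompt(edit: str, lighting: str = "", style: str = "", keep: str = "") -> str:
    """Join the main edit instruction with optional constraints into one prompt."""
    parts = [edit.strip().rstrip(".")]
    if lighting:
        parts.append(f"Lighting: {lighting.strip()}")
    if style:
        parts.append(f"Style: {style.strip()}")
    if keep:
        # Stating what must stay the same helps with subject consistency.
        parts.append(f"Keep unchanged: {keep.strip()}")
    return ". ".join(parts) + "."


prompt = build_edit_prompt(
    "Change lighting from day to night",
    lighting="soft shadows and warm tones",
    keep="the subject's face and clothing",
)
```

Reusing the same `keep` descriptor across successive edits is a simple way to apply the consistency tip above.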
Capabilities
- Generates and edits images with studio-quality precision and control
- Renders accurate, legible text in multiple languages, suitable for posters, infographics, and diagrams
- Performs complex image transformations, including lighting changes, camera angle adjustments, and color grading
- Blends multiple images into cohesive compositions while maintaining subject consistency
- Leverages real-world knowledge for context-rich visualizations and data-driven graphics
- Supports high-resolution outputs up to 4K and various aspect ratios
- Excels at prompt adherence and nuanced creative tasks
What Can I Use It For?
- Creating professional marketing materials, posters, and infographics with accurate embedded text
- Designing educational content and visualizations that require precise data representation
- Mocking up product prototypes and lifestyle scenes for commercial use
- Generating creative artwork, surreal compositions, and concept art for personal or professional projects
- Visualizing complex data or scenarios, such as workout routines or historical scenes, based on user-provided information
- Supporting advanced creative workflows in design, advertising, and media production
Things to Be Aware Of
- Some advanced features, such as masked editing or major lighting changes, may occasionally produce unnatural results or visual artifacts
- The model may struggle with fine details, small faces, or perfect spelling in rendered text, especially in intricate scenes
- Performance and output consistency can vary depending on prompt complexity and input image quality
- High-resolution outputs require significant computational resources and may increase generation time
- Users report strong satisfaction with the model’s reasoning abilities and creative control, particularly for professional-grade outputs
- Common concerns include occasional inconsistencies in subject rendering and the need for prompt refinement to achieve optimal results
- Extensive filtering and data labeling are used to minimize harmful content, but users should remain vigilant for edge cases
Limitations
- May not consistently render fine details, small faces, or perfect text in complex images
- Can produce visual artifacts or unnatural results with highly complex edits or ambiguous prompts
- Resource-intensive at high resolutions, potentially limiting usability on lower-end hardware
Pricing
Pricing Type: Dynamic
1K resolution, 1 image: $0.15
Conditions
| Sequence | Num Images | Resolution | Price |
|---|---|---|---|
| 1 | 1 | 1K | $0.15 |
| 2 | 2 | 1K | $0.30 |
| 3 | 3 | 1K | $0.45 |
| 4 | 4 | 1K | $0.60 |
| 5 | 1 | 2K | $0.15 |
| 6 | 2 | 2K | $0.30 |
| 7 | 3 | 2K | $0.45 |
| 8 | 4 | 2K | $0.60 |
| 9 | 1 | 4K | $0.30 |
| 10 | 2 | 4K | $0.60 |
| 11 | 3 | 4K | $0.90 |
| 12 | 4 | 4K | $1.20 |
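The table reduces to a per-image rate: $0.15 at 1K and 2K, and $0.30 at 4K, multiplied by the number of images. A short sketch of that arithmetic follows; the function name `estimate_price` is illustrative, and the rates are read off the table above and may change, since pricing is listed as dynamic.

```python
from decimal import Decimal

# Per-image prices in USD, derived from the Conditions table.
PRICE_PER_IMAGE = {
    "1K": Decimal("0.15"),
    "2K": Decimal("0.15"),
    "4K": Decimal("0.30"),
}


def estimate_price(num_images: int, resolution: str) -> Decimal:
    """Return the listed price in USD for one request."""
    if not 1 <= num_images <= 4:
        raise ValueError("the table lists 1 to 4 images per request")
    try:
        unit = PRICE_PER_IMAGE[resolution.upper()]
    except KeyError:
        raise ValueError(f"unknown resolution: {resolution!r}") from None
    return unit * num_images
```

`Decimal` is used instead of floats so that totals such as three 4K images come out as exactly 0.90 without rounding noise.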
