GPT-IMAGE
GPT Image 2 creates more advanced images with deeper prompt understanding, stronger compositional coherence, more realistic lighting, and richer fine-detail rendering.
Avg Run Time: 100.000s
Model Slug: gpt-image-v2-edit
Release Date: April 21, 2026

API & SDK
Create a Prediction
Send a POST request to create a new prediction. This will return a prediction ID that you'll use to check the result. The request should include your model inputs and API key.
Get Prediction Result
Poll the prediction endpoint with the prediction ID until the result is ready. The API is asynchronous, so you'll need to check repeatedly until you receive a success status.
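The two steps above can be sketched in Python. The base URL and JSON field names below are assumptions for illustration, not the real schema — consult the each::labs API reference for exact endpoints and fields. The polling loop takes a status-fetching callable rather than hard-coding an HTTP call, which keeps the retry logic testable.

```python
import time
from typing import Callable, Dict

# Hypothetical base URL -- replace with the real each::labs endpoint.
API_BASE = "https://api.example-eachlabs.com/v1"


def build_prediction_payload(image_url: str, prompt: str,
                             size: str = "1024x1024",
                             quality: str = "medium") -> Dict:
    """Assemble the JSON body for the create-prediction POST.

    Field names are assumptions; check the platform docs for the
    exact request schema.
    """
    return {
        "model": "gpt-image-v2-edit",
        "input": {
            "image": image_url,
            "prompt": prompt,
            "size": size,
            "quality": quality,
        },
    }


def poll_until_done(fetch_status: Callable[[], Dict],
                    interval_s: float = 2.0,
                    timeout_s: float = 300.0) -> Dict:
    """Call fetch_status() (e.g. a GET on the prediction ID) until the
    prediction reports a terminal status, then return that status."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = fetch_status()
        if status.get("status") in ("success", "error"):
            return status
        time.sleep(interval_s)
    raise TimeoutError("prediction did not finish before the timeout")
```

In production, `fetch_status` would wrap an authenticated HTTP GET using your API key and the prediction ID returned by the create call.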
Readme
Overview
GPT Image | v2 | Edit Overview
GPT Image | v2 | Edit, from OpenAI's GPT Image family, enables precise image-to-image transformations using natural language instructions, allowing users to modify existing images while preserving key details. This model solves the challenge of controlled editing in AI workflows, offering superior instruction-following compared to diffusion-based predecessors like DALL-E. Integrated natively into ChatGPT and the OpenAI API, it supports iterative refinement for professional applications such as design tweaks and content creation.
As part of the GPT Image series, the successor to DALL-E, GPT Image | v2 | Edit builds on GPT Image 1.5's foundation, emphasizing autoregressive architecture for advanced photorealism and edit precision. Developers access it via the GPT Image | v2 | Edit API or platforms like each::labs, streamlining OpenAI image-to-image tasks in automated pipelines. Its standout differentiator is maintaining image consistency during edits, generating results up to four times faster than prior versions.
Technical Specifications
- Resolution Support: 1024x1024 (1:1 square), 1536x1024 (3:2 landscape), 1024x1536 (2:3 portrait).
- Input/Output Formats: JPEG or PNG; supports base64 output or URL delivery via API.
- Image Editing Mode: Instruction-based image-to-image, using input image and text prompt for modifications.
- Processing Time: Up to 4x faster than GPT Image 1; async polling required for results, typically seconds to minutes.
- Quality Settings: Low, medium, high options for balancing speed and fidelity.
- Architecture: Autoregressive model, distinct from diffusion methods, enabling precise edits and multimodal integration.
API calls specify the model as an "openai/gpt-image-1.5" variant; image inputs are priced 20% lower than GPT Image 1.
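A small guard like the following keeps requests within the supported resolutions and quality tiers listed above; the function name and error style are illustrative, not part of any SDK.

```python
# Supported output resolutions and quality tiers, per the spec above.
SUPPORTED_SIZES = {"1024x1024", "1536x1024", "1024x1536"}
QUALITY_TIERS = {"low", "medium", "high"}


def validate_edit_params(size: str, quality: str) -> None:
    """Raise ValueError before a request is sent with unsupported values."""
    if size not in SUPPORTED_SIZES:
        raise ValueError(f"unsupported size {size!r}; "
                         f"choose one of {sorted(SUPPORTED_SIZES)}")
    if quality not in QUALITY_TIERS:
        raise ValueError(f"unsupported quality {quality!r}; "
                         f"choose one of {sorted(QUALITY_TIERS)}")
```

Validating locally avoids burning a queued prediction on a request the API would reject anyway.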
Key Considerations
Before using GPT Image | v2 | Edit, ensure access to an OpenAI API key or each::labs integration for seamless deployment. It excels in scenarios requiring precise, instruction-driven edits over full regenerations, ideal for workflows integrating text and vision models. Processing is asynchronous, so plan for polling status in production apps—expect variable times based on queue and quality settings.
Image input and output are billed at reduced rates, but high-quality outputs demand more compute; the medium setting balances cost and fidelity. Best for users needing consistency in iterative edits versus one-shot generations from competitors. Prerequisites include a source image and descriptive prompt; test via ChatGPT for quick validation before API scaling.
Tips & Tricks
For optimal results with GPT Image | v2 | Edit, craft prompts that reference specific image regions, e.g., "Replace the background with a sunset while keeping the subject's pose and lighting intact." This leverages its instruction-following strength. Use iterative workflows: generate a base, then refine with follow-up edits like "Enhance text readability on the sign without altering the overall composition."
Optimize parameters by selecting "high" quality for photorealism and "1536x1024" for landscapes; enable sync mode only for low-latency needs. Combine with GPT text models for dynamic prompt generation. Example prompts:
- "Edit this photo to add a cyberpunk cityscape behind the car, matching neon lighting."
- "Change the outfit to Victorian attire, preserve facial details and expression."
- "Insert product logo on the bottle label clearly, adjust shadows for realism."
Avoid vague instructions; specificity yields better detail retention.
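The iterative workflow described above — generate a base result, then refine it with follow-up instructions — is just a loop that feeds each output back in as the next input. Here `run_edit` stands in for whatever call submits one edit (an assumption, not a real SDK function):

```python
from typing import Callable, Iterable


def iterative_edit(source_image: str,
                   instructions: Iterable[str],
                   run_edit: Callable[[str, str], str]) -> str:
    """Apply edit instructions in sequence, feeding each result back
    in as the source for the next edit.

    run_edit(image, prompt) must return a reference (e.g. a URL) to
    the edited image; it is injected so the chaining logic stays
    independent of any particular API client.
    """
    image = source_image
    for prompt in instructions:
        image = run_edit(image, prompt)
    return image
```

Chaining small, specific instructions this way plays to the model's consistency across iterative edits better than one large compound prompt.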
Capabilities
- Precise instruction-based image editing, modifying elements while preserving unchanged details.
- Image-to-image transformations with natural language, supporting complex scene adjustments.
- Advanced photorealism and consistency across iterative edits.
- Multi-size output: square, landscape, portrait resolutions up to 1536x1024.
- Integration with GPT ecosystem for text-vision workflows and automated pipelines.
- Format flexibility: JPEG/PNG outputs, base64 or URL delivery.
- Quality tiers (low/medium/high) for speed-fidelity tradeoffs.
- Cost-efficient image I/O, 20% cheaper than prior versions.
What Can I Use It For?
Use Cases for GPT Image | v2 | Edit
Designers: Refine product mockups by editing labels and packaging for brand consistency, e.g., "Add logo to the bottle with realistic reflections, keep product shape intact." Leverages precise text rendering.
Marketers: Adapt campaign visuals iteratively, such as "Replace background to urban street, enhance text on banners for readability." Uses instruction-following for quick variations.
Developers: Build AI pipelines on each::labs, polling API for edited UI screenshots: "Update dashboard elements to dark mode, preserve data layout." Integrates multimodal capabilities.
Content Creators: Edit photos for social media, prompt "Change outfit to fantasy armor, match lighting and pose." Ensures photorealistic results with detail retention.
Things to Be Aware Of
GPT Image | v2 | Edit may underperform on highly complex multi-object edits without region-specific prompts, leading to unintended changes. Async processing requires status polling; failures occur if prompts exceed token limits or queues peak. Common mistakes include vague instructions like "make it better," which yield inconsistent outputs—always specify changes explicitly.
Resource needs scale with quality: high settings increase latency and cost. Test edge cases like dense text or fine details in ChatGPT first. Multilingual support improves in v2 but verify for non-Latin scripts.
Limitations
GPT Image | v2 | Edit cannot generate videos or audio; strictly image-to-image. Struggles with extreme aspect ratios beyond specified sizes or fully novel compositions without source images. Text rendering, while advanced, may falter in overly dense or artistic fonts. No real-time sync without polling; not suited for ultra-low latency apps. Input images must be compatible formats; rate limits apply via API.
Pricing
Pricing Type: Dynamic
gpt-image-2 edit:
- Text input: $5 per million tokens
- Image input: $10 per million tokens
- Text output: $40 per million tokens
- Image output: $30 per million tokens
Note: gpt-image-2 always processes reference images at high fidelity, so input image token counts may be higher than for other GPT Image models.
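As a sanity check on these rates, a per-request estimate is just token counts times price per million. Token counts per image vary by size and fidelity, so the counts passed in below are placeholders, not measured values:

```python
# Published rates in USD per million tokens (from the pricing list above).
PRICE_PER_M_TOKENS = {
    "text_in": 5.0,
    "image_in": 10.0,
    "text_out": 40.0,
    "image_out": 30.0,
}


def estimate_cost_usd(token_counts: dict) -> float:
    """Sum estimated cost across token categories; unknown keys are rejected."""
    unknown = set(token_counts) - set(PRICE_PER_M_TOKENS)
    if unknown:
        raise ValueError(f"unknown token categories: {sorted(unknown)}")
    return sum(PRICE_PER_M_TOKENS[k] * n / 1_000_000
               for k, n in token_counts.items())
```

For example, 100k text-in, 50k image-in, and 10k image-out tokens come to $0.50 + $0.50 + $0.30 = $1.30.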
Current Pricing
Related AI Models
