Example input

prompt: "A complete RPG item icon sheet, 100 different items arranged in a clean grid layout, pixel art style, highly detailed game UI icons, consistent lighting and perspective, white background. Items include: swords, shields, armor sets, bows, crossbows, magic staffs, wands, potions, scrolls, spellbooks, rings, amulets, helmets, keys, gems, runes, ores and fantasy materials. Each item is unique, colorful, readable at small size, 2D top-down game asset style, fantasy RPG theme, polished pixel art, sharp outlines, vibrant colors, professional game asset pack. Uniform spacing, square tiles, icon atlas layout."
quality: "high"
image_size: "square"
num_images: 1
output_format: "png"

GPT Image v2 API

gpt-image · by OpenAI

GPT Image 2 produces higher-fidelity images with stronger prompt understanding, improved compositional consistency, more physically accurate lighting, and enhanced fine-detail rendering.

Runtime (p50): 2m
Estimated price: From $0.053

Call the API

prediction.sh

curl -X POST \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "gpt-image-v2-text-to-image",
    "version": "0.0.1",
    "input": {
        "prompt": "A complete RPG item icon sheet, 100 different items arranged in a clean grid layout, pixel art style, highly detailed game UI icons, consistent lighting and perspective, white background.\n\nItems include: swords, shields, armor sets, bows, crossbows, magic staffs, wands, potions, scrolls, spellbooks, rings, amulets, helmets, keys, gems, runes, ores and fantasy materials.\n\nEach item is unique, colorful, readable at small size, 2D top-down game asset style, fantasy RPG theme, polished pixel art, sharp outlines, vibrant colors, professional game asset pack.\n\nUniform spacing, square tiles, icon atlas layout.",
        "quality": "high",
        "image_size": "square",
        "num_images": 1,
        "output_format": "png"
    },
    "webhook_url": ""
}' \
  https://api.eachlabs.ai/v1/prediction/
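
The same request can be assembled in Python. The following is a minimal sketch that only builds the request object, reusing the URL, header, model, version, and input fields from the curl call above. The helper name `build_prediction_request` and its default values are illustrative assumptions, not part of any each::labs client library, and the response format is not shown on this page, so the sketch does not parse one.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://api.eachlabs.ai/v1/prediction/"


def build_prediction_request(api_key, prompt, quality="high", image_size="square",
                             num_images=1, output_format="png", webhook_url=""):
    """Build (but do not send) the POST request shown in the curl example.

    Hypothetical helper: the name and defaults are assumptions for illustration.
    """
    payload = {
        "model": "gpt-image-v2-text-to-image",
        "version": "0.0.1",
        "input": {
            "prompt": prompt,
            "quality": quality,
            "image_size": image_size,
            "num_images": num_images,
            "output_format": output_format,
        },
        "webhook_url": webhook_url,
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Reading the API key from the `EACHLABS_API_KEY` environment variable, as the curl call does, keeps it out of source code.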

Documentation

GPT Image | v2 | Text to Image Overview

GPT Image | v2 | Text to Image, from OpenAI's gpt-image family, turns text prompts into high-fidelity, photorealistic images with precise text rendering. It targets marketing, design, and product visualization work, where image models often struggle with legible text and realistic detail. Its main differentiator is a quality-first design that renders legible text in dense paragraphs, multilingual layouts, and infographics, and produces brand-consistent product photography with accurate labels and logos. GPT Image | v2 | Text to Image is available through APIs such as each::labs and prioritizes fidelity over speed for professional output.
Capabilities
- State-of-the-art photorealism with accurate lighting, skin textures, materials, and environmental details.
- High-accuracy text rendering for dense paragraphs, small lettering, multilingual layouts, infographics, and UI mockups.
- Brand-consistent product photography, including precise logos, labels, color palettes, and packaging text.
- Precision image editing via natural language, maintaining context without artifacts.
- High-fidelity outputs in 4K resolution with 90-95% text accuracy.
- Style control: vivid hyper-real or natural rendering, with quality levels from low to high.
- Compositional consistency and character/style persistence across multiple images.
- Image generation integrated natively into the GPT architecture, 4x faster than GPT Image 1.
Use Cases for GPT Image | v2 | Text to Image

Marketing Teams: Create brand-consistent product photography. Prompt: "Photorealistic shot of wireless earbuds in premium packaging, logo 'each::labs' and specs list visible, high quality." Accurate label text makes the result usable in ads.

Designers: Generate UI mockups and infographics. Prompt: "Clean UI screenshot for eachlabs.ai dashboard, with headings 'GPT Image | v2 | Text to Image' and bullet points on features, natural style, 1024x1024." The model excels at legible, complex text layouts.

Developers: Prototype app visuals through the GPT Image | v2 | Text to Image API. Integrate it into tools for dynamic image generation, and use natural-language editing to iterate on photorealistic scenes.

Content Creators: Produce realistic screenshots or visuals. Prompt: "Hyper-real image of a coffee shop scene with signage 'OpenAI text-to-image powered by each::labs', vivid lighting." The photorealistic output suits videos and social media.
Tips and Tricks

For best results with GPT Image | v2 | Text to Image, craft prompts with specific details on lighting, materials, and text elements to leverage its photorealism. Use style modifiers like "vivid" for dramatic effects or "natural" for realism, and specify quality as "high" for fine details. Optimize workflows by iterating with variations: generate a base image, then edit via natural language. Include exact text for labels or signage to ensure pixel-perfect rendering.

Example prompts:
- "A photorealistic product shot of a blue smartphone on a marble table, with label reading 'each::labs AI Model' in clean sans-serif font, natural lighting."
- "Infographic showing AI benchmarks: GPT Image | v2 | Text to Image at 95% text accuracy, dense paragraphs, high fidelity, vivid style."
- "Brand-consistent packaging for coffee beans, accurate logo and ingredient list in small legible text, 1024x1792 portrait."

Combine these prompts with region-based descriptions for compositional control, which improves consistency across generations.
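
The prompt advice above (a specific subject, exact label text, lighting, and a style modifier) can be applied consistently with a small helper. This is a minimal sketch: the function `build_prompt` and its parameters are hypothetical, and the phrase ordering simply mirrors the example prompts in this section.

```python
def build_prompt(subject, label_text=None, font=None, lighting=None, style=None):
    """Join prompt details into one comma-separated sentence.

    Hypothetical helper; the ordering mirrors the example prompts above.
    """
    parts = [subject]
    if label_text:
        # Quote the exact text so the model is told what to render verbatim.
        text_part = f"with label reading '{label_text}'"
        if font:
            text_part += f" in {font} font"
        parts.append(text_part)
    if lighting:
        parts.append(f"{lighting} lighting")
    if style:
        parts.append(f"{style} style")
    return ", ".join(parts) + "."


prompt = build_prompt(
    "A photorealistic product shot of a blue smartphone on a marble table",
    label_text="each::labs AI Model",
    font="clean sans-serif",
    lighting="natural",
)
print(prompt)
```

With these arguments the helper reproduces the first example prompt in this section.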
Technical Specifications
- Resolution Support: Native up to 4K, with options like 256x256, 512x512, 1024x1024, 1792x1024, and 1024x1792 for flexible outputs.
- Aspect Ratios: Supports standard ratios including square, landscape (e.g., 1792x1024), and portrait (e.g., 1024x1792).
- Input/Output Formats: Text prompts as input; outputs base64-encoded images; supports image editing with base64 input images.
- Quality Levels: Enum options: low, medium, high, auto; styles include vivid (hyper-real) and natural.
- Processing Time: 4x faster than GPT Image 1, with low-latency inference via optimized APIs.
- Architecture: Integrated into the GPT-5 neural network for native image generation, emphasizing photorealism and text accuracy.
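
The specifications above suggest a few client-side checks. The sketch below validates a quality value against the listed enum, classifies a `WIDTHxHEIGHT` string as square, landscape, or portrait, and decodes a base64-encoded image into raw bytes. All three function names are hypothetical, and because the response field that carries the base64 string is not documented on this page, `decode_image` takes the string directly.

```python
import base64

# Quality enum as listed under "Quality Levels" above.
QUALITY_LEVELS = ("low", "medium", "high", "auto")


def validate_quality(quality):
    """Return the quality value unchanged, or raise ValueError if unlisted."""
    if quality not in QUALITY_LEVELS:
        raise ValueError(f"quality must be one of {QUALITY_LEVELS}, got {quality!r}")
    return quality


def orientation(size):
    """Classify a 'WIDTHxHEIGHT' string such as '1792x1024'."""
    width, height = (int(value) for value in size.lower().split("x"))
    if width == height:
        return "square"
    return "landscape" if width > height else "portrait"


def decode_image(b64_string):
    """Decode a base64-encoded image payload into raw bytes."""
    return base64.b64decode(b64_string)
```

For example, `orientation("1024x1792")` classifies the portrait size listed above, and the bytes returned by `decode_image` can be written to a file opened in binary mode.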
Things to Be Aware Of

GPT Image | v2 | Text to Image may underperform with non-Latin scripts such as Chinese or Arabic, where text rendering is less reliable despite improvements. A common mistake is a vague prompt that omits style or quality settings, which leads to inconsistent compositions. Highly complex scenes with many elements can occasionally show minor artifacts in fine details. Resource needs are low because the model runs through cloud APIs on each::labs, so no local GPU is required, though high-volume use benefits from auto-scaling. When generating a series of images, run test iterations to check character consistency.
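
Because text rendering is less reliable for non-Latin scripts, it can help to flag label text that contains such characters before submitting a prompt. The following is a minimal sketch using Python's standard `unicodedata` module. The function name is hypothetical, and testing whether a character's Unicode name contains "LATIN" is a rough heuristic, not a rule taken from the model's documentation.

```python
import unicodedata


def non_latin_letters(text):
    """Return the alphabetic characters in text that are not Latin-script letters.

    Heuristic: a letter counts as Latin when its Unicode name contains 'LATIN'.
    """
    return [
        ch for ch in text
        if ch.isalpha() and "LATIN" not in unicodedata.name(ch, "")
    ]


label = "Café 中文"
flagged = non_latin_letters(label)
if flagged:
    print(f"Label contains non-Latin letters: {flagged}")
```

An empty result means every letter in the label is Latin-script. A non-empty result is a cue to inspect the rendered text more closely.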
Key Considerations

Before using GPT Image | v2 | Text to Image, obtain an API key from a platform such as each::labs. The model suits work that requires high-fidelity photorealism and accurate text, such as commercial product shots or infographics, where it outperforms speed-focused alternatives. Its quality-first design may cost slightly more for premium outputs, although costs are 20% lower than predecessors in some setups. It is aimed at developers and creators who need precise control; check provider terms for commercial use rights. A clear, detailed text prompt is a prerequisite for good results.
Limitations
GPT Image | v2 | Text to Image prioritizes quality and may be slower than speed-optimized models for bulk tasks. Text rendering remains unreliable for CJK, Arabic, Hebrew, and some other scripts. Neither native video generation nor advanced spatial controls such as bounding boxes have been confirmed. Outputs are capped at the specified resolutions, and extreme aspect ratios may distort. Commercial use is allowed via APIs, subject to provider terms. The model cannot perfectly replicate proprietary styles without reference training.