flux-2-klein-9b-base-edit

FLUX-2

FLUX.2 [klein] 9B Base from Black Forest Labs supports precise image-to-image editing with natural-language instructions and hex-color-based control.

Avg Run Time: 10.000s

Model Slug: flux-2-klein-9b-base-edit

Playground

Your request will cost $0.002 per megapixel for output.
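As a quick sanity check on cost, a minimal sketch of the arithmetic (the $0.002-per-megapixel rate comes from the pricing note above; the resolution is just an example):

```python
# Estimate output cost at $0.002 per megapixel (rate from the pricing note above).
def estimate_cost(width: int, height: int, rate_per_megapixel: float = 0.002) -> float:
    megapixels = (width * height) / 1_000_000
    return megapixels * rate_per_megapixel

# Example: a 1024x1024 output is ~1.05 MP, so it costs roughly $0.0021.
print(f"${estimate_cost(1024, 1024):.4f}")
```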

API & SDK

Create a Prediction

Send a POST request to create a new prediction. This will return a prediction ID that you'll use to check the result. The request should include your model inputs and API key.
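A minimal sketch of the creation step in Python using the requests library; the endpoint URL, header name, and input field names below are assumptions for illustration and should be checked against the API reference:

```python
import requests

API_KEY = "YOUR_API_KEY"  # your Eachlabs API key

# Assumed endpoint and payload shape; verify against the API reference.
response = requests.post(
    "https://api.eachlabs.ai/v1/prediction",
    headers={"X-API-Key": API_KEY, "Content-Type": "application/json"},
    json={
        "model": "flux-2-klein-9b-base-edit",
        "input": {
            "image": "https://example.com/source.png",  # image to edit
            "prompt": "Change the jacket color to #1E90FF",
        },
    },
)
response.raise_for_status()
prediction_id = response.json()["predictionID"]  # field name may differ
print(prediction_id)
```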

Get Prediction Result

Poll the prediction endpoint with the prediction ID until the result is ready. The API is asynchronous, so you'll need to check repeatedly until you receive a success status.
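A matching polling sketch; again, the endpoint path and the status/output field names are assumptions to confirm against the API reference:

```python
import time
import requests

API_KEY = "YOUR_API_KEY"
prediction_id = "PREDICTION_ID_FROM_THE_CREATE_STEP"

# Assumed result endpoint and response fields; verify in the API reference.
while True:
    result = requests.get(
        f"https://api.eachlabs.ai/v1/prediction/{prediction_id}",
        headers={"X-API-Key": API_KEY},
    ).json()
    if result.get("status") == "success":
        print(result.get("output"))  # URL of the edited image
        break
    if result.get("status") == "error":
        raise RuntimeError(result)
    time.sleep(1)  # wait before checking again
```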

Readme

Table of Contents
Overview
Technical Specifications
Key Considerations
Tips & Tricks
Capabilities
What Can I Use It For?
Things to Be Aware Of
Limitations

Overview

FLUX.2 [klein] 9B Base is a compact image generation and editing model developed by Black Forest Labs, released in January 2026. It is a 9-billion parameter rectified flow transformer that unifies text-to-image generation, single-image editing, and multi-reference image editing capabilities in a single architecture. The model combines a 9B flow model with an 8B Qwen3 text embedder, positioning it as a flagship small model on the Pareto frontier for quality versus latency across multiple image generation and editing tasks. The 9B Base variant is an undistilled, full-capacity foundation model that preserves the complete training signal, making it ideal for fine-tuning, LoRA training, research, and custom pipelines where control and flexibility matter more than raw speed. Unlike distilled variants that use only 4 inference steps, the Base model employs 25-50 step sampling schedules, enabling maximum customization and output diversity.

The model represents a significant advancement in making high-quality image generation and editing accessible on consumer hardware. It delivers state-of-the-art quality with end-to-end inference while maintaining practical resource requirements for professional and research applications. The architecture supports natural-language instructions for precise image-to-image editing, allowing users to modify images with detailed prompts while maintaining coherence across multiple reference images. The 9B variants, including [klein] 9B Base, are released under the FLUX Non-Commercial License (NCL); the smaller 4B models are available under Apache 2.0.

Technical Specifications

  • Architecture: Rectified flow transformer with a unified generation and editing pipeline
  • Parameters: 9 billion (flow model) plus an 8-billion-parameter Qwen3 text embedder
  • Resolution: 1024x1024 used for benchmarks; capable of high-resolution image generation
  • Input/output formats: Text prompts for generation, text instructions for image editing, and multi-reference image inputs
  • Inference steps: 25-50 (configurable range) for the Base model
  • VRAM requirements: 24GB or more recommended for optimal performance on consumer GPUs
  • Quantization support: FP8 provides up to 1.6x faster inference with up to 40% less VRAM; NVFP4 offers up to 2.7x faster inference with up to 55% less VRAM
  • Inference latency: Approximately 0.5 to 2 seconds end-to-end on high-end consumer hardware (RTX 5090); several seconds on standard consumer GPUs
  • Performance: Matches or exceeds Qwen-based image models at a fraction of the latency and VRAM; outperforms Z-Image while supporting unified text-to-image and multi-reference editing
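As a rough, back-of-the-envelope illustration of what the quantization figures above imply for the 24GB baseline (actual savings depend on resolution, batch size, and runtime):

```python
# Estimated VRAM after applying the "up to" reductions quoted above to a 24GB baseline.
baseline_vram_gb = 24.0

for name, reduction in [("FP8 (up to 40% less)", 0.40), ("NVFP4 (up to 55% less)", 0.55)]:
    print(f"{name}: ~{baseline_vram_gb * (1 - reduction):.1f} GB")
# FP8   -> ~14.4 GB
# NVFP4 -> ~10.8 GB
```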

Key Considerations

  • The 9B Base model prioritizes maximum flexibility and output diversity over speed compared to distilled variants, making it better suited for applications requiring fine-tuning and custom control rather than real-time interactive use.
  • VRAM requirements of 24GB or more mean this variant is optimized for high-end consumer GPUs like RTX 4090 or professional hardware; users with lower VRAM should consider the 4B variants or distilled models.
  • The model demonstrates high resilience against violative inputs in complex generation and editing tasks, with safety fine-tuning and third-party evaluation completed prior to release.
  • Multi-reference image editing with the 9B Base model produces more coherent results across multiple reference images compared to smaller variants, though careful prompting and sometimes multiple renders may be necessary for optimal results.
  • The 25-50 step sampling schedule provides a configurable range, allowing users to balance between inference speed and output quality based on their specific requirements.
  • Prompt engineering for image editing should include detailed natural-language instructions specifying desired modifications; the model responds well to specific color descriptions and detailed editing directives.
  • The model requires correct text encoder configuration to avoid shape mismatch errors during inference; using the appropriate Qwen3 text embedder is critical for proper operation.

Tips & Tricks

  • For optimal quality in image editing tasks, use the full 50 steps rather than dropping toward the lower end of the 25-50 range, as higher step counts provide maximum detail and fidelity in edited outputs.
  • When performing multi-reference image editing, structure prompts to clearly specify which elements from reference images should be incorporated and how they should be blended with the base image.
  • Leverage the model's flexibility for fine-tuning by using LoRA training on domain-specific datasets to customize the model for particular artistic styles or specialized applications.
  • For color-based control in image editing, use specific hex color codes in prompts rather than vague color descriptions to achieve more precise results (see the sketch after this list).
  • Experiment with different step counts within the 25-50 range to find the optimal balance for your specific use case; higher steps generally produce more refined details but increase inference time.
  • When using the model for research or custom pipelines, take advantage of the undistilled architecture by accessing intermediate representations and model internals for advanced applications.
  • Structure multi-image editing prompts hierarchically, specifying primary edits first, then secondary modifications, to improve coherence across reference images.
  • Use the model's unified architecture to perform iterative editing workflows, generating base images and then refining them through successive editing operations within a single pipeline.
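Building on the hex-color and step-count tips above, a minimal, hypothetical input sketch; the field names (image, prompt, num_inference_steps) are assumptions and may differ from the model's actual input schema:

```python
# Hypothetical input payload illustrating hex-color control and a full 50-step schedule.
inputs = {
    "image": "https://example.com/product-photo.png",  # image to edit
    "prompt": "Recolor the sneakers to #FF5733 and keep the background unchanged",
    "num_inference_steps": 50,  # upper end of the 25-50 range for maximum detail
}
```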

Capabilities

  • Unified text-to-image generation and image editing in a single model architecture, eliminating the need for separate specialized models.
  • High-quality photorealistic image generation with exceptional output diversity, particularly in base model variants that preserve complete training signal.
  • Precise image-to-image editing with natural-language instructions, allowing detailed modifications to existing images through descriptive prompts.
  • Multi-reference image generation and editing, enabling the model to incorporate elements from multiple reference images into a single coherent output.
  • Fine-tuning and LoRA training capabilities due to the undistilled, full-capacity foundation model architecture.
  • Flexible inference configuration with adjustable step counts (25-50 range) to balance quality and speed based on application requirements.
  • Support for quantized variants (FP8 and NVFP4) that maintain quality while reducing computational requirements and VRAM usage.
  • State-of-the-art quality on the Pareto frontier for quality versus latency and VRAM efficiency compared to other compact image models.
  • Robust safety features with demonstrated high resilience against violative inputs in complex generation and editing tasks.

What Can I Use It For?

  • Professional image editing workflows where users need precise control over modifications and the ability to fine-tune the model for specific artistic styles or requirements.
  • Research applications in computer vision and generative modeling, leveraging the undistilled architecture for access to complete training signals and model internals.
  • Custom pipeline development for specialized image generation and editing tasks, with the flexibility to integrate the model into complex workflows.
  • Fine-tuning for domain-specific applications such as architectural visualization, product design mockups, or specialized artistic styles documented in technical research.
  • Multi-reference image composition projects where multiple source images need to be intelligently combined into coherent outputs.
  • Iterative creative workflows where users generate base images and progressively refine them through successive editing operations.
  • High-quality image generation for content creation where output diversity and photorealistic quality are prioritized over generation speed.
  • Educational and experimental projects exploring advanced image generation techniques and model customization.

Things to Be Aware Of

  • The 9B Base model is more VRAM-intensive than smaller variants, requiring 24GB or more of GPU memory for optimal performance; users with limited VRAM should consider the 4B variants or distilled models instead.
  • Multi-reference image editing can be inconsistent and may require multiple renders and careful prompt engineering to achieve desired coherence across reference images, particularly when attempting complex compositions.
  • The model uses 25-50 inference steps by default, resulting in longer generation times (several seconds on standard consumer hardware) compared to distilled variants that use only 4 steps; this makes it less suitable for real-time interactive applications.
  • Correct text encoder configuration is critical; using the wrong Qwen3 text embedder variant can result in shape mismatch errors that prevent inference.
  • The 9B Base model is released under the FLUX Non-Commercial License (NCL), restricting commercial use; users requiring commercial licensing should verify terms or consider alternative models.
  • Output quality and consistency can vary based on prompt specificity; vague or poorly structured prompts may result in less coherent editing results, particularly in multi-reference scenarios.
  • The model's flexibility and undistilled architecture make it more suitable for research and custom applications than for straightforward production use; users prioritizing speed and simplicity may benefit from distilled variants.
  • Fine-tuning and LoRA training require significant computational resources and expertise; these advanced capabilities are best suited for users with technical backgrounds and access to adequate hardware.

Limitations

  • The 9B Base model's 25-50 step inference requirement results in significantly longer generation times compared to distilled variants, making it less practical for real-time or interactive applications where sub-second latency is critical.
  • High VRAM requirements (24GB or more) limit accessibility to users with high-end consumer GPUs or professional hardware; this restricts deployment options compared to more efficient models.
  • Multi-reference image editing consistency remains a challenge, with the model sometimes producing incoherent results across multiple reference images even with careful prompting, requiring iterative refinement and multiple render attempts.