inference · 11.9s

Example input

prompt: "A photo of a scientist img receiving the Nobel Prize"
num_steps: 50
style_name: "Photographic (Default)"
input_image: "https://cdn.eachlabs.ai/ipfs/KFkSv1oX0v3e7GnOrmzULGqCA8222pC6FI2EKcfuCZWxvHN3/newton_0.jpg"
num_outputs: 1
guidance_scale: 5
negative_prompt: "nsfw, lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry"
style_strength_ratio: 20

Photomaker - Image Generation

photomaker · by Tencent

Create photos, paintings and avatars for anyone in any style within seconds.

API reference

Runtime (p50): 1m
Estimated price: $0.00108 / sec
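
As a rough guide to what these two figures imply per run, the arithmetic can be sketched as below. It assumes billing is strictly linear in runtime at the listed rate, which this page does not state explicitly; `estimate_cost` is a hypothetical helper, not part of any Eachlabs SDK.

```python
# Figures copied from this page; linear per-second billing is an assumption.
PRICE_PER_SECOND_USD = 0.00108
P50_RUNTIME_SECONDS = 60          # "Runtime (p50): 1m"
EXAMPLE_INFERENCE_SECONDS = 11.9  # "inference · 11.9s" from the example above


def estimate_cost(runtime_seconds: float,
                  price_per_second: float = PRICE_PER_SECOND_USD) -> float:
    """Return the estimated cost in USD for a run of the given duration."""
    if runtime_seconds < 0:
        raise ValueError("runtime_seconds must be non-negative")
    return runtime_seconds * price_per_second


if __name__ == "__main__":
    print(f"p50 run:     ${estimate_cost(P50_RUNTIME_SECONDS):.4f}")
    print(f"example run: ${estimate_cost(EXAMPLE_INFERENCE_SECONDS):.4f}")
```

Under that assumption, a p50 run costs about $0.065 and the 11.9 s example run about $0.013.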

Call the API

prediction.sh

curl -X POST \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "photomaker",
    "version": "0.0.1",
    "input": {
        "prompt": "A photo of a scientist img receiving the Nobel Prize",
        "num_steps": 50,
        "style_name": "Photographic (Default)",
        "input_image": "https://cdn.eachlabs.ai/ipfs/KFkSv1oX0v3e7GnOrmzULGqCA8222pC6FI2EKcfuCZWxvHN3/newton_0.jpg",
        "num_outputs": 1,
        "guidance_scale": 5,
        "negative_prompt": "nsfw, lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry",
        "style_strength_ratio": 20
    },
    "webhook_url": ""
}' \
  https://api.eachlabs.ai/v1/prediction/
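
The same request can be sketched in Python using only the standard library. The URL, headers, and payload mirror the curl command above; `build_request` and `EXAMPLE_PAYLOAD` are illustrative names, and because the response schema is not documented on this page, the sketch prints the raw response body rather than parsing specific fields.

```python
import json
import os
import urllib.request

API_URL = "https://api.eachlabs.ai/v1/prediction/"

# Payload copied from the curl example above.
EXAMPLE_PAYLOAD = {
    "model": "photomaker",
    "version": "0.0.1",
    "input": {
        "prompt": "A photo of a scientist img receiving the Nobel Prize",
        "num_steps": 50,
        "style_name": "Photographic (Default)",
        "input_image": "https://cdn.eachlabs.ai/ipfs/KFkSv1oX0v3e7GnOrmzULGqCA8222pC6FI2EKcfuCZWxvHN3/newton_0.jpg",
        "num_outputs": 1,
        "guidance_scale": 5,
        "negative_prompt": (
            "nsfw, lowres, bad anatomy, bad hands, text, error, missing fingers, "
            "extra digit, fewer digits, cropped, worst quality, low quality, "
            "normal quality, jpeg artifacts, signature, watermark, username, blurry"
        ),
        "style_strength_ratio": 20,
    },
    "webhook_url": "",
}


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    api_key = os.environ.get("EACHLABS_API_KEY")
    if not api_key:
        print("Set EACHLABS_API_KEY to send the request.")
    else:
        with urllib.request.urlopen(build_request(api_key, EXAMPLE_PAYLOAD)) as response:
            # Response fields are not documented here, so show the raw body.
            print(response.read().decode("utf-8"))
```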

Documentation (8 sections)

Overview
photomaker — Text-to-Image AI Model

photomaker, developed by Tencent, generates photos, paintings, and avatars of a specific person in a chosen style. It is a text-to-image model conditioned on reference photos: you supply one or more images of the subject together with a text prompt, and the model renders that person in the scene and style described while keeping their identity recognizable. The example on this page completed inference in 11.9 s, and the listed p50 runtime is about one minute. For users searching for Tencent text-to-image solutions, photomaker offers consistent style adherence from simple prompts, which makes it a practical choice for AI image generator API integrations.
Capabilities
Portrait Enhancement:
- Automatically improves skin texture, lighting, and overall clarity.
Style Transfer:
- Transforms photos into unique artistic styles such as Cinematic, Disney Character, Digital Art, Photographic, Fantasy Art, Neonpunk, Enhance, Comic Book, Lowpoly, and Line Art.
Creative Transformations:
- Modify images for creative use in marketing, social media, or personal projects.
Background Adjustments:
- Seamlessly edit or replace backgrounds while preserving subject integrity.
Additional Features:
- Combine up to four input images for enhanced results.
- Generate images using textual prompts with adjustable guidance.
- Apply prebuilt or custom styles to create unique visuals.
- Produce multiple output images in a single execution.
- Fine-tune image details with parameters like style_strength_ratio and guidance_scale.
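
The tunable parameters named above can be gathered into one `input` object. This is a minimal sketch: the field names and defaults come from the example request on this page, `build_input` is a hypothetical helper, and the check on `num_outputs` is an assumption because the page does not document accepted ranges. Only "Photographic (Default)" appears on this page as an exact `style_name` value, so the exact strings for the other styles are not confirmed here.

```python
def build_input(prompt: str,
                input_image: str,
                *,
                style_name: str = "Photographic (Default)",
                num_outputs: int = 1,
                guidance_scale: float = 5,
                style_strength_ratio: float = 20,
                num_steps: int = 50,
                negative_prompt: str = "") -> dict:
    """Assemble the `input` object for a photomaker request.

    Defaults mirror the example request; valid ranges are not documented
    on this page, so only an obviously invalid output count is rejected.
    """
    if num_outputs < 1:
        raise ValueError("num_outputs must be at least 1")
    return {
        "prompt": prompt,
        "num_steps": num_steps,
        "style_name": style_name,
        "input_image": input_image,
        "num_outputs": num_outputs,
        "guidance_scale": guidance_scale,
        "negative_prompt": negative_prompt,
        "style_strength_ratio": style_strength_ratio,
    }


if __name__ == "__main__":
    # The image URL is a placeholder, not a real asset.
    print(build_input("A portrait of a woman img at a gala",
                      "https://example.com/selfie.jpg",
                      num_outputs=2))
```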
Use cases
Use Cases for photomaker

For content creators and designers, photomaker excels at avatar generation: pair a reference selfie with a prompt such as "professional headshot of a young entrepreneur img in cyberpunk style, neon lights, detailed face" and receive a stylized, identity-consistent portrait suitable for social media profiles or NFTs.

Marketers building e-commerce visuals use photomaker to create product renders; describe "laptop on a minimalist desk with tropical sunset lighting, photorealistic" to produce high-res assets that enhance listings without photoshoots, capitalizing on its ultra-high quality for realistic textures and lighting.

Developers integrating Tencent text-to-image into apps can use photomaker as a backend for personalized graphics such as gaming avatars or ad customizers. Each API call takes a prompt and a reference image and can return several variations through num_outputs; because batch processing may not be supported, plan on one request per subject.

Artists exploring styles benefit from photomaker's diversity: generate variations of "oil painting of a mountain landscape in Van Gogh style" to inspire workflows, with the model's strength in artistic fidelity streamlining concept iteration for professional portfolios.
Tips & tricks
How to Use photomaker on Eachlabs

Access photomaker through the Eachlabs Playground for quick testing, the API for production photomaker API calls, or the SDK for custom integrations. Provide a descriptive text prompt that contains the trigger word img and a reference image of the subject, then adjust style_name, style_strength_ratio, guidance_scale, and num_outputs as needed. Outputs are returned as PNG images.
Technical spec
What Sets photomaker Apart

photomaker stands out among text-to-image AI models by conditioning generation on reference photos of a person as well as on text, so the subject stays recognizable across very different renders.
- Text plus image conditioning: Combines a prompt with up to four input images, allowing avatar and painting generation that preserves identity across varied artistic styles. This lets developers build photomaker API apps for custom visuals.
- Fast turnaround: The example on this page finished inference in 11.9 s and the listed p50 runtime is 1 minute, which is workable for production workflows.
- Versatile style support: Covers output from photorealistic photos to stylized paintings through the style_name presets, and num_outputs yields several distinct results from an identical prompt for creators who need variety.
Things to be aware of
Portrait Glow-Up:
- Upload a casual portrait and enhance it to look studio-quality.
Stylized Scenes:
- Turn a photo of your cityscape into an oil painting or a cyberpunk-style visual.
Background Swaps:
- Replace a cluttered background with a minimalistic or abstract one.
Vintage Effects with Photomaker:
- Apply sepia tones and grainy textures to give images a nostalgic look.
More Ideas:
- Use prompts like "sunset over a lake in a painterly style" with different guidance_scale values to see creative variations.
- Combine an input image with a textual prompt for style transfer (e.g., turn a photo into an oil painting).
- Experiment with multiple input images to create unique blends.
- Try disabling the safety checker to explore unrestricted outputs.
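
The suggestion to compare different guidance_scale values can be sketched as a small sweep that produces one `input` object per value. The base fields follow the example request on this page; the image URL is a placeholder, `guidance_sweep` is a hypothetical helper, and the values 3, 5, and 8 are arbitrary illustrations, since the accepted range is not documented here.

```python
import copy

# Base input modeled on the example request; the URL is a placeholder.
BASE_INPUT = {
    "prompt": "A photo of a scientist img receiving the Nobel Prize",
    "num_steps": 50,
    "style_name": "Photographic (Default)",
    "input_image": "https://example.com/portrait.jpg",
    "num_outputs": 1,
    "guidance_scale": 5,
    "style_strength_ratio": 20,
}


def guidance_sweep(base_input: dict, scales: list) -> list:
    """Return one copy of base_input per guidance_scale value, leaving the base unchanged."""
    variants = []
    for scale in scales:
        variant = copy.deepcopy(base_input)
        variant["guidance_scale"] = scale
        variants.append(variant)
    return variants


if __name__ == "__main__":
    for variant in guidance_sweep(BASE_INPUT, [3, 5, 8]):
        print(variant["guidance_scale"], variant["prompt"])
```

Each variant would be sent as its own request, so the outputs can be compared side by side.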
Key considerations
Ethical Usage:
Avoid using the model for deceptive purposes or creating harmful content.
Processing Limitations:
Batch processing may not be supported. Check if manual input is required for each image.
Privacy Concerns:
Uploaded images might be stored temporarily. Refer to the privacy policy before using sensitive content.
Complete any necessary preprocessing, such as resizing or normalizing images, before sending them to Photomaker - Image Generation.
Ensure your system meets the minimum requirements for running Photomaker - Image Generation efficiently.
Verify that input files use a supported format (.jpg, .jpeg, .png, .webp) and that inputs are structured correctly to avoid errors during execution.
The prompt must contain the word 'img', and it must appear exactly once, as in the example prompt "A photo of a scientist img receiving the Nobel Prize".
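
The prompt and file-format rules above can be checked on the client before a request is sent. This is a minimal sketch: `check_input` is a hypothetical helper, the match on 'img' is case-sensitive by assumption, and the format check looks only at the extension in the URL path, not at the file contents.

```python
import re
from pathlib import PurePosixPath
from urllib.parse import urlparse

SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}
TRIGGER_WORD = "img"


def count_trigger(prompt: str) -> int:
    """Count standalone occurrences of the trigger word in the prompt."""
    return len(re.findall(rf"\b{re.escape(TRIGGER_WORD)}\b", prompt))


def check_input(prompt: str, image_url: str) -> list:
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    occurrences = count_trigger(prompt)
    if occurrences != 1:
        problems.append(
            f"prompt must contain '{TRIGGER_WORD}' exactly once (found {occurrences})"
        )
    extension = PurePosixPath(urlparse(image_url).path).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported image extension: {extension or '(none)'}")
    return problems


if __name__ == "__main__":
    print(check_input("A photo of a scientist img receiving the Nobel Prize",
                      "https://example.com/newton_0.jpg"))
```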
Limitations
Detail Loss:
- Overly stylized outputs might lose finer details, especially in complex backgrounds.
Restricted Outputs:
- Current model versions may not support 3D transformations or animations.
Maximum input image size: 1024x1024 pixels (larger images may need to be resized).
Known limitations include slight inaccuracies in style blending for highly complex input prompts.
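
For the 1024x1024 input limit, the target dimensions for a larger image can be computed as follows. This sketch covers only the arithmetic, preserving aspect ratio and never upscaling; `fit_within` is a hypothetical helper, and the actual pixel resizing would be done with an imaging library of your choice.

```python
MAX_SIDE = 1024  # maximum input size stated above (1024x1024 pixels)


def fit_within(width: int, height: int, max_side: int = MAX_SIDE) -> tuple:
    """Return (width, height) scaled to fit within max_side x max_side.

    The aspect ratio is preserved and images already within the limit
    are returned unchanged.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if width <= max_side and height <= max_side:
        return width, height
    if width >= height:
        return max_side, max(1, round(height * max_side / width))
    return max(1, round(width * max_side / height)), max_side


if __name__ == "__main__":
    print(fit_within(2048, 1536))
    print(fit_within(800, 600))
```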

Output Format: PNG