Eachlabs | AI Workflows for app builders
PHOTOMAKER

Create photos, paintings and avatars for anyone in any style within seconds.

Avg Run Time: 73.000s

Model Slug: photomaker

The total cost depends on how long the model runs. It costs $0.001073 per second. Based on an average runtime of 73 seconds, each run costs about $0.0783. With a $1 budget, you can run the model around 12 times.

API & SDK

Create a Prediction

Send a POST request to create a new prediction. This will return a prediction ID that you'll use to check the result. The request should include your model inputs and API key.
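A minimal sketch of building such a request in Python. The endpoint URL, the header name, and the JSON body shape below are placeholders assumed for illustration, not values confirmed by this page; take the real ones from the Eachlabs API reference.

```python
import json
import urllib.request

# Placeholder values (assumptions): replace with those from the API reference.
API_URL = "https://api.example.com/v1/prediction/"  # hypothetical endpoint
API_KEY_HEADER = "X-API-Key"                        # assumed header name


def build_create_request(api_key: str, model_inputs: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request that would create a prediction."""
    # The body layout is an assumption; only the model slug comes from this page.
    body = {"model": "photomaker", "input": model_inputs}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={API_KEY_HEADER: api_key, "Content-Type": "application/json"},
        method="POST",
    )


request = build_create_request(
    "YOUR_API_KEY",
    {"prompt": "a portrait photo of a woman img"},
)
# Sending it would be urllib.request.urlopen(request); the JSON response
# is expected to carry the prediction ID used in the next step.
```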

Get Prediction Result

Poll the prediction endpoint with the prediction ID until the result is ready. The result is not returned by the create call itself, so you'll need to check repeatedly until you receive a success status.
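The polling loop can be sketched as follows. The function that actually fetches the prediction is passed in as a parameter, so the loop does not depend on a specific endpoint. The "status" field name and the "error" and "failed" values are assumptions; only the existence of a success status is stated on this page.

```python
import time
from typing import Any, Callable, Dict


def wait_for_prediction(
    get_prediction: Callable[[str], Dict[str, Any]],
    prediction_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call get_prediction(prediction_id) until it reports success or failure."""
    waited = 0.0
    while True:
        prediction = get_prediction(prediction_id)
        status = prediction.get("status")  # assumed field name
        if status == "success":
            return prediction
        if status in ("error", "failed"):  # assumed failure values
            raise RuntimeError(f"prediction {prediction_id} ended with status {status!r}")
        if waited >= timeout_s:
            raise TimeoutError(f"prediction {prediction_id} not ready after {timeout_s}s")
        sleep(interval_s)
        waited += interval_s
```

With an average run time of 73 seconds, a timeout of a few minutes and an interval of a couple of seconds is a reasonable starting point; both are illustrative defaults.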

Readme

Table of Contents
Overview
Technical Specifications
Key Considerations
Tips & Tricks
Capabilities
What Can I Use It For?
Things to Be Aware Of
Limitations

Overview

photomaker — Text-to-Image AI Model

photomaker, developed by Tencent's ARC Lab, generates photos, paintings, and avatars of a specific person in a wide range of styles. You supply one or more reference images of the subject plus a text prompt, and the model renders that person in the requested scene or style while preserving their identity. Runs on Eachlabs take about 73 seconds on average, and the model is available through the Playground and the API for integration into image-generation workflows.

Technical Specifications

What Sets photomaker Apart

photomaker differs from general-purpose text-to-image models in that it conditions generation on reference photos of a person as well as on a text prompt. It accepts up to four input images (at most 1024x1024 pixels each), returns PNG output, and takes about 73 seconds per run on average.

  • Image-plus-text conditioning: Combines a text prompt with reference images, so avatars and paintings keep the subject's identity across different artistic renders.
  • Adjustable style and prompt adherence: Parameters such as style_strength_ratio and guidance_scale control how strongly the chosen style and the prompt shape the output.
  • Versatile style support: Handles output ranging from photorealistic photos to stylized paintings, and can produce several output images from the same prompt in a single execution.

Key Considerations

Ethical Usage:

Avoid using the model for deceptive purposes or creating harmful content.

Processing Limitations:

Batch processing may not be supported. Check if manual input is required for each image.

Privacy Concerns:

Uploaded images might be stored temporarily. Refer to the privacy policy before using sensitive content.

Complete any necessary preprocessing, such as resizing images, before using the Photomaker - Image Generation model.

Ensure your system meets the minimum requirements for running the Photomaker - Image Generation model efficiently.

Verify that input files use a supported format (.jpg, .jpeg, .png, .webp) and are structured correctly to avoid errors during execution.

The prompt must include the word 'img', and it must appear exactly once.
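The two checks above (supported file format, and 'img' appearing exactly once in the prompt) can be run before a request is submitted. This is a sketch: the helper names are made up for illustration, and matching 'img' as a whole, case-sensitive word is an assumption about how the rule is applied.

```python
import re
from pathlib import PurePosixPath
from typing import List
from urllib.parse import urlparse

ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


def has_supported_extension(path_or_url: str) -> bool:
    """Check the file extension of a local path or URL against the allowed set."""
    path = urlparse(path_or_url).path
    return PurePosixPath(path).suffix.lower() in ALLOWED_EXTENSIONS


def count_trigger_word(prompt: str) -> int:
    """Count standalone occurrences of 'img' (so 'image' does not count)."""
    return len(re.findall(r"\bimg\b", prompt))


def validate_inputs(prompt: str, image_refs: List[str]) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    occurrences = count_trigger_word(prompt)
    if occurrences != 1:
        problems.append(f"prompt must contain 'img' exactly once, found {occurrences}")
    for ref in image_refs:
        if not has_supported_extension(ref):
            problems.append(f"unsupported image format: {ref}")
    return problems
```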

Tips & Tricks

How to Use photomaker on Eachlabs

Access photomaker through Eachlabs' Playground for quick testing, the API for production use, or the SDK for custom integrations. Provide a descriptive text prompt that contains the word 'img' exactly once, add up to four reference images of the subject for identity guidance, and choose a style. Output is returned as PNG, and an average run takes about 73 seconds.

---

Capabilities

Portrait Enhancement:

  • Automatically improves skin texture, lighting, and overall clarity.

Style Transfer:

  • Transforms photos into unique artistic styles such as Cinematic, Disney Character, Digital Art, Photographic, Fantasy Art, Neonpunk, Enhance, Comic Book, Lowpoly, and Line Art.

Creative Transformations:

  • Modify images for creative use in marketing, social media, or personal projects.

Background Adjustments:

  • Seamlessly edit or replace backgrounds while preserving subject integrity.

Generation Options:

  • Combine up to four input images for enhanced results.
  • Generate images using textual prompts with adjustable guidance.
  • Apply prebuilt or custom styles to create unique visuals.
  • Produce multiple output images in a single execution.
  • Fine-tune image details with parameters like style_strength_ratio and guidance_scale.
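As an illustration of how these options might be assembled into one set of model inputs, here is a small sketch. Only style_strength_ratio and guidance_scale are parameter names taken from this page; the other keys (prompt, style_name, num_outputs, input_image, input_image2, and so on) and all default values are assumptions, so check them against the model's input schema.

```python
from typing import Any, Dict, List

MAX_INPUT_IMAGES = 4  # "Combine up to four input images"


def build_inputs(
    prompt: str,
    image_urls: List[str],
    style_name: str = "Cinematic",      # one of the styles listed above
    style_strength_ratio: float = 20,   # illustrative default
    guidance_scale: float = 5.0,        # illustrative default
    num_outputs: int = 1,
) -> Dict[str, Any]:
    """Assemble a model-input dictionary from one to four reference images."""
    if not 1 <= len(image_urls) <= MAX_INPUT_IMAGES:
        raise ValueError(f"expected 1 to {MAX_INPUT_IMAGES} images, got {len(image_urls)}")
    inputs: Dict[str, Any] = {
        "prompt": prompt,
        "style_name": style_name,
        "style_strength_ratio": style_strength_ratio,
        "guidance_scale": guidance_scale,
        "num_outputs": num_outputs,
    }
    # Assumed key naming: input_image, input_image2, input_image3, input_image4.
    for index, url in enumerate(image_urls, start=1):
        key = "input_image" if index == 1 else f"input_image{index}"
        inputs[key] = url
    return inputs
```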

What Can I Use It For?

Use Cases for photomaker

For content creators and designers, photomaker is well suited to avatar generation: pair a reference selfie with a text prompt like "professional headshot of a young entrepreneur img in cyberpunk style, neon lights, detailed face" and receive a stylized yet identity-consistent portrait ready for social media or NFTs. Note that the prompt includes the required word 'img' exactly once.

Marketers building e-commerce visuals use photomaker to create product renders; describe "laptop on a minimalist desk with tropical sunset lighting, photorealistic" to produce high-res assets that enhance listings without photoshoots, capitalizing on its ultra-high quality for realistic textures and lighting.

Developers integrating photomaker into apps can use it as a backend for personalized graphics such as gaming avatars or ad customizers. A single execution can return multiple output images, but batch processing across separate inputs may not be supported, so plan on one request per set of reference images.

Artists exploring styles benefit from photomaker's diversity: generate variations of "oil painting of a mountain landscape in Van Gogh style" to inspire workflows, with the model's strength in artistic fidelity streamlining concept iteration for professional portfolios.

Things to Be Aware Of

Portrait Glow-Up:

  • Upload a casual portrait and enhance it to look studio-quality.

Stylized Scenes:

  • Turn a photo of your cityscape into an oil painting or a cyberpunk-style visual.

Background Swaps:

  • Replace a cluttered background with a minimalistic or abstract one.

Vintage Effects with Photomaker:

  • Apply sepia tones and grainy textures to give images a nostalgic look.

Other Experiments:

  • Use prompts like "sunset over a lake in a painterly style" with different guidance_scale values to see creative variations.
  • Combine an input image with a textual prompt for style transfer (e.g., turn a photo into an oil painting).
  • Experiment with multiple input images to create unique blends.
  • Try disabling the safety checker to explore unrestricted outputs.
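To compare different guidance_scale values side by side, one option is to generate a set of inputs that differ only in that parameter and submit each as its own run. A minimal sketch, assuming the inputs are held in a plain dictionary; the specific values are arbitrary examples, and the 'img' word is added to the prompt per the rule in Key Considerations.

```python
from typing import Any, Dict, Iterable, List


def guidance_sweep(base_inputs: Dict[str, Any], scales: Iterable[float]) -> List[Dict[str, Any]]:
    """Return one copy of base_inputs per guidance_scale value."""
    return [{**base_inputs, "guidance_scale": scale} for scale in scales]


base = {"prompt": "a man img, sunset over a lake in a painterly style"}
variants = guidance_sweep(base, [3.0, 5.0, 8.0])
# Each variant is a separate run, so each is billed separately.
```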

Limitations

Detail Loss:

  • Overly stylized outputs might lose finer details, especially in complex backgrounds.

Restricted Outputs:

  • Current model versions may not support 3D transformations or animations.

Maximum input image size: 1024x1024 pixels (larger images may need to be resized).
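When a source image exceeds this limit, the target size can be computed so that the aspect ratio is preserved. This sketch only does the arithmetic; the actual resizing would be done with an image library of your choice. It assumes the limit means that neither side may exceed 1024 pixels.

```python
from typing import Tuple

MAX_SIDE = 1024  # from the stated 1024x1024 limit


def fit_within(width: int, height: int, max_side: int = MAX_SIDE) -> Tuple[int, int]:
    """Scale (width, height) down, never up, so both sides fit within max_side."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    scale = min(1.0, max_side / width, max_side / height)
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(4000, 3000))  # a 4:3 photo is reduced to fit the limit
print(fit_within(800, 600))    # already small enough, so unchanged
```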

Known limitations include slight inaccuracies in style blending for highly complex input prompts.


Output Format: PNG

Pricing

Pricing Detail

This model runs at a cost of $0.001073 per second.

The average execution time is 73 seconds, but this may vary depending on your input data.

The average cost per run is $0.078293

Pricing Type: Execution Time

Cost Per Second means the total cost is calculated based on how long the model runs. Instead of paying a fixed fee per run, you are charged for every second the model is actively processing. This pricing method provides flexibility, especially for models with variable execution times, because you only pay for the actual time used.
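The per-run arithmetic can be reproduced with a few lines of Python. The rate and average run time are the figures quoted on this page; the function names are illustrative. With the rounded rate, a 73-second run comes to roughly $0.0783, and a $1 budget covers 12 such runs.

```python
import math

COST_PER_SECOND = 0.001073  # USD per second, as quoted above
AVG_RUN_SECONDS = 73        # average execution time, as quoted above


def run_cost(seconds: float, rate: float = COST_PER_SECOND) -> float:
    """Cost in USD of a single run lasting the given number of seconds."""
    return seconds * rate


def runs_within_budget(budget: float, seconds: float = AVG_RUN_SECONDS,
                       rate: float = COST_PER_SECOND) -> int:
    """Number of whole runs of the given length that fit in the budget."""
    return math.floor(budget / run_cost(seconds, rate))


print(f"average run: ${run_cost(AVG_RUN_SECONDS):.4f}")
print(f"runs for $1: {runs_within_budget(1.0)}")
```

Because billing follows actual execution time, a run that takes longer than the average costs proportionally more.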