
INSTANT-ID
Instant ID generates realistic images of real people in seconds.
Avg Run Time: 32.000s
Model Slug: instant-id
Playground
Input
Provide an image by URL or file upload. Accepted types: image/jpeg, image/png, image/jpg, image/webp (max 50MB).
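The upload limits listed above can be checked on the client before sending a file. The following is a minimal sketch, not the platform's own validation: the function name check_upload is hypothetical, the extension-to-type mapping is an assumption, and "50MB" is assumed to mean 50,000,000 bytes.

```python
from pathlib import Path

# Extension-to-MIME mapping assumed from the accepted types listed above.
ALLOWED_EXTENSIONS = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".webp": "image/webp",
}
MAX_BYTES = 50 * 1000 * 1000  # assumption: "50MB" means decimal megabytes


def check_upload(filename: str, size_bytes: int) -> str:
    """Return the MIME type for an acceptable upload, or raise ValueError."""
    suffix = Path(filename).suffix.lower()
    if suffix not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported file type: {suffix or '(none)'}")
    if size_bytes > MAX_BYTES:
        raise ValueError(f"file is {size_bytes} bytes; the limit is {MAX_BYTES}")
    return ALLOWED_EXTENSIONS[suffix]


mime_type = check_upload("portrait.PNG", 2_400_000)
```

Checking locally avoids spending an upload on a file that would be rejected anyway.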

API & SDK
Create a Prediction
Send a POST request to create a new prediction. This will return a prediction ID that you'll use to check the result. The request should include your model inputs and API key.
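As a rough illustration of the request described above, the sketch below builds (but does not send) a POST request using only the Python standard library. The endpoint URL, the header name, the JSON body layout, and the "id" response field are all placeholders assumed for illustration; take the real values from the official API reference.

```python
import json
import urllib.request

# Hypothetical values: replace with those from the official API reference.
API_URL = "https://api.example.com/v1/prediction"  # placeholder, not the real endpoint
API_KEY_HEADER = "X-API-Key"  # assumed header name


def build_create_request(api_key: str, model_inputs: dict) -> urllib.request.Request:
    """Build a POST request that would create a prediction for instant-id."""
    # The body layout ("model" plus "input") is an assumption.
    body = json.dumps({"model": "instant-id", "input": model_inputs}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", API_KEY_HEADER: api_key},
    )


request = build_create_request("YOUR_API_KEY", {"prompt": "studio portrait, soft light"})
# Sending is left commented out so the sketch runs offline:
# with urllib.request.urlopen(request) as response:
#     prediction_id = json.load(response)["id"]  # "id" is an assumed field name
```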
Get Prediction Result
Poll the prediction endpoint with the prediction ID until the result is ready. The result is not returned by the create call, so check the endpoint repeatedly, with a short pause between checks, until you receive a success status.
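The polling step can be sketched as a loop with a pause between checks and an overall timeout. This is an illustration under assumptions: the fetch callable stands in for whatever HTTP call retrieves the prediction, and the status values other than "success" ("processing", "error", "failed") are guesses rather than documented values.

```python
import time
from typing import Callable


def wait_for_result(
    fetch: Callable[[str], dict],
    prediction_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 120.0,
) -> dict:
    """Call fetch(prediction_id) until it reports success, failure, or the timeout passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        prediction = fetch(prediction_id)
        status = prediction.get("status")
        if status == "success":
            return prediction
        if status in {"error", "failed"}:  # assumed failure values
            raise RuntimeError(f"prediction {prediction_id} ended with status {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"prediction {prediction_id} not ready after {timeout_s}s")
        time.sleep(interval_s)


# Offline demonstration with a stand-in fetcher that succeeds on the third call.
_responses = iter([
    {"status": "processing"},
    {"status": "processing"},
    {"status": "success", "output": "result.png"},
])
result = wait_for_result(lambda _id: next(_responses), "demo-id", interval_s=0.0)
```

With an average run time of about 32 seconds, an interval of a few seconds and a timeout of a couple of minutes is a reasonable starting point.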
Readme
Overview
instant-id — Image-to-Image AI Model
instant-id from Tencent delivers realistic image-to-image transformations, enabling instant generation of photorealistic depictions of real people from reference photos and text prompts. Part of Tencent's instant-id family, this image-to-image AI model excels at preserving facial identity and details, solving the challenge of creating consistent, high-fidelity portraits without extensive training data. Developers and creators searching for Tencent image-to-image solutions find instant-id ideal for rapid, accurate edits in e-commerce and content pipelines.
Powered by Tencent's advanced multimodal architecture, instant-id supports seamless reference-based generation, producing outputs that maintain intricate facial features, expressions, and lighting from input images. This makes it a go-to for image-to-image AI model applications requiring identity consistency across modifications.
Technical Specifications
What Sets instant-id Apart
instant-id stands out in the competitive landscape of image-to-image AI models through its specialized focus on instant identity preservation, leveraging Tencent's Hunyuan-inspired multimodal framework for superior facial fidelity without per-subject fine-tuning.
- Zero-shot identity retention: Maintains precise facial structures, skin tones, and expressions from a single reference image, even under style changes or environmental shifts—enabling reliable person-specific edits that generic models often distort.
- Multimodal conditioning: Combines the text prompt with the reference image and optional pose, canny, and depth control images, improving prompt adherence for contextually accurate transformations.
- Efficient high-resolution output: Supports detailed generations up to high resolutions with fast inference, optimized for real-time workflows like AI photo editing for e-commerce, where speed meets quality.
instant-id's architecture emphasizes structural coherence and multilingual prompt handling, which makes it a strong choice for instant-id API integrations that need precise identity control.
Key Considerations
Prompt Quality: Clear, descriptive prompts lead to better results. Use negative_prompt to explicitly exclude undesired features.
Pose and Depth Control: Ensure pose and depth input images align with the desired output structure for effective conditioning.
Safety Checker: Enabling or disabling the safety checker impacts output filtering. Use discretion when disabling it.
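The considerations above can be collected into a single inputs dictionary. This is a sketch under assumptions: only negative_prompt is named in the text, while face_image, pose_image, depth_image, and enable_safety_checker are hypothetical field names chosen for illustration.

```python
from typing import Optional


def build_inputs(
    face_image_url: str,
    prompt: str,
    negative_prompt: str = "",
    pose_image_url: Optional[str] = None,
    depth_image_url: Optional[str] = None,
    enable_safety_checker: bool = True,
) -> dict:
    """Assemble model inputs, omitting optional fields that were not provided."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    inputs = {
        "face_image": face_image_url,  # hypothetical field name
        "prompt": prompt,
        "negative_prompt": negative_prompt,  # named in the text above
        "pose_image": pose_image_url,  # hypothetical field name
        "depth_image": depth_image_url,  # hypothetical field name
        "enable_safety_checker": enable_safety_checker,  # hypothetical field name
    }
    # Drop unset optional values; a False safety flag is kept on purpose.
    return {key: value for key, value in inputs.items() if value not in (None, "")}


inputs = build_inputs(
    "https://example.com/face.png",
    "portrait in a red evening gown, sunset beach",
    negative_prompt="blurry, extra fingers",
)
```

The safety checker defaults to enabled here, so disabling it has to be an explicit choice by the caller.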
Tips & Tricks
How to Use instant-id on Eachlabs
Access instant-id through Eachlabs' Playground for quick tests with reference images and text prompts, or integrate it via the API and SDK for production-scale image-to-image workflows. Provide a face image, a descriptive prompt, and optional parameters such as style strength or resolution. Output is delivered as PNG with identity details preserved; the average run takes about 32 seconds.
Capabilities
High-Quality Output
The model generates detailed, high-quality images across a range of styles and resolutions.
Style Adaptability
Choose from a range of base model weights, such as stable-diffusion-xl-base-1.0 or anime-art-diffusion-xl, to achieve the desired aesthetic.
Precision Controls
Leverage pose, canny, and depth controls to craft outputs with fine detail and alignment.
What Can I Use It For?
Use Cases for instant-id
E-commerce developers building automated image editing API tools can upload a product photo with a model and prompt "transform this portrait to wear a red evening gown in a sunset beach scene while keeping the exact face and smile," generating catalog-ready visuals instantly without reshoots.
Content creators use instant-id for social media personalization, feeding a selfie plus "age this person 20 years with silver hair and professional attire in an office," to produce hyper-realistic aging effects that preserve unique identity traits for viral storytelling.
Marketers targeting branded campaigns leverage its identity consistency by inputting executive headshots and specifying "place this face on a diverse team in a modern conference room with company logo," streamlining diverse representation without identity loss.
App designers integrating AI image-editing features create avatar customizers, where users provide a photo and describe "add cyberpunk neon tattoos and glowing eyes," yielding coherent, high-quality results for gaming and AR experiences.
Things to Be Aware Of
Generate a photorealistic portrait using stable-diffusion-xl-base-1.0 with fine-tuned ControlNet settings.
Experiment with anime-inspired outputs using anime-art-diffusion-xl.
Combine pose control with a well-defined prompt to create dynamic, action-packed scenes.
Adjust guidance_scale and pose_strength to observe how the model interprets intricate instructions.
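One way to run the guidance_scale and pose_strength experiment is to generate a small grid of input variants and submit each one as its own prediction. The sketch below only builds the variants. The numeric values are illustrative, not documented ranges, and the helper name sweep is hypothetical.

```python
from itertools import product


def sweep(base_inputs: dict, guidance_scales: list, pose_strengths: list) -> list:
    """Return one copy of base_inputs per (guidance_scale, pose_strength) pair."""
    return [
        {**base_inputs, "guidance_scale": scale, "pose_strength": strength}
        for scale, strength in product(guidance_scales, pose_strengths)
    ]


base = {"prompt": "dancer mid-leap on a rooftop at dusk"}
# Illustrative values only; check the model's parameter reference for valid ranges.
variants = sweep(base, guidance_scales=[3.0, 5.0, 7.5], pose_strengths=[0.4, 0.8])
```

Keeping the prompt fixed while varying only these two parameters makes it easier to attribute differences in the outputs.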
Limitations
Performance Variability: Results may vary significantly based on input prompt and style selection.
Pose Limitations: Poorly aligned or low-quality pose images can reduce output fidelity.
Complex Scenes: Highly intricate prompts may result in unexpected outputs or artifacts.
ControlNet Dependencies: Overusing ControlNets can overly constrain the model's creative range.
Output Format: PNG
Pricing
Pricing Detail
This model runs at a cost of $0.001080 per second.
The average execution time is 32 seconds, but this may vary depending on your input data.
The average cost per run is $0.034560.
Pricing Type: Execution Time
Cost Per Second means the total cost is calculated based on how long the model runs. Instead of paying a fixed fee per run, you are charged for every second the model is actively processing. This pricing method provides flexibility, especially for models with variable execution times, because you only pay for the actual time used.
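The per-run figure follows directly from the per-second rate. A short sketch using exact decimal arithmetic is shown below; the function name run_cost is hypothetical, and the rate and the 32-second average are the values quoted above.

```python
from decimal import Decimal

PRICE_PER_SECOND = Decimal("0.001080")  # USD per second, as quoted above


def run_cost(seconds) -> Decimal:
    """Return the cost in USD of a run lasting the given number of seconds."""
    return PRICE_PER_SECOND * Decimal(str(seconds))


average_cost = run_cost(32)
print(average_cost)  # 0.034560
```

Because billing is per second, a run that takes 45 seconds would cost proportionally more than the 32-second average.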
