
IDM-VTON

IDM-VTON is a best-in-class clothing virtual try-on model for in-the-wild images (non-commercial use only).

Avg Run Time: 26.000s

Model Slug: idm-vton

The total cost depends on how long the model runs. It costs $0.001540 per second. Based on an average runtime of 26 seconds, each run costs about $0.0400. With a $1 budget, you can run the model around 24 times.

API & SDK

Create a Prediction

Send a POST request to create a new prediction. This will return a prediction ID that you'll use to check the result. The request should include your model inputs and API key.
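As a rough illustration, the request described above could be assembled like this. The endpoint URL, the header name, and the input field names (`human_img`, `garm_img`) are assumptions made for this sketch and are not taken from the Eachlabs documentation. Only the `idm-vton` slug comes from this page. The code builds the request but does not send it.

```python
import json

# Hypothetical endpoint and header name: placeholders, not confirmed Eachlabs values.
API_URL = "https://api.example.com/v1/prediction/"
API_KEY_HEADER = "X-API-Key"


def build_prediction_request(api_key: str, inputs: dict) -> dict:
    """Assemble the pieces of a POST request that creates a prediction."""
    if not api_key:
        raise ValueError("api_key must not be empty")
    return {
        "method": "POST",
        "url": API_URL,
        "headers": {API_KEY_HEADER: api_key, "Content-Type": "application/json"},
        # "idm-vton" is the model slug listed on this page.
        "body": json.dumps({"model": "idm-vton", "input": inputs}),
    }


request = build_prediction_request(
    "YOUR_API_KEY",
    {
        # Field names below are assumed for illustration only.
        "human_img": "https://example.com/person.jpg",
        "garm_img": "https://example.com/shirt.jpg",
    },
)
print(request["url"])
```

Keeping request construction separate from sending makes it easy to inspect the payload before any billable run starts.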

Get Prediction Result

Poll the prediction endpoint with the prediction ID until the result is ready. The API uses long-polling, so you'll need to repeatedly check until you receive a success status.
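The polling loop might look like the sketch below. The `fetch_status` callable stands in for the real GET request, and the `"processing"` and `"error"` status strings are assumptions. Only the success status is mentioned on this page. A timeout is added so a stuck prediction does not loop forever.

```python
import time
from typing import Callable


def wait_for_prediction(
    fetch_status: Callable[[str], dict],
    prediction_id: str,
    interval_s: float = 1.0,
    timeout_s: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status repeatedly until it reports success, an error, or a timeout."""
    waited = 0.0
    while True:
        result = fetch_status(prediction_id)
        status = result.get("status")
        if status == "success":
            return result
        if status == "error":  # assumed failure status
            raise RuntimeError(f"prediction {prediction_id} failed: {result}")
        if waited >= timeout_s:
            raise TimeoutError(f"prediction {prediction_id} not ready after {timeout_s}s")
        sleep(interval_s)
        waited += interval_s


# Stand-in for the real GET request: reports "processing" twice, then "success".
_responses = iter([
    {"status": "processing"},
    {"status": "processing"},
    {"status": "success", "output": "https://example.com/result.jpg"},
])
final = wait_for_prediction(lambda _id: next(_responses), "pred-123", sleep=lambda s: None)
print(final["output"])
```

Passing the fetch function in as a parameter keeps the loop testable without network access. In a real client it would wrap the HTTP call to the prediction endpoint.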

Readme

Table of Contents
Overview
Technical Specifications
Key Considerations
Tips & Tricks
Capabilities
What Can I Use It For?
Things to Be Aware Of
Limitations

Overview

idm-vton — Image-to-Image AI Model

Turn a photo into an outfit showcase with idm-vton, a clothing virtual try-on model designed for realistic garment swapping in unconstrained real-world settings. Developed by researchers at KAIST and OMNIOUS.AI, this image-to-image AI model preserves garment details, human poses, and body shapes without requiring controlled studio conditions. Whether you're testing fashion designs or visualizing customer looks, idm-vton aims for photorealistic results on everyday photos.

Built on diffusion models, idm-vton handles clothing items such as dresses and jackets across three categories (upper_body, lower_body, dresses) and supports inputs up to 1024x768 resolution. It works with casual smartphone photos, and details such as wrinkles and fabric patterns generally carry over, although very intricate patterns may not render perfectly.

Technical Specifications

What Sets idm-vton Apart

The idm-vton image-to-image AI model stands out among virtual try-on tools by reporting state-of-the-art results on benchmarks such as VITON-HD and DressCode, where it surpasses models like StableVITON in texture preservation and pose alignment. This enables garment transfers across diverse body types and lighting conditions, with fewer of the artifacts common in generic image-to-image AI models.

Unlike many competitors limited to frontal poses or simple apparel, idm-vton handles varied poses and produces outputs up to 1024x768. On Eachlabs the average run time is about 26 seconds, so developers integrating the idm-vton API into e-commerce pipelines should plan for asynchronous previews rather than strictly real-time ones.

  • Wild-scene robustness: Processes unconstrained images with occlusions or complex backgrounds, delivering try-ons that maintain fabric folds and lighting consistency.
  • Precise garment preservation: Retains intricate patterns, logos, and textures from reference clothes, enabling accurate virtual fitting for branded merchandise.
  • Flexible input handling: Accepts person image plus garment photo, with optional text prompts for style tweaks, supporting common formats like PNG and JPG.

These capabilities make idm-vton a strong choice for image-to-image applications that demand realism over speed alone.

Key Considerations

Garment Fit: IDM VTON does not adjust for physical garment fit; ensure input images represent the desired style.

Background Compatibility: Transparent or plain backgrounds yield the best results, minimizing distractions in the final output.

Lighting Consistency: Match lighting conditions in the garment and human images to maintain realistic compositing.

Tips & Tricks

How to Use idm-vton on Eachlabs

Access idm-vton on Eachlabs via the Playground for quick testing: upload a person image, a garment reference, and an optional text prompt like "casual summer vibe," then generate high-res outputs in seconds. Integrate through the idm-vton API or SDK for apps, specifying parameters like resolution (up to 1024x768) and output format (PNG/JPG). Eachlabs provides fast, scalable access to this model for virtual try-on needs.

---

Capabilities

  • Visualize how garments appear on a person in various categories (upper_body, lower_body, dresses).
  • Create marketing visuals, fashion catalog content, and personalized styling previews with IDM VTON.
  • Enhance the shopping experience by offering a realistic virtual try-on solution.

What Can I Use It For?

Use Cases for idm-vton

Fashion e-commerce developers can build dynamic product pages using the idm-vton API: upload a customer selfie and catalog garment image to generate personalized try-ons, boosting conversion rates without physical samples. This "AI virtual try-on for online shopping" workflow handles thousands of variants daily.

Content creators and influencers experiment with outfits by feeding idm-vton a base photo and reference clothing, like "swap my jeans for high-waisted denim with rips on a beach walk pose." The model's wild-scene handling ensures natural results even in outdoor shots, streamlining pre-shoot planning.

Apparel designers iterate prototypes rapidly—provide a model photo plus fabric swatch to visualize fits across body types. For "virtual clothing try-on AI" in design software, idm-vton preserves details like embroidery, accelerating feedback loops from concept to mockup.

Marketing teams create diverse campaign visuals by applying seasonal collections to stock models, supporting "image-to-image AI model" integrations for A/B testing ad creatives with realistic personalization.

Things to Be Aware Of

Layered Outfits: Experiment with different garments sequentially for a layered styling effect on IDM VTON.

Customization:

  • Use the steps slider to explore varying levels of detail and refinement.
  • Adjust the crop and mask_only settings for focused outputs.

Creative Uses:

  • Use force_dc to emphasize garment details like embroidery or unique textures.
  • Test with diverse human images, including different poses and body types.

Realistic Outputs:

  • Pair similar lighting conditions between garment and human images for consistency.
  • Use high-quality garment masks to maintain edge precision and clarity.
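The advanced controls named above (steps, crop, mask_only, force_dc) and the category values listed under Capabilities could be gathered into one validated options object, as in the sketch below. The types and default values are assumptions, since this page names the controls but does not specify them.

```python
from dataclasses import asdict, dataclass

# Category values as listed under Capabilities on this page.
CATEGORIES = ("upper_body", "lower_body", "dresses")


@dataclass
class TryOnOptions:
    category: str = "upper_body"
    steps: int = 30          # assumed default; the page only mentions a steps slider
    crop: bool = False       # assumed boolean
    mask_only: bool = False  # assumed boolean
    force_dc: bool = False   # assumed boolean

    def to_inputs(self) -> dict:
        """Validate the options and return them as a plain dict of model inputs."""
        if self.category not in CATEGORIES:
            raise ValueError(f"category must be one of {CATEGORIES}")
        if self.steps < 1:
            raise ValueError("steps must be a positive integer")
        return asdict(self)


options = TryOnOptions(category="dresses", steps=40, crop=True)
print(options.to_inputs())
```

Validating locally catches a mistyped category before a request is sent and billed.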

Limitations

Complex Garments: Intricate patterns or transparent fabrics may not render perfectly.

Pose Variations: Extreme poses in human images can sometimes lead to artifacts.

Multiple Garments: The model supports a single garment per operation. For multi-layered styling, run the model sequentially.
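Sequential runs for layered styling can be sketched as a simple chain in which each output becomes the person image for the next garment. The `run_tryon` callable is a hypothetical wrapper around one model run. Here it is replaced by a stand-in so the chaining is visible without calling the API.

```python
from typing import Callable, Sequence


def layer_garments(
    person_image: str,
    garments: Sequence[str],
    run_tryon: Callable[[str, str], str],
) -> str:
    """Apply garments one at a time, feeding each output back in as the next person image."""
    current = person_image
    for garment in garments:
        current = run_tryon(current, garment)
    return current


# Stand-in for a real model call: it only records the order of application.
def fake_tryon(person: str, garment: str) -> str:
    return f"{person}+{garment}"


layered = layer_garments("person.jpg", ["shirt.jpg", "jacket.jpg"], fake_tryon)
print(layered)
```

Each garment in the list is one separate run, so cost and run time grow with the number of layers.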

Output Format: JPG

Pricing

Pricing Detail

This model runs at a cost of $0.001540 per second.

The average execution time is 26 seconds, but this may vary depending on your input data.

The average cost per run is $0.040040.

Pricing Type: Execution Time

Cost Per Second means the total cost is calculated based on how long the model runs. Instead of paying a fixed fee per run, you are charged for every second the model is actively processing. This pricing method provides flexibility, especially for models with variable execution times, because you only pay for the actual time used.
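The per-second pricing arithmetic can be checked with a short sketch. The rate and the 26-second average come from this page, while the function names are illustrative. `Decimal` is used so the figures match the page exactly without floating-point rounding.

```python
from decimal import Decimal

RATE_PER_SECOND = Decimal("0.001540")  # USD per second, as stated on this page


def run_cost(runtime_seconds: float) -> Decimal:
    """Cost in USD of a single run lasting runtime_seconds."""
    if runtime_seconds <= 0:
        raise ValueError("runtime_seconds must be positive")
    return RATE_PER_SECOND * Decimal(str(runtime_seconds))


def runs_within_budget(budget_usd: float, runtime_seconds: float) -> int:
    """Number of whole runs that fit in the budget at the given runtime."""
    return int(Decimal(str(budget_usd)) // run_cost(runtime_seconds))


average_cost = run_cost(26)
runs_for_one_dollar = runs_within_budget(1, 26)
print(average_cost, runs_for_one_dollar)
```

At the 26-second average this reproduces the $0.040040 per-run figure and the 24 runs per dollar quoted on this page. Longer runs cost proportionally more.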