# XTTS

XTTS is a voice generation model that clones a voice into other languages from a short 6-second audio clip.

## API Information

- **Model Slug:** xtts-v2
- **Branded URL:** https://www.eachlabs.ai/coqui/xtts/xtts-v2
- **Provider:** Coqui
- **Category:** Voice to Voice
- **Output Type:** audio
- **Status:** active
- **Version:** 0.0.1
- **Base Cost:** Per-second pricing based on the provider's `predict_time`. Rate: $0.00154/sec (GPU tier).
- **Estimated Processing Time:** 20 seconds
- **Last Updated:** 2026-04-06
- **Interactive Demo:** https://www.eachlabs.ai/ai-models/xtts-v2

## Pricing

- **Charge Type:** dynamic
- **Pricing Details:** Per-second pricing based on the provider's `predict_time`. Rate: $0.00154/sec (GPU tier).

## Input Schema

| Parameter | Type | Required | Default | Constraints | Description |
|-----------|------|----------|---------|-------------|-------------|
| text | string | No | Hello, you are now at Eachlabs AI. If you need any support, just contact us. | - | The text to convert into speech. |
| speaker | string | Yes | - | audio/mp3, audio/wav | URL of the reference audio clip; its voice is cloned to speak the provided text. |
| language | string | No | en | en, es, fr, de, it, pt, pl, tr, ru, nl, cs, ar, zh, hu, ko, hi | The language of the synthesized speech. |
| cleanup_voice | boolean | No | true | - | Applies a cleanup step intended to improve the quality of the generated speech. |

## Example Request

```bash
curl -X POST https://api.eachlabs.ai/v1/prediction/ \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "xtts-v2",
    "input": {
      "speaker": "https://storage.googleapis.com/magicpoint/inputs/voice-translate.input.mp3"
    }
  }'
```

## Output Schema

Response returned by `GET /v1/prediction/{id}` when the job completes:

```json
{
  "status": "success",
  "predictionID": "string",
  "output": "string (URL of generated audio)",
  "metrics": {
    "predict_time": "number (seconds)"
  }
}
```

## Polling

```bash
curl https://api.eachlabs.ai/v1/prediction/{PREDICTION_ID} \
  -H "X-API-Key: YOUR_API_KEY"
```

| Status | Meaning |
|--------|---------|
| `processing` | Still running; poll again |
| `success` | Done; read `output` |
| `error` | Failed; read `message` / `details` |

## Webhook (alternative to polling)

Pass `"webhook_url": "https://your.host/path"` in the create request. Eachlabs POSTs this payload when the job ends:

```json
{
  "exec_id": "prediction-uuid",
  "status": "succeeded",
  "output": "https://...",
  "error": ""
}
```

`status` is `"succeeded"` or `"failed"`. `exec_id` equals the `predictionID` from create. Your endpoint should return a 2xx response within 30 seconds.

## Errors

Error body: `{ "status": "error", "message": "...", "details": "..." }`

| Code | Meaning |
|------|---------|
| `400` | Invalid input |
| `401` | Missing / invalid `X-API-Key` |
| `404` | Unknown model or prediction id |
| `429` | Rate limit: 100 creates / min, 10 concurrent per key |
| `5xx` | Server error; retry with backoff |

## Overview

**xtts-v2: Voice-to-Voice AI Model**

xtts-v2, developed by Coqui as part of the XTTS family, is a voice-to-voice AI model that clones a voice into other languages from a 6-second audio clip, producing multilingual speech without extensive training data. The cloning is zero-shot: the reference clip is the only speaker data required. The model supports 17 languages, including English, Spanish, Hindi, Dutch, and Russian; this endpoint accepts the 16 language codes listed in the Input Schema. For developers integrating the **xtts-v2 API** or creators exploring **voice-to-voice AI models**, it produces expressive output that preserves the emotion and style of the reference voice.

## Usage Notes

- API Base URL: `https://api.eachlabs.ai/v1`
- Authentication: send `X-API-Key: YOUR_API_KEY`. Generate a key from the Eachlabs dashboard at https://www.eachlabs.ai/dashboard/api-keys.
- File-typed parameters (`*_url`, `image_url`, `video_url`, `audio_url`, etc.; for this model, `speaker`) accept publicly reachable HTTPS URLs only. Upload your asset first (GCS / S3 / your CDN) and pass the resulting URL. Data URIs and localhost URLs are rejected.
- For structured parameters (arrays / objects), send real JSON values, not stringified payloads.
- Monetary values are reported in USD; per-token / per-megapixel rates may be billed in micro-cents internally.
- Prefer `webhook_url` over polling for long-running predictions; see the Webhook section.
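
The create, poll, and pricing steps documented above can be tied together in a small client. The following is a minimal Python sketch using only the standard library, not an official SDK. The helper names (`build_payload`, `call_api`, `wait_for_result`, `estimate_cost`, `clone_voice`), the polling interval, and the poll limit are illustrative assumptions. It also assumes the create call returns a JSON object containing `predictionID`, as the Webhook section implies.

```python
import json
import time
import urllib.request

API_BASE = "https://api.eachlabs.ai/v1"
RATE_USD_PER_SEC = 0.00154  # GPU-tier rate quoted in the Pricing section
LANGUAGES = {"en", "es", "fr", "de", "it", "pt", "pl", "tr",
             "ru", "nl", "cs", "ar", "zh", "hu", "ko", "hi"}


def build_payload(speaker_url, text=None, language="en", cleanup_voice=True):
    """Build the create-request body; only `speaker` is required."""
    if language not in LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    if not speaker_url.startswith("https://"):
        raise ValueError("speaker must be a publicly reachable HTTPS URL")
    inputs = {"speaker": speaker_url, "language": language,
              "cleanup_voice": cleanup_voice}
    if text is not None:  # omit to fall back to the documented default text
        inputs["text"] = text
    return {"model": "xtts-v2", "input": inputs}


def call_api(method, path, api_key, body=None):
    """Send one JSON request; urllib raises HTTPError on 4xx/5xx responses."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(
        API_BASE + path, data=data, method=method,
        headers={"X-API-Key": api_key, "Content-Type": "application/json"})
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def wait_for_result(fetch, interval=2.0, max_polls=60, sleep=time.sleep):
    """Call fetch() until it reports `success` or `error` (assumed limits)."""
    for _ in range(max_polls):
        result = fetch()
        status = result.get("status")
        if status == "success":
            return result
        if status == "error":
            raise RuntimeError(
                f"{result.get('message')}: {result.get('details')}")
        sleep(interval)  # `processing` (or anything else): poll again
    raise TimeoutError("prediction still processing after max_polls")


def estimate_cost(result):
    """Estimated charge in USD from the reported predict_time (seconds)."""
    return result["metrics"]["predict_time"] * RATE_USD_PER_SEC


def clone_voice(api_key, speaker_url, **kwargs):
    """Create a prediction, then poll until it finishes."""
    created = call_api("POST", "/prediction/", api_key,
                       build_payload(speaker_url, **kwargs))
    prediction_id = created["predictionID"]  # assumed create-response field
    return wait_for_result(
        lambda: call_api("GET", f"/prediction/{prediction_id}", api_key))
```

A call such as `clone_voice("YOUR_API_KEY", speaker_url, text="Hola", language="es")` would return the completed prediction, whose `output` field holds the audio URL. `wait_for_result` takes the fetch function as a parameter so the polling logic can be exercised without network access. As a rough cost check, a job whose `predict_time` matches the 20-second processing estimate would come to 20 × $0.00154 = $0.0308.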