Voice Changer

T4 16GB

Fast Inference

REST API

Model Information

Response Time: ~143 sec

Status: Active

Version: 0.0.1

Updated: 5 months ago

realistic-voice-cloning

Input

Configure model parameters

rvc_model

The specific RVC model used for the function.

Custom Rvc Model Download Url

URL to download a custom RVC model.

Song Input

The original song file provided as input.

Cost is calculated based on execution time. The model is charged at $0.0002475 per second. With a $1 budget, you can run this model approximately 28 times, assuming an average execution time of 143 seconds per run.
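
The arithmetic behind that estimate can be checked with a short calculation. The sketch below uses only the per-second rate and average runtime quoted above; the function names are illustrative, and actual charges depend on each run's real execution time.

```python
import math

PRICE_PER_SECOND = 0.0002475  # USD per second, as quoted above
AVERAGE_RUNTIME_S = 143       # average execution time in seconds, as quoted above


def cost_per_run(seconds: float = AVERAGE_RUNTIME_S) -> float:
    """Estimated cost in USD of one run lasting `seconds`."""
    return PRICE_PER_SECOND * seconds


def runs_within_budget(budget_usd: float, seconds: float = AVERAGE_RUNTIME_S) -> int:
    """Number of whole runs that fit in `budget_usd`."""
    return math.floor(budget_usd / cost_per_run(seconds))


print(f"Cost per run: ${cost_per_run():.4f}")      # Cost per run: $0.0354
print(f"Runs per $1: {runs_within_budget(1.0)}")   # Runs per $1: 28
```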

API Reference

Prerequisites

Create an API Key from the Eachlabs Console
Install the required dependencies for your chosen language (e.g., requests for Python)

API Integration Steps

1. Create a Prediction

Send a POST request to create a new prediction. This will return a prediction ID that you'll use to check the result. The request should include your model inputs and API key.

import requests
import time

API_KEY = "YOUR_API_KEY"  # Replace with your API key
HEADERS = {
    "X-API-Key": API_KEY,
    "Content-Type": "application/json"
}

def create_prediction():
    response = requests.post(
        "https://api.eachlabs.ai/v1/prediction/",
        headers=HEADERS,
        json={
            "model": "realistic-voice-cloning",
            "version": "0.0.1",
            "input": {
                "protect": 0.33,
                "rvc_model": "Squidward",
                "index_rate": 0.5,
                "song_input": "your_file.audio/mp3",
                "reverb_size": 0.15,
                "pitch_change": "no-change",
                "rms_mix_rate": 0.25,
                "filter_radius": 3,
                "output_format": "mp3",
                "reverb_damping": 0.7,
                "reverb_dryness": 0.8,
                "reverb_wetness": 0.2,
                "crepe_hop_length": 128,
                "pitch_change_all": 0,
                "main_vocals_volume_change": 0,
                "pitch_detection_algorithm": "rmvpe",
                "instrumental_volume_change": 0,
                "backup_vocals_volume_change": 0,
                "custom_rvc_model_download_url": "your custom rvc model download url here"
            },
            "webhook_url": ""
        }
    )
    prediction = response.json()
    
    if prediction["status"] != "success":
        raise Exception(f"Prediction failed: {prediction}")
    
    return prediction["predictionID"]

2. Get Prediction Result

Poll the prediction endpoint with the prediction ID until the result is ready. The API uses long-polling, so you'll need to repeatedly check until you receive a success status.

def get_prediction(prediction_id):
    while True:
        result = requests.get(
            f"https://api.eachlabs.ai/v1/prediction/{prediction_id}",
            headers=HEADERS
        ).json()
        
        if result["status"] == "success":
            return result
        elif result["status"] == "error":
            raise Exception(f"Prediction failed: {result}")
        
        time.sleep(1)  # Wait before polling again
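
Because an average run takes about 143 seconds and the documented rate limit is 60 requests per minute, polling once per second with no upper bound can be wasteful. The sketch below is one possible variant with a longer interval and an overall timeout. The function name poll_until_done, its fetch_status parameter, and the default values are assumptions for illustration, not part of the Eachlabs API; fetch_status stands in for any callable that returns the decoded JSON of the status endpoint.

```python
import time
from typing import Any, Callable, Dict


def poll_until_done(
    fetch_status: Callable[[], Dict[str, Any]],
    interval_s: float = 5.0,     # assumed default, not an API requirement
    timeout_s: float = 600.0,    # assumed default, not an API requirement
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Call fetch_status() until it reports success or error, or time runs out."""
    deadline = clock() + timeout_s
    while True:
        result = fetch_status()
        status = result.get("status")
        if status == "success":
            return result
        if status == "error":
            raise RuntimeError(f"Prediction failed: {result}")
        if clock() >= deadline:
            raise TimeoutError(f"No result after {timeout_s} seconds")
        sleep(interval_s)  # Wait before polling again
```

With the HEADERS and URL from the steps above, this could be called as poll_until_done(lambda: requests.get(f"https://api.eachlabs.ai/v1/prediction/{prediction_id}", headers=HEADERS).json()).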

3. Complete Example

Here's a complete example that puts it all together: it creates a prediction, waits for the result, and prints the output URL and processing time, with basic error handling.

try:
    # Create prediction
    prediction_id = create_prediction()
    print(f"Prediction created: {prediction_id}")
    
    # Get result
    result = get_prediction(prediction_id)
    print(f"Output URL: {result['output']}")
    print(f"Processing time: {result['metrics']['predict_time']}s")
except Exception as e:
    print(f"Error: {e}")

Additional Information

The API uses a two-step process: create prediction and poll for results
Response time: ~143 seconds
Rate limit: 60 requests/minute
Concurrent requests: 10 maximum
Use long-polling to check prediction status until completion
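
The limit of 10 concurrent requests can be respected on the client side by capping the number of worker threads. The sketch below assumes a hypothetical run_one callable that performs one full create-and-poll cycle for a given input; the helper name run_batch is illustrative and not part of the API.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")

MAX_CONCURRENT = 10  # concurrent-request limit stated above


def run_batch(run_one: Callable[[T], R], inputs: Iterable[T],
              max_workers: int = MAX_CONCURRENT) -> List[R]:
    """Run run_one over inputs with at most MAX_CONCURRENT calls in flight."""
    workers = min(max_workers, MAX_CONCURRENT)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the inputs
        return list(pool.map(run_one, inputs))
```

If the 60 requests/minute limit applies across all workers, polling traffic from concurrent jobs adds up, so a longer polling interval may be needed when running many predictions at once.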

Overview

The Voice Changer model transforms input audio by altering vocal characteristics: users can modify pitch and timbre and apply effects such as reverb. It is particularly useful for creating unique vocal renditions, generating character voices, or enhancing audio content with specific stylistic attributes.

Technical Specifications

Pitch Detection Algorithms: Utilizes algorithms like rmvpe and mangio-crepe for accurate pitch analysis.

Reverb Effects: Offers customizable reverb settings, including size, wetness, dryness, and damping, to enhance the spatial quality of the audio.

Volume Control: Allows independent adjustment of main vocals, backup vocals, and instrumental volumes for balanced mixing.

Key Considerations

Parameter Sensitivity: Small changes in parameters like index_rate and filter_radius can significantly impact the output. It's advisable to make incremental adjustments and review the results.

Model Compatibility: When using a custom_rvc_model_download_url, ensure that the custom model is compatible with Voice Changer and properly formatted to avoid processing errors.

Resource Consumption: Processing complex transformations may require substantial computational resources, which could affect processing time.

Tips & Tricks

rvc_model

Selection: Choose from predefined models such as Squidward, MrKrabs, Plankton, Drake, Vader, Trump, Biden, Obama, Guitar, and Violin, or select CUSTOM to use your own model via custom_rvc_model_download_url.
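
One way to guard against typos in the model name is to validate it before sending the request. The sketch below hard-codes the model names listed above and assumes, as the parameter name suggests, that custom_rvc_model_download_url is only needed when rvc_model is CUSTOM. The helper name model_inputs is hypothetical.

```python
from typing import Dict, Optional

# Predefined models as listed above; the live service may offer others.
PREDEFINED_MODELS = {
    "Squidward", "MrKrabs", "Plankton", "Drake", "Vader",
    "Trump", "Biden", "Obama", "Guitar", "Violin",
}


def model_inputs(rvc_model: str, custom_url: Optional[str] = None) -> Dict[str, str]:
    """Return the model-related input fields, validating the combination."""
    if rvc_model == "CUSTOM":
        # Assumption: a custom model must come with a download URL.
        if not custom_url:
            raise ValueError("CUSTOM requires custom_rvc_model_download_url")
        return {"rvc_model": "CUSTOM", "custom_rvc_model_download_url": custom_url}
    if rvc_model not in PREDEFINED_MODELS:
        raise ValueError(f"Unknown rvc_model: {rvc_model!r}")
    return {"rvc_model": rvc_model}
```

The returned dictionary can be merged into the "input" object of the request body alongside the other parameters.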

pitch_change

Options:
- no-change: Maintains the original pitch.
- male-to-female: Raises the pitch to simulate a female voice.
- female-to-male: Lowers the pitch to simulate a male voice.

index_rate

Range: 0 to 1
Recommendation: Start with a default value of 0.5. In RVC, index_rate controls how much of the target model's accent is applied: increase towards 1 to apply more of the Voice Changer's characteristics, or decrease towards 0 to retain more of the original accent.

filter_radius

Range: 0 to 7
Recommendation: A higher value results in smoother outputs but may reduce detail. A value around 3 is a good starting point.

rms_mix_rate

Range: 0 to 1
Recommendation: Adjust to balance the root mean square (RMS) levels between the original and transformed audio. A value of 0.5 often provides a natural blend.

pitch_detection_algorithm

Options:
- rmvpe: Suitable for general purposes with a good balance between speed and accuracy.
- mangio-crepe: Offers higher accuracy, especially for complex audio, but may require more processing power.

protect

Range: 0 to 1
Recommendation: Use this parameter to protect certain frequencies from transformation. A value of 0.5 protects mid-range frequencies, which can help maintain vocal clarity.

Reverb Settings

reverb_size: Controls the perceived size of the space. A value of 0.5 simulates a medium-sized room.
reverb_wetness: Adjusts the amount of reverb effect applied. A higher value increases the effect.
reverb_dryness: Controls the presence of the original signal. Lower values reduce the dry signal, making the reverb more prominent.
reverb_damping: Affects the decay of high frequencies. Higher values result in a warmer sound.
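
The ranges and options above can be checked before a request is sent. The sketch below encodes only the limits stated in this section (no ranges are given for the reverb settings, so they are not checked). The function name validate_inputs is illustrative, and the API itself may accept or reject values differently.

```python
from typing import Any, Dict, List

# Inclusive numeric ranges as stated in this section.
RANGES = {
    "index_rate": (0, 1),
    "filter_radius": (0, 7),
    "rms_mix_rate": (0, 1),
    "protect": (0, 1),
}

# Allowed string options as stated in this section.
OPTIONS = {
    "pitch_change": {"no-change", "male-to-female", "female-to-male"},
    "pitch_detection_algorithm": {"rmvpe", "mangio-crepe"},
}


def validate_inputs(inputs: Dict[str, Any]) -> List[str]:
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    for name, (low, high) in RANGES.items():
        if name in inputs and not (low <= inputs[name] <= high):
            problems.append(f"{name}={inputs[name]} is outside {low} to {high}")
    for name, allowed in OPTIONS.items():
        if name in inputs and inputs[name] not in allowed:
            problems.append(f"{name}={inputs[name]!r} is not one of {sorted(allowed)}")
    return problems


print(validate_inputs({"index_rate": 0.5, "filter_radius": 9, "pitch_change": "no-change"}))
# ['filter_radius=9 is outside 0 to 7']
```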

Capabilities

Transform Vocal Characteristics: Modify pitch, timbre, and apply effects to alter the original voice.

Create Character Voices with Voice Changer: Generate distinctive voices for characters in media productions.

Enhance Audio Content: Apply stylistic effects to improve or change the mood of audio recordings.

What can I use it for?

Content Creation: Enhance podcasts, videos, and other media by altering vocal elements to fit specific themes or characters.

Entertainment: Create parody songs, voiceovers, or unique renditions of existing audio content.

Educational Purposes: Demonstrate the effects of audio processing and voice transformation in academic settings.

Things to be aware of

Experiment with different rvc_model options to achieve unique vocal transformations.

Use pitch_change settings to shift between male and female voices smoothly.

Adjust index_rate (0-1) to balance between clarity and transformation strength.

Modify filter_radius (0-7) to fine-tune the smoothness of the audio.

Try different pitch_detection_algorithm options (rmvpe, mangio-crepe) to see which works best for your audio.

Use reverb_size, reverb_wetness, and reverb_dryness for ambient effects.

Increase protect (0-1) if artifacts or distortions appear in the output.

Adjust main_vocals_volume_change and backup_vocals_volume_change to control the vocal balance.

Limitations

Model Dependency: The quality of the output heavily depends on the selected rvc_model and its compatibility with the input audio.

Processing Time: Complex transformations or high-resolution audio files may lead to longer processing times.

Audio Artifacts: Extreme parameter settings can introduce artifacts or unnatural sounds into the output.

Output Format: MP3
