
Voice Changer

Audio · eachlabs · by eachlabs

Create song covers with any RVC v2 trained AI voice from audio files.


API reference

Runtime (p50): 2m
Estimated price: $0.000247 / sec
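
As a rough worked example, assuming the estimated price scales linearly with runtime (the page does not state the billing granularity), a run at the listed p50 of 2 minutes costs about 120 × $0.000247 ≈ $0.03. The helper name `estimate_cost_usd` is illustrative and not part of any Eachlabs SDK.

```python
PRICE_PER_SECOND_USD = 0.000247  # "Estimated price" listed on this page
P50_RUNTIME_SECONDS = 2 * 60     # "Runtime (p50): 2m" listed on this page


def estimate_cost_usd(runtime_seconds, price_per_second=PRICE_PER_SECOND_USD):
    """Estimate the cost of one prediction, assuming linear per-second pricing."""
    if runtime_seconds < 0:
        raise ValueError("runtime_seconds must be non-negative")
    return runtime_seconds * price_per_second


print(f"${estimate_cost_usd(P50_RUNTIME_SECONDS):.5f}")  # $0.02964
```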

Call the API

prediction.sh

curl -X POST \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "realistic-voice-cloning",
    "version": "0.0.1",
    "input": {
        "protect": 0.33,
        "rvc_model": "Squidward",
        "index_rate": 0.5,
        "song_input": "https://cdn.eachlabs.ai/ipfs/JsPIizFfRy54Jk5LuXdnrNdV1JHJ6oLmPPdRuIfh3lvpoNai/gangnam.mp3",
        "reverb_size": 0.15,
        "pitch_change": "no-change",
        "rms_mix_rate": 0.25,
        "filter_radius": 3,
        "output_format": "mp3",
        "reverb_damping": 0.7,
        "reverb_dryness": 0.8,
        "reverb_wetness": 0.2,
        "crepe_hop_length": 128,
        "pitch_change_all": 0,
        "main_vocals_volume_change": 10,
        "pitch_detection_algorithm": "rmvpe",
        "instrumental_volume_change": 0,
        "backup_vocals_volume_change": 0
    },
    "webhook_url": ""
}' \
  https://api.eachlabs.ai/v1/prediction/
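
The same request can be assembled from Python with only the standard library. This is a minimal sketch that mirrors the curl call above; the function name `build_prediction_request` and the placeholder API key are assumptions for illustration, and the request is built but not sent.

```python
import json
import urllib.request

API_URL = "https://api.eachlabs.ai/v1/prediction/"  # endpoint from the curl example


def build_prediction_request(api_key, input_params, webhook_url=""):
    """Build (but do not send) the POST request shown in the curl example."""
    body = {
        "model": "realistic-voice-cloning",
        "version": "0.0.1",
        "input": input_params,
        "webhook_url": webhook_url,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


# Input values copied from the curl example above.
example_input = {
    "protect": 0.33,
    "rvc_model": "Squidward",
    "index_rate": 0.5,
    "song_input": "https://cdn.eachlabs.ai/ipfs/JsPIizFfRy54Jk5LuXdnrNdV1JHJ6oLmPPdRuIfh3lvpoNai/gangnam.mp3",
    "reverb_size": 0.15,
    "pitch_change": "no-change",
    "rms_mix_rate": 0.25,
    "filter_radius": 3,
    "output_format": "mp3",
    "reverb_damping": 0.7,
    "reverb_dryness": 0.8,
    "reverb_wetness": 0.2,
    "crepe_hop_length": 128,
    "pitch_change_all": 0,
    "main_vocals_volume_change": 10,
    "pitch_detection_algorithm": "rmvpe",
    "instrumental_volume_change": 0,
    "backup_vocals_volume_change": 0,
}

# "YOUR_API_KEY" is a placeholder; substitute a real key before sending.
request = build_prediction_request("YOUR_API_KEY", example_input)
```

Passing `request` to `urllib.request.urlopen` would send it. The response schema is not shown on this page, so parsing the reply is left out here.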

Documentation

Overview
realistic-voice-cloning — Voice-to-Voice AI Model

Developed by eachlabs, realistic-voice-cloning generates song covers and custom audio from input audio files using any RVC v2 trained AI voice. It is a voice-to-voice model: rather than recording a new performance, it converts an existing recording into the target voice while preserving the vocal nuances and style of the original. It is aimed at musicians, podcasters, and developers who need affordable, high-fidelity voice transformation for music and content production.
Capabilities
Transform Vocal Characteristics: Modify pitch, timbre, and apply effects to alter the original voice.
Create Character Voices with Voice Changer: Generate distinctive voices for characters in media productions.
Enhance Audio Content: Apply stylistic effects to improve or change the mood of audio recordings.
Use cases
Use Cases for realistic-voice-cloning

Musicians and cover artists use realistic-voice-cloning to reimagine hits with cloned voices: they feed in an input track, such as a pop song, alongside an RVC v2 model of a favorite singer and get back a cover in about two minutes (the listed p50 runtime), suitable for YouTube channels or TikTok trends built around AI song covers.

Content creators building podcasts or audiobooks turn personal recordings into professional narrations by converting them to a trained RVC v2 voice. The conversion preserves the original script's timing, which streamlines production for weekly episodes.

Developers integrating the realistic-voice-cloning API into apps for personalized music experiences can upload user audio samples to generate custom song versions, using short-sample cloning for on-the-fly voice swaps in karaoke or virtual idol platforms.

Marketers crafting branded audio ads clone spokesperson voices onto scripts, using the model's harmonic fidelity to ensure singable jingles that match campaign tones, enhancing engagement without hiring talent.
Tips & tricks
How to Use realistic-voice-cloning on Eachlabs

Access realistic-voice-cloning through the Eachlabs Playground by uploading an audio file, selecting an RVC v2 trained voice model, and setting the conversion parameters to generate MP3 or WAV output. For production, integrate via the realistic-voice-cloning API or SDK with parameters like song_input and rvc_model, as in the request above; outputs deliver 44.1kHz quality ready for download or streaming.
Technical spec
What Sets realistic-voice-cloning Apart

The realistic-voice-cloning model stands out in the voice-to-voice AI landscape through its specialized support for RVC v2 trained voices, enabling seamless conversion of input audio into song covers with exceptional timbre accuracy and emotional expressiveness. This capability allows users to apply pre-trained celebrity or custom voices to any track, producing outputs that rival studio recordings without retraining models.
- RVC v2 compatibility: Directly leverages Retrieval-based Voice Conversion v2 models for instant voice swaps, supporting a vast library of community-trained voices that maintain pitch, breathing, and inflection from source audio. This enables rapid prototyping of covers in formats like MP3 or WAV, with processing times under 30 seconds for short clips.
- Audio-to-song cover focus: Optimized for music applications, it handles singing voices with harmonic preservation, outperforming general TTS in vocal range and resonance for genres from pop to opera. Users gain production-ready tracks ready for distribution, bypassing expensive vocalists.
- High-fidelity cloning from short samples: Requires only 10-30 seconds of reference audio to generate convincing clones, with low latency ideal for real-time previews in apps. This differentiator supports eachlabs voice-to-voice workflows for scalable content like personalized audiobooks or demos.
Technical specs include input formats like WAV/MP3, output up to 48kHz stereo, and average processing under 60 seconds, making it a top choice for AI voice cloning for songs.
Things to be aware of
Experiment with different rvc_model options to achieve unique vocal transformations.
Use pitch_change settings to shift between male and female voices smoothly.
Adjust index_rate (0-1) to balance between clarity and transformation strength.
Modify filter_radius (0-7) to fine-tune the smoothness of the audio.
Try different pitch_detection_algorithm options (rmvpe, mangio-crepe) to see which works best for your audio.
Use reverb_size, reverb_wetness, and reverb_dryness for ambient effects.
Increase protect (0-1) if artifacts or distortions appear in the output.
Adjust main_vocals_volume_change and backup_vocals_volume_change to control the vocal balance.
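
The ranges quoted in the tips above can be checked before submitting a request. The sketch below is illustrative only: `check_input` is a hypothetical helper, not part of the Eachlabs API, and it assumes that rmvpe and mangio-crepe are the only accepted algorithms, which this page does not confirm. Parameters without a documented range are left unchecked.

```python
# Ranges and options as listed in the tips above.
NUMERIC_RANGES = {
    "index_rate": (0, 1),
    "protect": (0, 1),
    "filter_radius": (0, 7),
}
# Assumption: the two algorithms named in the tips are the full set.
PITCH_ALGORITHMS = {"rmvpe", "mangio-crepe"}


def check_input(params):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    for name, (low, high) in NUMERIC_RANGES.items():
        if name in params and not (low <= params[name] <= high):
            problems.append(f"{name}={params[name]} is outside {low}-{high}")
    algorithm = params.get("pitch_detection_algorithm")
    if algorithm is not None and algorithm not in PITCH_ALGORITHMS:
        problems.append(f"unknown pitch_detection_algorithm: {algorithm}")
    return problems


print(check_input({"index_rate": 0.5, "filter_radius": 3, "protect": 0.33}))  # []
print(check_input({"index_rate": 1.5, "pitch_detection_algorithm": "yin"}))
```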
Key considerations
Parameter Sensitivity: Small changes in parameters like index_rate and filter_radius can significantly impact the output. It's advisable to make incremental adjustments and review the results.
Model Compatibility: When using a custom_rvc_model_download_url, ensure that the downloaded model is a compatible RVC v2 model and is properly formatted to avoid processing errors.
Resource Consumption: Processing complex transformations may require substantial computational resources, which could affect processing time.
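
One way to follow the incremental-adjustment advice is to generate a small set of inputs that differ in a single parameter and compare the results. This is a sketch under assumptions: `sweep_parameter` is a hypothetical helper, and the step values are arbitrary examples rather than recommended settings.

```python
import copy


def sweep_parameter(base_input, name, values):
    """Return one input dict per value, changing only the parameter `name`."""
    variants = []
    for value in values:
        variant = copy.deepcopy(base_input)  # leave the base input untouched
        variant[name] = value
        variants.append(variant)
    return variants


base = {"rvc_model": "Squidward", "index_rate": 0.5, "filter_radius": 3}

# Small steps around the starting value; submit each variant and listen to the output.
for variant in sweep_parameter(base, "index_rate", [0.4, 0.5, 0.6]):
    print(variant["index_rate"], variant["filter_radius"])
```

Changing one parameter at a time makes it easier to attribute an audible difference to that parameter.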
Limitations
Model Dependency: The quality of the output heavily depends on the selected rvc_model and its compatibility with the input audio.
Processing Time: Complex transformations or high-resolution audio files may lead to longer processing times.
Audio Artifacts: Extreme parameter settings can introduce artifacts or unnatural sounds into the output.
Output Format: MP3