XTTS is a voice generation model that can clone a voice and have it speak in other languages, using only a 6-second reference audio clip.
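As a rough illustration of that workflow, here is a minimal sketch that checks a reference clip's length and then passes it to the model. It assumes the Coqui `TTS` Python package and the model name `tts_models/multilingual/multi-dataset/xtts_v2`, neither of which is stated above. The helpers `wav_duration_seconds`, `is_long_enough`, and `clone_voice` are hypothetical names introduced for this example, and the 6-second threshold simply mirrors the clip length quoted in the description.

```python
import wave

# Clip length quoted in the description; treated here as a minimum (assumption).
MIN_REFERENCE_SECONDS = 6.0


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_long_enough(path: str, minimum: float = MIN_REFERENCE_SECONDS) -> bool:
    """Check that the reference clip is at least `minimum` seconds long."""
    return wav_duration_seconds(path) >= minimum


def clone_voice(text: str, reference_wav: str, language: str, output_path: str) -> str:
    """Synthesize `text` in `language` using the voice from `reference_wav`."""
    if not is_long_enough(reference_wav):
        raise ValueError("reference clip is shorter than the expected 6 seconds")

    # Imported lazily so the helpers above work without the TTS package installed.
    from TTS.api import TTS

    # Assumed model identifier; adjust it to the XTTS release you actually use.
    tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")
    tts.tts_to_file(
        text=text,
        speaker_wav=reference_wav,
        language=language,
        file_path=output_path,
    )
    return output_path


# Example call (downloads the model on first use; file names are placeholders):
# clone_voice("Bonjour tout le monde.", "reference.wav", "fr", "output.wav")
```

The duration check is a convenience rather than a requirement of the model; it only catches clips that are obviously too short before the heavier model load happens.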