Voice Cloning Best Practices
Recording tips
Environment
- Record in a quiet room, free of AC hum, traffic, and other background noise
- Soft furnishings (carpets, curtains) reduce echo
- Avoid recording near windows or hard walls
Microphone
- A good USB mic gives significantly better results than a phone
- If using a phone, record in a closet full of clothes (natural sound dampening)
- Keep your distance consistent: stay 15–20 cm from the mic
Delivery
- Speak naturally at a normal pace
- Vary your sentences; don't read in a monotone, as if reciting a list
- Include questions, statements, and excited sentences for a well-rounded clone
- Avoid whispering or shouting
Ideal recording content
Read a short paragraph naturally; something like a news article or product description works well. Aim for 30–45 seconds of clean speech, and no less than 20 seconds.
Common mistakes
| Mistake | Effect | Fix |
|---|---|---|
| Background music | Poor separation | Re-record without music |
| Multiple speakers | Confused clone | Use single-speaker recordings only |
| Heavy compression (phone call quality) | Robotic output | Re-record directly to uncompressed WAV (converting a compressed file does not restore quality) |
| Very short sample (under 10s) | Thin, inconsistent voice | Record at least 20 seconds |
| Lots of "ums" and "ahs" | Unnatural clone | Edit them out before uploading |
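
Two of the checks above (sample length and file format) can be automated before uploading. The following is a minimal sketch in Python, assuming the sample is saved as a WAV file. The function name check_sample and the warning texts are made up for illustration and are not part of any cloning tool; the thresholds come from the guidance in this article.

```python
import wave

MIN_SECONDS = 20.0             # from the table: "Record at least 20 seconds"
TARGET_SECONDS = (30.0, 45.0)  # from "Aim for 30–45 seconds of clean speech"


def check_sample(path):
    """Return a list of warnings for a voice sample (hypothetical helper)."""
    try:
        # The standard-library wave module only reads uncompressed PCM WAV,
        # so a failure here suggests a compressed or non-WAV file.
        with wave.open(path, "rb") as wav:
            duration = wav.getnframes() / wav.getframerate()
    except (wave.Error, EOFError):
        return ["not a readable uncompressed PCM WAV file"]

    warnings = []
    if duration < MIN_SECONDS:
        warnings.append(
            f"too short: {duration:.1f}s, record at least {MIN_SECONDS:.0f}s"
        )
    elif not (TARGET_SECONDS[0] <= duration <= TARGET_SECONDS[1]):
        warnings.append(f"{duration:.1f}s is outside the 30-45s target")
    return warnings
```

Calling check_sample("my_voice.wav") (a placeholder file name) returns an empty list when the file is readable and 30–45 seconds long. It cannot detect background music, multiple speakers, or filler words; those still need a listen-through.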
After cloning
- Test with short sentences first
- Try different text types (questions, statements, emotional lines)
- If the quality is poor, record a better sample and clone again


