Salesforce AI
Voice cloning App
How can we break the black box model of voice cloning and make the process more intuitive and accessible?
Project Detail
01.2026 - 06.2026 (6 months)
Client-sponsored capstone project
My role
I led the end-to-end design, focusing on prototyping for the first and second rounds of user testing, and refined the final product with branded interactions.

Voice cloning as a black box
Voice cloning systems are powerful but opaque. Even the most popular models, such as those from Hume AI and ElevenLabs, are underexplained. You record your voice, submit it to a model, and receive a clone, but what happens in between is a black box.
Client context
The Salesforce AI Research team was interested in this specific "black box" problem because of potential internal applications, such as creating more brand-aligned voices for Sales and Customer Support functions.
To make a system that works well for "anyone," regardless of their level of AI literacy, they wanted to explore a more interpretable, human-centered voice cloning system: one that gives users greater control and transparency throughout the process.
What we made
Know what you've captured
The voice cloning process becomes transparent. See your pitch, pace, and pauses fill in as you record. No guessing whether you're done.

Hear the difference, name it
The voice cloning process becomes collaborative. It is transparent about what it's learning, guided in how it helps you improve, and built to iterate with you, not just for you.
Point. Describe. Done.
Voice cloning becomes contextual and collaborative. Highlight what feels wrong, tell the AI how it should change, and get a targeted fix that better fits your context.
Impact



