Currently known by its development name, Project VoCo, the tech was demonstrated by Adobe developer Zeyu Jin at the Adobe MAX conference in San Diego, California. The program he showed the audience featured a text box that displayed all the words uttered in a recorded audio clip, in the order they were spoken.
With VoCo, a user can move this text around or delete it in order to edit the clip. Most interestingly, and perhaps most concerningly, VoCo doesn’t just let you rearrange the speech already there; it also lets you construct an entirely new sentence using the recorded voice.
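To make the idea of editing audio by editing text concrete, here is a minimal sketch. It assumes each transcribed word carries start and end timestamps (in milliseconds) into the recording, so that rearranging or deleting words yields a list of audio spans to splice together. The `Word` type, the `edit_to_segments` function and the sample timings are all hypothetical illustrations, not anything Adobe has described.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """One transcribed word with its position in the recording (hypothetical type)."""
    text: str
    start_ms: int
    end_ms: int


def edit_to_segments(transcript, kept_indices):
    """Turn an edited word order into (start_ms, end_ms) spans to splice together.

    kept_indices lists positions in the original transcript, in the order the
    editor wants to hear them. Deleted words are simply left out.
    """
    segments = []
    for index in kept_indices:
        word = transcript[index]
        if segments and segments[-1][1] == word.start_ms:
            # This word directly follows the previous span in the recording,
            # so extend that span instead of creating a new cut.
            segments[-1] = (segments[-1][0], word.end_ms)
        else:
            segments.append((word.start_ms, word.end_ms))
    return segments


# Made-up timings for a four-word clip.
transcript = [
    Word("we", 0, 200),
    Word("really", 200, 600),
    Word("like", 600, 800),
    Word("podcasts", 800, 1300),
]

# Delete "really": keep words 0, 2 and 3.
deleted = edit_to_segments(transcript, [0, 2, 3])

# Move "podcasts" to the front.
reordered = edit_to_segments(transcript, [3, 0, 1, 2])
```

A real editor would also have to smooth the joins between spans, which this sketch ignores.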
A question of ethics
To do this, VoCo requires around twenty minutes of recorded speech, though Adobe hopes it will be able to reduce this requirement in the future. The software is able to take this voice data and deconstruct it into units of sound (phonemes), which it will then use to try to create an accurate voice model of the speaker.
This means that when you edit the person’s speech to add a brand new word, VoCo is able to either take a previous use of the word from the recording and reuse it, or build the word from the phonemes. Considering the software is still in its early stages, the results of doing this sound surprisingly natural, though not completely seamless just yet.
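The choice described above, reusing a recorded word when one exists and otherwise falling back to phonemes, can be sketched in a few lines. This is only an illustration of that decision, not Adobe's implementation: the `plan_word` function, the pronunciation lexicon and the phoneme labels are all assumptions made for the example, and actual voice synthesis is far more involved than a lookup.

```python
def plan_word(word, recorded_words, lexicon, phoneme_bank):
    """Decide how a newly typed word could be produced (hypothetical helper).

    recorded_words: set of words the speaker already said in the recording.
    lexicon: maps a word to its list of phonemes (assumed to be available).
    phoneme_bank: set of phonemes extracted from the speaker's recording.
    Returns ("reuse", [word]) or ("synthesize", [phoneme, ...]).
    """
    key = word.lower()
    if key in recorded_words:
        # The speaker already said this word, so the existing audio can be reused.
        return ("reuse", [key])

    phonemes = lexicon.get(key)
    if phonemes is None:
        raise KeyError(f"no pronunciation known for {word!r}")

    missing = [p for p in phonemes if p not in phoneme_bank]
    if missing:
        raise ValueError(f"recording lacks phonemes {missing} needed for {word!r}")

    # Every phoneme is available, so the word can be assembled from them.
    return ("synthesize", list(phonemes))


# Made-up sample data for illustration only.
recorded_words = {"the", "show", "starts", "today"}
lexicon = {"soon": ["S", "UW", "N"]}
phoneme_bank = {"S", "UW", "N", "T", "D", "EY"}

reused = plan_word("Today", recorded_words, lexicon, phoneme_bank)
built = plan_word("soon", recorded_words, lexicon, phoneme_bank)
```

In this sketch "today" is taken straight from the recording, while "soon" has to be assembled from three phonemes the speaker has already produced.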
This lack of perfection is perhaps a relief for those who see that the VoCo software, like Photoshop before it, has the potential to open a big can of ethical worms. It’s an ideal tool for audio engineers who work on voiceovers and podcasts, as it would allow them to make edits without having to call speakers back for re-recordings. However, there is always the risk that people with less legitimate intentions could make use of it.
Fortunately, Adobe has considered this and says that it’s already working on watermarking and detection software that should hopefully prevent any fraudulent use.
Adobe didn’t give any indication as to when Project VoCo might leave development and be integrated into any of the company’s existing voice software, but we’ll likely have to wait another few years, if it happens at all.