New AI tool recreates faces solely through voice data

As deepfake technology, from AI voices to celebrity lookalikes, becomes increasingly difficult to suss out online, a new tool has allowed researchers to recreate faces from voice recordings.

The era of deepfakes and artificial personas is steadily creeping up on us, one technological breakthrough at a time.

You may have seen uncanny TikTok accounts creating deepfake videos of celebrities such as Tom Cruise, or celebrity AI voice generators such as Uberduck. Now a new research tool developed at MIT recreates the face of a real person using nothing but their voice.

The results so far are fairly mixed. Some samples get ethnicity, gender, and face structure wrong, but there have been accurate ones that show promise for future use.

The algorithm is called Speech2Face and was part of a research paper first published in 2019. A demo is available online if you’re curious to check it out for yourself.

Faces seem to be more accurately recreated with longer audio clips, which shouldn’t come as much of a surprise. The model was trained on millions of videos from YouTube, learning ‘audio-visual and voice-face correlations’ from a wide range of samples.

It’s still a work in progress, of course, so it isn’t completely on point every time. The potential for a system that registers voices and identifies individuals quickly could be huge, particularly within legal systems and surveillance companies.

Researchers behind the tech are adamant that it is only for scientific purposes, but we already know that larger companies (Facebook, Google, Amazon, and a bunch more) are very interested in advanced Metaverse programmes, Web 3.0, and harvesting user data. An ability to identify anyone quickly like this could be devastating in the wrong hands.

DIY Photography also points out that software like this could put the identities of influencers at risk, especially those who keep their faces hidden. TikTokers or YouTubers who make a deliberate effort to mask their identity could be discovered through audio snippets of their voices, taken from any clip they’ve ever posted.

Still, that’s likely far off in the future, as the algorithm is primitive at present. It seems we’ll have to accept a future where AI and deepfake technology blur the line between real and artificial, with misinformation likely to remain rampant and harder to stamp out.

Detecting identities through brief voice clips is simply another step along an inevitable path. Let’s just hope things don’t spiral out of control.
