
Is voice AI on the verge of a marketing explosion?

AI speech synthesisers may feel like a novel gimmick now, but as the technology becomes more sophisticated, could we see a full-scale integration into the marketing world?

Apparently, AI audio synthesisers are about to get a whole load more sophisticated than Eminem’s rap generator. Like that’s even possible.

Despite the many ethical issues surrounding the recreational use of AI and its potential for nefarious uses – with non-consensual deepfake porn, doctored political misinformation, and modified satellite imagery among the main offenders of 2021 – there are exciting possibilities for it to break into mainstream industries sooner than expected.

On that front, all evidence points towards entertainment as the most promising avenue for the technology.

Only recently, we’ve heard of Spotify’s patent for machine learning which will use audio cues in our environment to recommend music based on our moods, and production house Lucasfilm hiring online deepfake artist ‘Shamook’ to help improve its visual effects department.

https://youtu.be/yK-l4gz4rUU

While a year or two ago, taking a blockbuster movie and ‘improving’ its CGI would likely have led to a cease-and-desist order from its creators, there seems to be a growing acceptance that the technology will become part and parcel of our lives.

As more people come around to that notion, there is a feeling among AI experts that the tech may next target the advertising industry. Just imagine synthesised celebrity voices popping up on ad placements or radio idents.


How voice AI works

Much like visual deepfakes, voice AI (or voice synthesis) uses machine learning systems to pull a scattered record of someone’s voice from multiple data sources.

This collection of raw audio is then run through an algorithm, which uses synthesisers to splice it all together and form a sentence input by the user.
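As a rough illustration of the splicing idea described above, here is a toy sketch in Python. Everything in it is an assumption made for the example: `CLIPS` is a hypothetical library of pre-recorded words, each stored as a short list of made-up audio samples, and `splice_sentence` is an invented helper. Real voice cloning tools are far more sophisticated than this and typically rely on trained machine learning models rather than simple joins.

```python
# Toy illustration only, not how any real voice cloning product works.
# CLIPS is a hypothetical library: one recorded clip (a list of samples) per word.
CLIPS = {
    "buy": [0.1, 0.3, -0.2],
    "this": [0.5, -0.4],
    "now": [0.2, 0.2, 0.1, -0.1],
}


def splice_sentence(sentence, clips, gap=2):
    """Join the recorded clip for each word, with `gap` silent samples between words."""
    words = sentence.lower().split()

    # Refuse to guess: every word must already exist in the recorded library.
    missing = [w for w in words if w not in clips]
    if missing:
        raise KeyError(f"no recording for: {', '.join(missing)}")

    audio = []
    for i, word in enumerate(words):
        if i > 0:
            audio.extend([0.0] * gap)  # short pause between words
        audio.extend(clips[word])
    return audio


if __name__ == "__main__":
    samples = splice_sentence("Buy this now", CLIPS)
    print(len(samples), "samples")
```

The limitation is obvious: a word that was never recorded cannot be spoken, which is one reason the more convincing tools learn a model of the voice instead of stitching clips together.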

If you’ve yet to waste hours making Yoda or David Attenborough spout nonsense, we’ll wait here whilst you toy around with one of the many free bot programs online. Failing that, check out this synthesised Eminem diss track aimed at Mark Zuckerberg.

Like the example above, the majority of voice clone material online is either spoof-related or merely an exercise of the technology, though that’s not to say that it hasn’t already made an appearance in serious commercial projects.

Back in July, a documentary called Roadrunner used voice AI to recreate the voice of chef Anthony Bourdain and have it speak lines he’d written before his death in 2018. As you can imagine, this didn’t go down very well with a big chunk of its viewers.

Further controversy was drummed up a month later, when actor Val Kilmer worked with a company called Sonantic to recreate his voice as it sounded before the tracheostomy he underwent for throat cancer in 2014.

While many praised the technology in the case of Kilmer, Roadrunner was largely viewed as exploitative – particularly as the documentary failed to disclose the use of voice synthesis at all.


A lucrative future for celebs and influencers

The feeling towards mainstream use of synthesised voices, and deepfakes in general, is far from unanimous, but there is clear interest from both celebrities and companies in licensing their use, much in the same way as image rights.

Recognising this, a company called Veritone launched a service earlier this year allowing influencers, athletes, and actors to sell their virtual audio rights for endorsements.

In essence, this allows celebs and influencers to make revenue without having to physically travel to a recording studio or venue, while a paying client reaps the benefits of having their voice on cue.

I’m sure contracts will be more bulletproof than that, but you get the gist.

So long as the talent is happy renting out a simulacrum of themselves, there will almost certainly be future opportunities for big names to cash in.

Bruce Willis, for example, has already licensed his image to be used as a deepfake in Russian mobile phone ads. Making that fact even more dystopian, we’re talking young Willis straight out of the Die Hard era.

In the here and now, applications like Veritone are few and far between, but voice synthesis is already being utilised by podcasts.

One company in this space, Descript, has created a feature called ‘Overdub’ which allows podcasters to synthesise their own voices. This way, shows and transcripts can be seamlessly edited on the fly.

Talk of any inauthentic content will always be met with concern and criticism, but that isn’t stopping industries from coming around to the idea of AI. Meanwhile, the technology is becoming more advanced and harder to detect by the day.

It’ll be interesting to see who jumps first at these opportunities. I’d be a lot more willing to buy PPI if Patrick Stewart said so.
