AI speech synthesisers may feel like a novel gimmick now, but as the technology becomes more sophisticated, could we see a full-scale integration into the marketing world?
Apparently, AI audio synthesisers are about to get a whole load more sophisticated than Eminem’s rap generator. Like that’s even possible.
Despite the many ethical issues surrounding the recreational use of AI and its potential for nefarious uses – with non-consensual deepfake porn, doctored political misinformation, and modified satellite imagery among the main offenders of 2021 – there are exciting possibilities for it to break into mainstream industries sooner than expected.
On that front, all evidence points towards entertainment as the most promising avenue for the technology.
Only recently, we’ve heard of Spotify’s patent for machine learning which will use audio cues in our environment to recommend music based on our moods, and production house Lucasfilm hiring online deepfake artist ‘Shamook’ to help improve its visual effects department.
While a year or two ago, taking a blockbuster movie and ‘improving’ its CGI would likely have led to a cease-and-desist order from its creators, there seems to be a growing acceptance that the technology will become part and parcel of our lives.
As more come around to that notion, there is a feeling among AI experts that the tech may next target the advertising industry. Just imagine synthesised celebrity voices popping up on ad placements or radio idents.
How voice AI works
Much like visual deepfakes, voice AI (or voice synthesis) uses machine learning systems to pull a scattered record of someone’s voice from multiple data sources.
This collection of raw audio is then run through an algorithm, which uses synthesisers to splice it all together and form a sentence input by the user.
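To make the splicing idea concrete, here is a toy sketch in Python. It is not how any particular product works: many current systems generate speech with neural networks rather than literally stitching recordings together. The names `voice_bank`, `crossfade` and `splice_sentence` are made up for illustration, and the "audio" is just short lists of numbers standing in for recorded samples.

```python
from typing import Dict, List

Clip = List[float]  # mono audio samples, assumed to lie in the range [-1.0, 1.0]


def crossfade(a: Clip, b: Clip, overlap: int) -> Clip:
    """Join two clips, blending the tail of `a` into the head of `b`."""
    overlap = min(overlap, len(a), len(b))
    if overlap == 0:
        return a + b
    blended = []
    for i in range(overlap):
        weight = (i + 1) / (overlap + 1)  # ramps from mostly `a` to mostly `b`
        blended.append(a[len(a) - overlap + i] * (1 - weight) + b[i] * weight)
    return a[:len(a) - overlap] + blended + b[overlap:]


def splice_sentence(sentence: str, voice_bank: Dict[str, Clip], overlap: int = 2) -> Clip:
    """Build a sentence by splicing together one stored clip per word."""
    words = sentence.lower().split()
    missing = [w for w in words if w not in voice_bank]
    if missing:
        raise KeyError(f"no recorded clip for: {missing}")
    output: Clip = []
    for word in words:
        clip = voice_bank[word]
        output = crossfade(output, clip, overlap) if output else list(clip)
    return output


# Hypothetical voice bank: in reality each entry would be a real recording.
voice_bank = {"hello": [0.1] * 10, "world": [0.2] * 8}
audio = splice_sentence("Hello world", voice_bank, overlap=2)
```

The short crossfade is there because butting two recordings directly against each other tends to produce an audible click at the join.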
If you’ve yet to waste hours making Yoda or David Attenborough spout nonsense, we’ll wait here whilst you toy around with one of many free bot programs online. Failing that, check out this Eminem synthesised Mark Zuckerberg diss track.
Like the example above, the majority of voice clone material online is either spoof related or merely an exercise of the technology, though that’s not to say that it hasn’t already made an appearance in serious commercial projects.
Back in July, a documentary called Roadrunner used voice AI to recreate the voice of chef Anthony Bourdain, having it speak lines he’d written before his death in 2018. As you can imagine, this didn’t go down very well with a big chunk of its viewers.
Further controversy was drummed up a month later, when actor Val Kilmer worked with a company called Sonantic to recreate his voice as it sounded before a tracheostomy in 2014, carried out during his treatment for throat cancer.
While many praised the technology in the case of Kilmer, Roadrunner was largely viewed as exploitative – particularly as the documentary failed to disclose the use of voice synthesis at all.