Not a week goes by without malicious digital forgery making the headlines. In January alone, a robocall faking Joe Biden's voice was sent to 25,000 mobiles in New Hampshire, urging recipients not to vote in the primary election, and a fake video superimposing Taylor Swift's likeness onto a porn scene went viral. In Hong Kong, meanwhile, police reported that an employee at a finance company had been tricked into transferring $25m to fraudsters on a video call populated by digitally rendered fakes of his chief financial officer and colleagues.
These are just some examples of what are called deepfakes, which are made using audio manipulation and facial mapping technology to replace one person's likeness and voice with another's. The "deep" refers to the many layers of processing used in "deep learning", a type of machine learning based on generative neural networks.
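For readers curious about the mechanics, the trick behind early face-swap deepfakes can be sketched in a few lines of code: a single shared encoder learns features common to two faces, while a separate decoder is trained for each identity, and the swap happens when a photo of one person is decoded as the other. The PyTorch sketch below is illustrative only; the class names, layer sizes and 64x64 image dimensions are assumptions for the example, not anyone's production system.

```python
# A minimal sketch (assumed architecture, not production code) of the
# shared-encoder / two-decoder autoencoder behind early face swaps.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Shared across both identities: learns identity-agnostic features."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 64 * 3, 512), nn.ReLU(),
            nn.Linear(512, 128),
        )
    def forward(self, x):
        return self.net(x)

class Decoder(nn.Module):
    """One per identity: reconstructs that person's face from features."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(128, 512), nn.ReLU(),
            nn.Linear(512, 64 * 64 * 3), nn.Sigmoid(),
        )
    def forward(self, z):
        return self.net(z).view(-1, 3, 64, 64)

encoder = Encoder()
decoder_a, decoder_b = Decoder(), Decoder()  # trained on faces A and B

# Training reconstructs each person through their own decoder; the swap
# happens at inference: encode a photo of A, decode with B's decoder.
face_a = torch.rand(1, 3, 64, 64)   # stand-in for a real photo of person A
fake = decoder_b(encoder(face_a))   # person A's expression, person B's face
```

Because the encoder is shared, it captures pose and expression rather than identity, which is why the swapped output moves like the original subject.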
The word first appeared in writing in 2017 as the plural "deepfakes", the username of a Reddit user who had pasted the face of the actor Gal Gadot onto a porn video. The malicious use of deepfakes was soon reported by the Vice journalist Samantha Cole, in an article titled "AI-Assisted Fake Porn Is Here and We're All Fucked".
But like most new words, “deepfake” was spoken long before it was written. In 2015, I taught a class at Stanford University on words and language. I asked the students to spend a week recording new words they heard on campus. I expected them to collect 40; they came back with 340. Many of them were popular slang terms—simp, slaps, cuck—or words relating to sexual identity, such as demisexual. The word “deepfake” stood out on the list because, the students said, it was not used by them, but by middle-aged male tech workers in Silicon Valley who wanted to undermine the authority of their female colleagues. They did this by secretly photographing them and superimposing their faces on porn videos, which they then watched on their laptops as those same female colleagues gave presentations.
Initially, deepfakes were expensive to create, requiring expertise in sophisticated AI techniques. This put them beyond general use, and "fake deepfakes" soon appeared: poor-quality videos thrown together with inexpensive editing software, or simply by varying the playback speed of a clip, with no help from AI. They became known as dumbfakes, shallowfakes and cheapfakes.
Technology has since advanced to such an extent that anyone can now produce a deepfake using any of a myriad of websites designed for that purpose. The challenge is to spot them before they go viral, but it may not be long before neither humans nor machines can distinguish a deepfake from the real thing. In that scenario, we might need another new term: deepfake singularity.