Voice-mimicking software reportedly used in a major theft

Thieves used voice-mimicking software to imitate a company executive’s speech and dupe his subordinate into sending hundreds of thousands of dollars to a secret account, the company’s insurer said, in a remarkable case that some researchers are calling one of the world’s first publicly reported artificial-intelligence heists.

The managing director of a British energy company, believing his boss was on the phone, followed orders one Friday afternoon in March to wire more than $240,000 to an account in Hungary, said representatives from the French insurance giant Euler Hermes, which declined to name the company.

The request was rather strange, the director noted later in an email, but the voice was so lifelike that he felt he had no choice but to comply. The insurer, whose case was first reported by the Wall Street Journal, provided new details on the theft to The Washington Post on Wednesday, including an email from the employee tricked by what the insurer is referring to internally as “the false Johannes.”

Now being developed by a wide range of Silicon Valley titans and AI start-ups, such voice-synthesis software can copy the rhythms and intonations of a person’s voice and be used to produce convincing speech. Tech giants such as Google and smaller firms such as the “ultrarealistic voice cloning” start-up Lyrebird have helped refine the realism of the resulting fakes and made the tools more widely available, free for unlimited use.

New technologies and eroding trust

But the synthetic audio and AI-generated videos, known as deepfakes, have fueled growing anxieties over how the new technologies can erode public trust, empower criminals and make traditional communication — business deals, family phone calls, presidential campaigns — that much more vulnerable to computerized manipulation.

“Criminals are going to use whatever tools enable them to achieve their objectives cheapest,” said Andrew Grotto, a fellow at Stanford University’s Cyber Policy Center and a senior director for cybersecurity policy at the White House during the Obama and Trump administrations.

“This is a technology that would have sounded exotic in the extreme 10 years ago, now being well within the range of any lay criminal who’s got creativity to spare,” Grotto added.

Developers of the technology have pointed to its positive uses, saying it can help humanize automated phone systems and help mute people speak again. But its unregulated growth has also sparked concern over its potential for fraud, targeted hacks and cybercrime.

At least three voice-mimicking frauds

Researchers at the cybersecurity firm Symantec said they have found at least three cases of executives’ voices being mimicked to swindle companies. Symantec declined to name the victim companies or say whether the Euler Hermes case was one of them, but it noted that the losses in one of the cases totaled millions of dollars.

The systems work by processing a person’s voice and breaking it down into components, like sounds or syllables, that can then be rearranged to form new phrases with similar speech patterns, pitch and tone. The insurer did not know which software was used, but a number of the systems are freely offered on the Web and require little sophistication, speech data or computing power.
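For readers curious what such a “component” breakdown might look like, the following is a minimal, purely illustrative sketch in Python using the open-source librosa audio library. The file name, sample rate and thresholds are assumptions chosen for demonstration; real voice-cloning systems rely on trained neural networks to generate entirely new speech, and nothing below reflects the particular software used in the fraud.

```python
# Illustrative sketch only: slice a voice recording into short segments and
# summarize the pitch and timbre of each, echoing the "components" idea above.
import numpy as np
import librosa

# Load a speech recording (hypothetical file path), resampled to 16 kHz.
audio, sample_rate = librosa.load("sample_voice.wav", sr=16000)

# Split the waveform into non-silent segments, a crude stand-in for the
# sounds or syllables a real system would model.
segments = librosa.effects.split(audio, top_db=30)

units = []
for start, end in segments:
    chunk = audio[start:end]
    if len(chunk) < 2048:  # skip fragments too short to analyze
        continue
    # Estimate pitch and compute MFCCs, rough proxies for the speech
    # patterns, pitch and tone that cloning systems learn to reproduce.
    pitch = librosa.yin(chunk, fmin=80, fmax=300, sr=sample_rate)
    timbre = librosa.feature.mfcc(y=chunk, sr=sample_rate, n_mfcc=13)
    units.append({
        "samples": chunk,
        "median_pitch_hz": float(np.median(pitch)),
        "timbre_profile": timbre.mean(axis=1),
    })

print(f"Extracted {len(units)} speech units from the recording")
```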

Lyrebird, for instance, advertises “the most realistic artificial voices in the world” and allows anyone to create a voice-mimicking “vocal avatar” by uploading at least a minute of real-world speech.

Adjusting to a fast-developing technology

The company, which did not respond to requests for comment, has defended releasing the software widely, saying it will help acclimate people to the new reality of a fast-improving and inevitable technology so that society can adapt. In an ethics statement, the company wrote, “Imagine that we had decided not to release this technology at all. Others would develop it and who knows if their intentions would be as sincere as ours.”

Saurabh Shintre, a senior researcher who studies such adversarial attacks in Symantec’s California-based research lab, said the audio-generating technology has in recent years made transformative progress because of breakthroughs in how the algorithms process data and compute results. The amount of recorded speech needed to train the voice-impersonating tools to produce compelling mimicries, he said, is also shrinking rapidly.

The technology is imperfect, and some of the faked voices wouldn’t fool a listener in a calm, collected environment, Shintre said. But in some cases, thieves have employed methods to explain the quirks away, saying the fake audio’s background noises, glitchy sounds or delayed responses are the result of the speaker’s being in an elevator or car or in a rush to catch a flight.

Pressure: an age-old tactic of deception

Beyond the technology’s capabilities, the thieves have also depended on age-old scam tactics to boost their effectiveness, using time pressure, such as an impending deadline, or social pressure, such as a desire to appease the boss, to make the listener move past any doubts. In some cases, criminals have targeted the financial gatekeepers in company accounting or budget departments, knowing they may have the capability to send money instantly.

“When you create a stressful situation like this for the victim, their ability to question themselves for a second — ‘Wait, what the hell is going on? Why is the CEO calling me?’ — goes away, and that lets them get away with it,” Shintre said.

Euler Hermes representatives said the company, the British subsidiary of a German energy firm, contacted law enforcement, but no suspects have yet been identified.

The insurer, which sells policies to businesses covering fraud and cybercrime, said it is covering the company’s full claim.

The victim director was first called late one Friday afternoon in March, and the voice demanded he urgently wire money to a supplier in Hungary to help the company avoid late-payment fines. The fake executive referred to the director by name and sent the financial details by email.

The fraud unravels

The director and his boss had spoken directly a number of times, said Euler Hermes spokeswoman Antje Wolters, who noted that the call was not recorded. “The software was able to imitate the voice, and not only the voice: the tonality, the punctuation, the German accent,” she said.

After the thieves made a second request, the director grew suspicious and called his boss directly. Then the thieves called back, unraveling the ruse: “The fake Johannes was demanding to speak to me whilst I was still on the phone to the real Johannes!” the director wrote in an email the insurer shared with The Post.

The money, totaling 220,000 euros, was funneled through accounts in Hungary and Mexico before being scattered elsewhere, Euler Hermes representatives said. No suspects have been named, the insurer said, and the money has disappeared.

“There’s a tension in the commercial space between wanting to make the best product and considering the bad applications that product could have,” said Charlotte Stanton, the director of the Silicon Valley office of the Carnegie Endowment for International Peace. “Researchers need to be more cautious as they release technology as powerful as voice-synthesis technology, because clearly it’s at a point where it can be misused.”