Microsoft VALL-E AI Can Clone Your Voice From 3-Second Audio Clip


  • Microsoft announced it is working on a text-to-speech artificial intelligence tool.
  • VALL-E can clone someone’s voice from a 3-second audio clip and use it to synthesize other words.
  • The news comes as the tech giant reportedly plans to invest $10 billion in OpenAI, the maker of the writing tool ChatGPT.

Microsoft, which reportedly plans to invest $10 billion in ChatGPT maker OpenAI, is working on an artificial intelligence tool called VALL-E that can clone someone’s voice from a three-second audio clip.

VALL-E, trained with 60,000 hours of English speech, is capable of mimicking a voice in “zero-shot scenarios”, meaning the AI tool can make a voice say words it has never heard the voice say before, according to a paper in which the developers introduced the tool.

VALL-E uses text-to-speech technology to convert written words into “high-quality personalized” speech, according to the 16-page paper.

The researchers used recordings of more than 7,000 real speakers from LibriLight, an audiobook dataset made up of public-domain texts read by volunteers, to conduct their sampling. The tech giant released samples of how VALL-E would work, showcasing how the voice of a speaker is cloned.

The AI tool is not currently available for public use and Microsoft hasn’t made it clear what its intended purpose is. 

Sharing their findings on the academic site arXiv, the researchers said the results so far showed that VALL-E “significantly outperforms” the most advanced systems of its kind, “in terms of speech naturalness and speaker similarity.”

But they pointed out the lack of diversity of accents among speakers, and that some words in the synthesized speech were “unclear, missed, or duplicated.”

They also included an ethical warning about VALL-E and its risks, saying the tool could be misused, for example in “spoofing voice identification or impersonating a specific speaker”.

“To mitigate such risks, it is possible to build a detection model to discriminate whether an audio clip was synthesized by VALL-E,” the developers wrote in the paper. They didn’t give details of how this could be done.

They added that “if the model is generalized to unseen speakers in the real world, it should include a protocol to ensure that the speaker approves the use of their voice.”

Meanwhile, Microsoft announced Monday it will make OpenAI’s ChatGPT available through its own services and is reportedly in talks to invest $10 billion in OpenAI, the company behind the AI writing tool.

While ChatGPT has inspired creative uses, such as a man who wrote a children’s book with it in one weekend, it has also raised concerns about whether the tool can be trusted.

Microsoft didn’t immediately respond to Insider’s request for comment.

Correction: January 19, 2023 — An earlier version of this story misstated the organisation that published the paper about VALL-E. It was published by researchers for Microsoft on the academic site arXiv.



Source link: https://www.businessinsider.com/microsoft-chatgpt-vall-e-valle-voice-text-clone-listen-clip
