Kadho, a company building automatic speech recognition technology to help children communicate with voice-powered devices, is officially exiting stealth today at TechCrunch Disrupt SF 2018 where it’s launching its new technology, Kidsense Edge voice A.I. The company claims its technology can better decode kids’ speech as it was built using speech data from 150,000 children’s voices. The COPPA-compliant solution, which is initially targeting the voice-enabled devices and voice-enabled toys market, is already being used by paying customers.
As anyone with an Echo smart speaker or Google Home can tell you, today’s devices often struggle to understand children’s voices. That’s because current automatic speech recognition technology has been built for adults and was trained on adult voice data.
Kidsense.ai, meanwhile, was built for kids using voices of children from different age groups and speaking different languages. As a result, the company says it can outperform the big players in the market, like Google, Samsung, Baidu, Amazon, and Microsoft, when it comes to understanding children’s speech.
The company behind the Kidsense AI technology, Kadho, has been around since 2014, and was originally founded by PhDs with backgrounds in A.I. and neuroscience, Kaveh Azartash (CEO) and Dhonam Pemba (Chief Scientist). Chief Revenue Officer, Jock Thompson, is a third co-founder today.
Initially, the company’s focus was on building conversational-based language learning applications for kids.
“But the biggest pain point that we encountered…was that the devices that we were using or apps on – either mobile phones, tablets, robotics, or smart speakers – they’re not built to understand kids,” explains Azartash. He means the speech recognition technology wasn’t built on kids’ data. “They’re not designed to communicate or understand kids.”
The team realized there was a bigger problem to solve. Teaching kids a new language using conversational techniques couldn’t work until devices could actually understand the kids. The company shifted its focus to speech recognition technology, using a data set of kids’ voices (collected with parents’ consent, we’re told) to build Kidsense.
The initial product, which arrived in late 2017, was a server-based solution called Kidsense cloud AI. But more recently, the company has been working on an embedded version of the same platform, where no audio data from kids is collected, and no data is sent to cloud-based servers. This allows the solution to be both COPPA and GDPR-compliant.
This also means it could address the needs of device makers who have previously come under fire for their less-than-secure toys and robotics, like Mattel’s Hello Barbie, or its canceled A.I. speaker Aristotle. The idea today is that toy makers, smart speaker manufacturers, and others catering to the kids’ market will need to be compliant with more stringent privacy laws and, to do so, the processing has to be done on the device, not in the cloud.
“All the decoding, all the processing is done on the device,” says Azartash. “So we’re able to offer better efficacy and better accuracy in converting speech to text…the technology does not send any speech data to the server.”
“We’ve figured out how to put this all onto the device in an efficient way using minimal processing power,” adds Thompson. “And because we’re embedded we can charge a flat fee depending on the product anywhere to a subscription model.”
For example, a toy company working with thin margins on a product with a really small lifespan might want a flat fee. But another company may have a product with a longer lifespan that they charge their own customers for on subscription. They may want to be able to update their product’s voice tech capabilities over-the-air. That’s also possible here.
The company says its technology is in several toys, robotics, and A.I. speaker products around the world, but some of its customers are under NDA.
It’s also testing its technology with chip makers and big-name kids’ brands here in the U.S.
On stage, the company also showed off its latest development – dual language speech recognition technology. This is the first technology that can decode two languages in one sentence, when spoken by kids. This is an area smart speakers and their related voice technology are only now entering, and only within the adult market. For example, Google Assistant is preparing to become multilingual in English, French and German this year.
Currently, the company has approximately $1.2 million in revenue from customers on annual contracts and its SaaS model. It’s been operating in stealth mode, but is now preparing to reach more customers.
To date, Kadho has raised $2.5 million from investors including Plug and Play Tech Center, Beam Capital, Skywood Capital, SFK Investment, Sparks Lab, and other angel investors. It’s preparing to raise an additional $3 million before moving to a Series A.