After decades of jokes about her intelligence, Barbie is about to become one of the smartest dolls in the world.
Later this year, toymaker Mattel is planning to introduce Hello Barbie, a version of the doll embedded with a speech recognition system called PullString. The technology, developed by San Francisco-based ToyTalk, runs on a custom-built artificial intelligence engine tuned to the vocal tics, unusual inflections, and varied interests of children. When kids ask their new Barbie a question, she will actually listen, and answer. "There are thousands and thousands of lines Barbie can speak, leading to really meaningful and complex conversations," says ToyTalk Chief Technical Officer Martin Reddy.
Hello Barbie will have some competition in the intelligent toy space. Startup Elemental Path is developing a similar conversational technology, currently embedded in a fluorescent dinosaur and powered by IBM’s Watson computing system. Whether or not the friendly dino outsmarts the iconic blonde, the core systems behind the pair could lead to a new generation of interactive toys.
The two toys work on the same basic principles. Reddy notes that his company could not simply import the speech recognition technology that powers applications like Siri and Cortana, which was trained on adult voices. "No one has ever built models for children," says Reddy. "They have a higher pitch. They may stutter and repeat things. The vocabulary of kids is different."
As a result, ToyTalk had to build a library of child-spoken words and phrases, and to collect that data in accordance with privacy regulations. After years of effort, and thanks in part to its interactive online games and apps, ToyTalk now has tens of millions of utterances on file.
When a child presses a button on Hello Barbie, a microphone inside the doll will begin recording audio and start streaming that clip, via Wi-Fi, to the cloud. Immediately, ToyTalk’s audio recognition engine attempts to guess which words the child seems to be saying. These possibilities are then fed into a language engine that matches the candidates with its own library of words, phrases, questions, and statements. The choices are refined as the child continues speaking. Once the system has a probable match, ToyTalk’s artificial intelligence engine picks an appropriate response from a curated list and streams it to the Barbie, which plays the phrase back to the child.
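To make the matching step concrete, here is a minimal Python sketch of how candidate transcriptions might be scored against a phrase library and mapped to a scripted reply. It is an illustration under stated assumptions, not ToyTalk's implementation: the `CURATED` phrases, the `pick_response` function, the confidence values, and the 0.6 threshold are all hypothetical, and plain string similarity stands in for whatever the real language engine does.

```python
from difflib import SequenceMatcher

# Hypothetical curated library: known phrase -> scripted reply (made up for illustration).
CURATED = {
    "what is your favorite color": "I love pink! What's yours?",
    "do you like music": "I love music! What do you like to sing?",
    "what is your name": "My name is Barbie. What's yours?",
}
FALLBACK = "Hmm, tell me more!"


def similarity(a: str, b: str) -> float:
    """Rough textual similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a, b).ratio()


def pick_response(candidates, threshold=0.6):
    """Pick a scripted reply for the best-matching candidate.

    candidates: list of (transcription, recognizer_confidence) pairs,
    standing in for the recognizer's guesses about what the child said.
    """
    best_score, best_phrase = 0.0, None
    for text, confidence in candidates:
        for phrase in CURATED:
            # Combine how sure the recognizer is with how well the text fits the library.
            score = confidence * similarity(text.lower(), phrase)
            if score > best_score:
                best_score, best_phrase = score, phrase
    if best_phrase is None or best_score < threshold:
        return FALLBACK
    return CURATED[best_phrase]


guesses = [("what is your favorite color", 0.9), ("what is your flavor it collar", 0.4)]
print(pick_response(guesses))
```

When no candidate clears the threshold, the sketch falls back to a generic prompt rather than guessing.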
Delivering a sensible and engaging response is only part of the challenge; Reddy explains these toys also need to be fast. If Barbie were to wait too long before answering, the conversation would not feel natural. To avoid a delay, PullString starts processing the streamed audio right after a child begins speaking. "As you’re talking, we’re recognizing, and generating partial hypotheses," Reddy explains. "'I haven’t heard everything, but I think you’re saying this so far.' Then when you take your finger off the button, we send the last bit of data and finish off the hypothesis." The ideal response is selected, and Barbie responds a mere half-second after the child has finished speaking.
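The idea of partial hypotheses can be sketched in a few lines of Python. This is a simplified illustration, not PullString itself: the `StreamingMatcher` class and its methods are hypothetical, and it narrows a list of known phrases by word prefix, whereas a real recognizer works on streamed audio and probabilities.

```python
class StreamingMatcher:
    """Narrows a set of known phrases as recognized words arrive in chunks."""

    def __init__(self, phrases):
        self.phrases = [p.lower().split() for p in phrases]
        self.words = []  # everything heard so far

    def partial_hypotheses(self):
        """Phrases still consistent with the words heard so far."""
        n = len(self.words)
        return [" ".join(p) for p in self.phrases if p[:n] == self.words]

    def feed(self, chunk):
        """Add newly recognized words while the button is still held down."""
        self.words.extend(chunk.lower().split())
        return self.partial_hypotheses()

    def finish(self, last_chunk=""):
        """Button released: take the last words and settle on a final match, if any."""
        self.words.extend(last_chunk.lower().split())
        heard = " ".join(self.words)
        exact = [p for p in self.partial_hypotheses() if p == heard]
        return exact[0] if exact else None


matcher = StreamingMatcher(["why is the sky blue", "why is grass green", "what is your name"])
print(matcher.feed("why is"))    # two phrases still possible
print(matcher.feed("the sky"))   # narrowed to one
print(matcher.finish("blue"))    # final hypothesis
```

Because most of the narrowing happens while the child is still talking, only the final chunk remains to be processed once the button is released.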
Elemental Path is building its own audio library, while also drawing on content from additional sources. Elemental is more focused on educational play, so a child can even ask the dinosaur why the sky is blue. The system will draw from its content library to generate a response, and while the dinosaur’s answer today is highly technical, Elemental co-founder Donald Coolidge explains that the team is crafting its content to be kid-friendly and age-appropriate. Eventually, a four-year-old might get a different answer to that kind of scientific question than a seven-year-old would.
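One simple way to picture age-tiered answers is a lookup keyed by question and minimum age. The Python sketch below is purely hypothetical: the `ANSWERS` table, the age cutoffs, the wording of the answers, and the `answer_for` function are assumptions for illustration and do not describe Elemental Path's system.

```python
# Hypothetical content table: question -> list of (minimum_age, answer), sorted by age.
ANSWERS = {
    "why is the sky blue": [
        (0, "Sunlight bounces around in the air, and the blue part bounces the most."),
        (6, "Air scatters blue light from the sun more than red light, so the sky looks blue."),
    ],
}


def answer_for(question, age):
    """Return the answer written for the oldest tier the child has reached."""
    tiers = ANSWERS.get(question)
    if not tiers:
        return None  # no content for this question yet
    chosen = tiers[0][1]
    for min_age, text in tiers:
        if age >= min_age:
            chosen = text
    return chosen


print(answer_for("why is the sky blue", 4))
print(answer_for("why is the sky blue", 7))
```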
Still, Coolidge observes it would be unwise to try to answer every possible query. "Kids ask crazy questions," he says. "Recently I heard a child ask, ‘why is a goat a goat and a chicken a chicken?’ Imagine a machine trying to answer that." Instead of delving into philosophy, the dinosaur might respond, "Hmmm, that’s a good question. Why don’t you ask me that later?" Such questions are flagged in Elemental’s system; if they come up often enough, the team will craft an answer and add that content to the system.
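The flag-and-review loop Coolidge describes can be sketched as a counter of unanswered questions. This is a minimal illustration, not Elemental's code: the `QuestionLog` class, its method names, the normalization step, and the review threshold are all hypothetical.

```python
from collections import Counter

DEFLECTION = "Hmmm, that's a good question. Why don't you ask me that later?"


class QuestionLog:
    """Answers known questions, deflects unknown ones, and counts the deflections."""

    def __init__(self, answers, review_threshold=3):
        self.answers = dict(answers)
        self.unanswered = Counter()
        self.review_threshold = review_threshold  # assumed cutoff for "often enough"

    @staticmethod
    def normalize(question):
        """Lowercase, trim end punctuation, and collapse spaces so repeats are counted together."""
        return " ".join(question.lower().strip(" ?!.").split())

    def ask(self, question):
        key = self.normalize(question)
        if key in self.answers:
            return self.answers[key]
        self.unanswered[key] += 1  # flag the question for the content team
        return DEFLECTION

    def needs_review(self):
        """Questions asked often enough that someone should write an answer."""
        return [q for q, n in self.unanswered.most_common() if n >= self.review_threshold]

    def add_answer(self, question, answer):
        key = self.normalize(question)
        self.answers[key] = answer
        self.unanswered.pop(key, None)


log = QuestionLog({}, review_threshold=2)
log.ask("Why is a goat a goat?")
log.ask("why is a goat a goat")
print(log.needs_review())
```

Once a flagged question has an answer added, it drops off the review list and later askers get the new content instead of the deflection.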
The technologies behind the dinosaur, or Hello Barbie, could be used in any number of toys, and Coolidge says growth will only improve their accuracy. "The more people use it, the better it’s going to become," he says.
In some ways, though, the toys may already be smart enough to pass for intelligent beings. Consider what happens when Elemental’s dinosaur is told it sounds like a certain green, diminutive Jedi master. "That’s a compliment," the friendly toy replies in its slightly crackling, gruff tone. "Wise he is. Yes."
Gregory Mone is a Boston, MA-based writer and the author of the novel Dangerous Waters.