Whether you’re in Asia or in America, whether you are sitting on your toilet or opening your refrigerator, it will soon be completely normal to discuss household matters with your home appliances via voice assistants such as Google Home, Kakao Mini, Galaxy Smart Home, Amazon Echo and their counterparts.
“OK, Google …” “Hey Kakao …” “Hi Bixby …” “How’s the weather today?”
They can already turn the lights on and off, adjust the temperature or even flush your toilet.
One signpost to this future is likely to be high-tech early-adopter South Korea. According to a study from KT Group affiliate Nasmedia Co, about 40% of the country’s 20 million households will likely have an AI speaker – a digital assistant that uses voice recognition software as its user interface – in 2019.
The country is already well on the way. Despite its modest population of 51 million, South Korea was, according to a study by Canalys, the third biggest market for artificial intelligence (AI) speakers in the world in 2018 with 8%, following demographic giants the US with 46% and China with 20%. This number is expected to rise once popular local app Kakao gets into cars or Google Assistant partners with Samsung and LG to offer talking smart TVs.
Hard to imagine? Well, a century ago who would have assumed it would be feasible to communicate with machines via keyboards or touch screens? Who foresaw the globalized success of social media or messenger apps like Facebook, WeChat or Snapchat?
Researchers are preparing the next step. AI-powered devices take many forms – from conversational software on a computer to complex robots – and so do their interfaces. Scientists have already designed AI that could answer non-verbal commands.
The language tree
“We think chatbots are stupid boxes that answer our questions, but they have a social presence,” explained David J Gunkel, a Northern Illinois University lecturer and the author of Robot Rights, which raised the question of robots’ social rights. “We teach children to say ‘thank you’ to Alexa or be polite to it. Those details can be viewed as inconsequential, but they’re important. Chatbots are a technology that’s very social.”
Even so, their workings remain somewhat mysterious to the general public. Reactions range from glowing reports on home assistants, to mockery of “conversational robots” that can’t answer a question correctly on websites or messenger apps, to the apocalyptic paranoia entertained by science-fiction movies and novels. Which data is stored? Are they spying on us? Will chatbots replace us? Or worse, go further and take control?
The two technologies – chatbots and home assistants – are cut from the same cloth: they are AIs you can talk to. A home assistant is one way to host a chatbot, just as a computer or a robot can. They follow a script written in advance: they spot keywords in a sentence, extract them and compare them against a database to return the most accurate answer. This is called “natural language processing.”
Imagine it as a tree. Each branch is a part of the script and each leaf a subpart. The bot does not climb the tree in a linear fashion: if the user suddenly changes the subject, today’s algorithms are flexible enough to jump from one branch to another in any order. However, the bot’s knowledge is frozen – it only has its original database to draw from and does not learn from new interactions.
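The tree-of-branches idea can be made concrete with a toy script. The sketch below is purely illustrative – the branch names, keywords and replies are invented for this example, and real assistants use far more sophisticated language models – but it captures the frozen, keyword-matching behavior the article describes: the bot can hop between branches in any order, yet it never learns anything new.

```python
import re

# A minimal scripted chatbot. The "tree" is a dictionary of branches
# (topics); each branch holds leaf responses keyed by keywords.
# All names and replies here are hypothetical.
SCRIPT = {
    "weather": {
        "rain": "Take an umbrella today.",
        "sun": "It should be sunny all day.",
    },
    "lights": {
        "on": "Turning the lights on.",
        "off": "Turning the lights off.",
    },
}

FALLBACK = "Sorry, I didn't understand that."

def reply(message):
    """Pick keywords out of the message and look them up in the script.

    The bot can jump to any branch at any time, but its knowledge is
    frozen: everything it can say is already in SCRIPT.
    """
    words = set(re.findall(r"[a-z]+", message.lower()))
    for branch, leaves in SCRIPT.items():
        if branch in words:
            for keyword, answer in leaves.items():
                if keyword in words:
                    return answer
    return FALLBACK

print(reply("How's the weather, will it rain?"))  # → Take an umbrella today.
print(reply("Turn the lights off please"))        # → Turning the lights off.
print(reply("Tell me a joke"))                    # → Sorry, I didn't understand that.
```

A user who asks about the lights mid-weather-conversation simply lands on a different branch – but a question outside the script always falls through to the fallback, which is exactly why such bots feel “frozen.”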
In the long run, however, bots like Google Home or Bixby do not carry the kind of machine-learning-driven artificial intelligence behind DeepMind’s AlphaGo, the Google-developed program that beat the best Go players in the world. Their closer competition is the more modest, but still self-teaching, Cleverbot, a conversational AI that has become well known online. And no surprise, as its goal is a fascinating one: a chatbot that may be able to pass as human.
Talk like a human
“What makes Cleverbot different from most bots is that it learns from every conversation it has – and in every language,” said Rollo Carpenter, the bot’s creator. “You could spend an entire life looking at one day’s data.”
Cleverbot’s users discuss all kinds of subjects, confide in it and also test the bot: “Who is the robot?” is one of the most popular topics. “Cleverbot … learned by imitating people. People say they are humans,” said Carpenter. “Cleverbot is just turning the tables, which usually leads to endless debates.”
The question of telling a human from a bot is at the core of reactions to chatbots and of their acceptance by the general public. The current ethical consensus is that a bot should reveal it is a bot from the very beginning of a conversation.
This makes sense from a commercial perspective: it reassures users. It is no accident that, since their creation, chatbots have been the ultimate challenge for the Turing Test – the “imitation game” that tests whether a machine can pass as human. “It’s ironic,” said author Gunkel. “We want our AI to act like humans, but we’re terrified when they do.”
The ability of manufacturers to reassure users means the door is wide open for upgraded interactions.
“Voice and text are not enough. Humans have five senses,” noted Carpenter. “Cleverbot uses only one sense: text sequences. It cannot understand when someone says ‘I love you.’ It knows what the phrase means in relation to all the conversations it has had in the past, but it doesn’t know what it means, really.”
Move like a human
The problem for robotics engineers is that learning from all our senses multiplies the data set massively. That translates into costly investments: storing, handling and powering an unthinkable amount of information. Hence Wendy Ju, who teaches at Cornell Tech in New York, prefers to concentrate on a single communication channel: non-verbal cues, or movement.
She and her team devised a playful experiment to explore human interactions with moving objects. They programmed a robotic ottoman that would approach people and, through non-verbal cues alone, offer to let them rest their feet on it – then suddenly withdraw the offer and retreat to a corner of the room. The subjects’ feedback was surprising: most projected human emotions onto the ottoman. “Oh, maybe there was someone more important than me over there?” “Maybe my feet smell?”
“People see the machine and project their social logic onto it,” said Ju. “Even if there is no word at play, humans (like animals) want to engage in an exchange with the robot (whether it is anthropomorphic or not).”
To measure the potential of movement, look at another product: the Jibo robot. It was designed to compete with Google Home or Alexa in the home-assistant market by being more relatable. Small, with soft, rounded edges, Jibo is a robot made of a screen-shaped head with a single eye. It is constantly in motion, and most people find it adorable – until it speaks, in a flat, robotic voice devoid of emotion.
Sell like a human?
“The problem is that Jibo probably arrived too late and was too expensive,” said Gunkel. Created by Cynthia Breazeal, who directs the Personal Robots Group at MIT, Jibo had no access to the massive database that tech giants like Alibaba or Google have amassed.
As a result, it offered only a limited set of voice-assistant functionalities. “It was sold for $899 … too expensive for an object that’s more ornamental than useful,” he added. In December 2018, Jibo sold off its assets and exited the market.
However seductive the idea of a technology combining all these functionalities into one, it cannot survive in the market without commercial potential and real applications. And while it is not hard to predict that money is the end game, it is harder to know when those changes will occur.
This doesn’t stop Gunkel from being positive about the future of chatbots. “We will probably see a lot of false starts. Jibo is only an example. Look at Apple: everyone knows the Macintosh today. But before the Mac, there was the Lisa, and it was a big failure because its interface was too complicated. Apple made it more affordable and simpler, and the rest is history. Just look at what you’re holding.”