A review of personal voice assistants and their capabilities in home automation
In the early 2000s, when I worked with the Clipsal smart home system, voice control was already used in building automation projects. While I was studying in the UK, I was taken to a retirement home where voice recognition had been installed in wards for people with mobility impairments. It controlled the main parameters of the room; even TV channels were changed by voice. Back then I thought this technology would reach every home in two or three years.
Fifteen years have passed, and only now has something appeared that can really listen to us. Let’s see what it is.
Our company is entering the market of gateways for HomeKit, Google Assistant / Home and Amazon Alexa. Naturally, we have studied the market closely, and we want to give you a short summary of what, in our opinion, is going on. This is a review rather than in-depth research or the ultimate truth. We’ll be happy if you share your ideas and remarks!
Speech Recognition is Not Enough
Clipsal sold a product called Homespeak. It used the Dragon NaturallySpeaking platform by Nuance and, after a short training session, could recognize words quite well. I even perfected my pronunciation during testing. But it was always simpler to press a button.
It turned out that simple voice recognition was not enough. Something else was required. Samsung phones have S Voice or Voice Assistant. But who uses them?
To win the hearts and minds of users, intelligence had to be added. A lot of intelligence! Acceptable voice recognition now comes to us as part of a more complex product: the voice assistant.
It’s always better to look into the future through art. The things that touch our emotions, that attract us and make us want them, are later brought to life by talented entrepreneurs. In “Her” (2013) the future is laid out in plain view with the help of Scarlett Johansson’s voice. (I know everybody has seen it. It is simply pleasant to remember.)
As it turns out, the assistants that all of us will use in the future are the first mass-market product of artificial intelligence.
All personal assistants that hold a major share of the market are in fact question-answering systems with elements of artificial intelligence and natural language processing. So, who are we talking to?
Apple: Siri
Some say that Apple is about to lose its leading position because neural network hardware, analytics and artificial intelligence are becoming cheaper to produce (naming Google, Amazon, IBM and Facebook as the leaders). They are not quite right, because Siri was in fact created in the same organisation that once created the Internet itself. Apple bought Siri in 2010 and with it got the result of 40 years of research and development financed by DARPA via SRI International. The product draws on the work of research groups at Carnegie Mellon University, the University of Massachusetts, the University of Rochester, the Florida Institute for Human & Machine Cognition, the University of Oregon, the University of Southern California and Stanford University (source). It would be strange, to say the least, if we did not get a really intelligent helper with such a background.
Nuance technologies are used to recognize natural speech. (There have been no revolutionary changes since Homespeak.) At the moment it is the only assistant that understands Russian.
Siri is a fully functional personal assistant: it can keep up a dialogue with the user by asking follow-up questions. If your iPad stays in the living room and is always plugged in (as is often done in smart installations), it will work just like an Amazon Echo or Google Home in response to the phrase “Hey Siri”.
Amazon: Echo, Dot and Alexa
Alexa is Amazon’s personal assistant. Echo is a smart wireless speaker connected to the service. Echo has a smaller version, the Echo Dot: a small, inexpensive microphone with a simple speaker for voice answers. It has a 3.5 mm jack and Bluetooth for connecting to a third-party sound system. There are also the Amazon Tap, a portable battery-powered version, and the Echo Show, which has a screen and supports video calls.
Here it is: the first big hit of consumer IoT! More than 8 million units had been sold by January 2017, and about 20 million are likely to be sold by the end of 2017. Keep in mind that at the moment it supports only English.
Google: Now, Home and Assistant
Google Now was launched in 2012 as a voice assistant and a competitor to Siri. In fact it is not a personal assistant, as it cannot keep up a dialogue. That ability belongs to the next stage of the service, Google Assistant, which is built into some phones, the Google Assistant app and the Google Home wireless speaker.
The best thing about Google Home is its awareness of context. You can say, “Ok, Google, turn off the light in the bedroom”, and some time later add, “Ok, Google, set the temperature to 21”. It then changes the temperature in the bedroom, whereas Siri and Alexa either fail to understand the second command or ask a clarifying question. Unfortunately, none of this can be said in Russian, but there is a chance that this will change soon.
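The context behaviour described above can be illustrated with a toy sketch. It is not how Google Home is implemented; it only assumes that the assistant remembers the last room that was mentioned and reuses it when a later command names no room. The room list, the `Command` type and the `ContextualAssistant` class are hypothetical names invented for this example.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical room names; a real system would load them from its configuration.
ROOMS = ("bedroom", "kitchen", "living room")


@dataclass
class Command:
    action: str
    room: str
    value: Optional[int] = None


class ContextualAssistant:
    """Toy assistant that remembers the last room mentioned (an assumption)."""

    def __init__(self) -> None:
        self.last_room: Optional[str] = None

    def handle(self, utterance: str) -> Optional[Command]:
        text = utterance.lower()
        # Use the room named in the phrase, otherwise fall back to the context.
        room = next((r for r in ROOMS if r in text), None) or self.last_room
        if room is None:
            # No room and no context: a real assistant would ask a question here.
            return None
        self.last_room = room

        match = re.search(r"temperature to (\d+)", text)
        if match:
            return Command("set_temperature", room, int(match.group(1)))
        if "turn off the light" in text:
            return Command("light_off", room)
        return None


assistant = ContextualAssistant()
print(assistant.handle("Ok, Google, turn off the light in the bedroom"))
print(assistant.handle("Ok, Google, set the temperature to 21"))
```

The second call names no room, so the sketch applies it to the bedroom remembered from the first call. Without any stored context the same phrase returns nothing, which is the point where an assistant would have to ask which room is meant.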
Microsoft: Cortana
In 2013 Microsoft released its own version of a personal assistant. It had been in development by the Speech team since 2009. It now supports several languages, but Russian is not among them. As Microsoft has no successful mass-market devices, it is the least interesting option for home automation.
There are other projects as well, for example the Cube by Juriy Berov, which works with context.
The funniest thing is to make them talk to one another.
How Do They Work with a Smart Home?
Of course, the manufacturers of all personal assistants are thinking about entering the home automation market, which is about to become the largest global market in a couple of years.
Apple is quite conservative here, and many people like that. Siri recognizes the command in the cloud and returns the result, and the devices are then controlled locally via HomeKit. So the control itself happens on the local network.
We met with Apple representatives in Amsterdam, and they regretfully informed us that they were not planning to certify gateways, except for a few rather closed devices. They say that they have to be absolutely sure which devices are connected to the gateway. Our approach is completely different. The number of devices is constantly growing, so we accept the warning that appears when an iRidium gateway is connected for the first time.
Amazon and Google take a different approach. You have to use apps (Alexa calls them skills) that process requests in the cloud and carry out the required tasks. The app communicates with external clouds and with IFTTT-type apps running in them.
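To make the difference between the two approaches easier to see, here is a rough sketch that models each control path as a list of hops and counts how many of them run in the cloud. The hop names and the "local" / "cloud" labels are simplifications assumed for this illustration; they are not taken from any vendor's documentation.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Hop:
    name: str
    in_cloud: bool


def control_path(style: str) -> List[Hop]:
    """Return the hops a voice command passes through (simplified assumption)."""
    recognition = Hop("speech recognition", in_cloud=True)
    if style == "local":
        # Siri-like path: recognition in the cloud, control on the local network.
        return [
            recognition,
            Hop("HomeKit controller on the local network", in_cloud=False),
            Hop("device", in_cloud=False),
        ]
    if style == "cloud":
        # Alexa- or Google-like path: a skill or app and external services in the cloud.
        return [
            recognition,
            Hop("skill or app", in_cloud=True),
            Hop("external cloud service", in_cloud=True),
            Hop("device", in_cloud=False),
        ]
    raise ValueError(f"unknown style: {style}")


def cloud_hops(path: List[Hop]) -> int:
    """Count the hops that depend on a cloud service."""
    return sum(1 for hop in path if hop.in_cloud)


for style in ("local", "cloud"):
    print(style, cloud_hops(control_path(style)))
```

In this simplified model the local path depends on the cloud only for recognition, while the cloud path adds a skill and an external service before the command reaches the device.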
Are many devices compatible with them? Several hundred. On the scale of the global market, that is about 15 per cent. Some people may say that the remaining 85 per cent are old-fashioned and will soon leave the market. In fact, they will not.
Part of that 85 per cent is start-ups that do not want to spend money on compatibility. The rest are professional automation systems that do not want to lose their place in the market. Take the Chinese market as an example.
As you can see, our platform has a natural place in this market: voice control devices are in demand, but they are not compatible with the larger part of the market, and such compatibility is not even expected.
All we have to do is wait until the voice assistants other than Siri start to understand Russian and become at least a little smarter.
Development director, iRidium Ltd.