Smart speakers represent one of the most popular entry points into the home automation space. Millions of American homes are now equipped with at least one, made by big-name and off-brand manufacturers alike. Even home security and automation providers are adding smart speakers to their packages. Yet despite the strength of the smart speaker market, not all is as it appears to be.

The truth is that smart speakers can be simultaneously helpful and annoying. They can be delightful to use one minute, and capable of filling a homeowner with utter rage the next. The culprit is something known as natural language processing (NLP).

How a Smart Speaker Works

    Vivint Home Security describes the smart speaker as a cloud-connected device with one purpose: to recognize voice commands and act upon them. Sounds simple enough, right? It actually is not. Consumers tend to think that building a smart speaker is simple because we take our own abilities to hear and understand for granted.

    When you ask a smart speaker to give you the weather forecast, it first has to capture your voice and send it over the internet to a main server for processing. The software on that server has to figure out what it is you said. Based on its understanding of your question, it must then retrieve the data and send it back. Finally, the smart speaker has to play the forecast for you.
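
The round trip described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the function names, the `FORECASTS` sample data, and the trick of faking audio as encoded text are all hypothetical stand-ins, not how any real speaker or vendor service works.

```python
# Made-up sample data standing in for a real weather service.
FORECASTS = {"today": "Sunny with a high of 72."}


def capture_audio(spoken: str) -> bytes:
    """Speaker side: pretend to record audio by encoding the text."""
    return spoken.encode("utf-8")


def cloud_transcribe(audio: bytes) -> str:
    """Server side: pretend speech-to-text by decoding the bytes."""
    return audio.decode("utf-8").lower().strip()


def cloud_answer(transcript: str) -> str:
    """Server side: work out the request and fetch the data for it."""
    if "weather" in transcript or "forecast" in transcript:
        return FORECASTS["today"]
    return "Sorry, I did not understand that."


def play(response: str) -> str:
    """Speaker side: a real device would synthesize speech here."""
    return f"[speaker says] {response}"


def handle_request(spoken: str) -> str:
    audio = capture_audio(spoken)         # 1. capture the voice
    transcript = cloud_transcribe(audio)  # 2. server figures out what was said
    response = cloud_answer(transcript)   # 3. server retrieves the data
    return play(response)                 # 4. speaker plays the answer


print(handle_request("What is the weather forecast?"))
```

Every step except the first and last runs on the server in this sketch, which mirrors the division of labor the article describes.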

    Smart speakers are cloud connected because of the sheer amount of processing power NLP requires. A smart speaker just doesn’t have the resources to do it on board. So data is sent to the cloud, analyzed, and acted upon.

    NLP Is Not so Easy

      The smart speaker can be delightful to use when you are only issuing simple commands. For example, usage data shows that streaming music is what people use their smart speakers for most often. A basic command like “play Christmas music” is pretty easy to parse. Smart speakers can handle such requests without much issue.

      It is when you get into more complicated commands that smart speakers seem to break down. That is because NLP is not so easy. We take NLP for granted because our brains have the built-in capacity to make sense of language. For instance, you can leave a word or two out of a sentence and still convey the right meaning to a listener. That listener’s brain is able to fill in the gaps. Computers cannot do that so well.

      Furthermore, even the most intelligent computer software struggles to determine a speaker’s intent reliably. It can only analyze words, and how those words appear in relation to others around them, to estimate what a person is trying to say. And with an endless number of word combinations to work with, parsing a complex command can be challenging.
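
To make the point about word arrangements concrete, here is a toy pattern matcher in Python. It is a deliberately simplistic sketch, far cruder than the NLP a real smart speaker uses; the `PATTERNS` table, the intent names, and the `parse` function are all hypothetical.

```python
import re

# Hypothetical patterns: each maps a word arrangement to an intent name.
PATTERNS = [
    (re.compile(r"^play (?P<what>.+)$"), "play_music"),
    (re.compile(r"^turn (?P<state>on|off) the (?P<what>.+)$"), "switch"),
]


def parse(command: str):
    """Return (intent, details) for the first matching pattern, else None."""
    text = command.lower().strip().rstrip(".!?")
    for pattern, intent in PATTERNS:
        match = pattern.match(text)
        if match:
            return intent, match.groupdict()
    return None


print(parse("Play Christmas music"))
print(parse("Could you put on something festive once the kids are asleep?"))
print(parse("Play nothing until I say so"))
```

The first command matches cleanly. The second matches no pattern at all, so the function returns `None`. The third is the more telling case: the words fit the "play" pattern, so the matcher reports a music request even though the speaker meant the opposite.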

      Human Voices Are Different

        The other difficulty faced by smart speaker designers is the simple fact that voices are different. Your voice is unique compared to everyone else’s. It may sound similar to your siblings’ voices, but it is still different. Smart speakers have to account for the different types of voices, along with differences in accents, vocabularies, slang terms, etc.
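
One small piece of that problem, differing vocabulary and slang, can be illustrated with a lookup table that maps variant words to a single canonical term before a command is interpreted. This is a minimal sketch with a made-up `SYNONYMS` table and a hypothetical `normalize` function. It does nothing for accents or the sound of a voice, which have to be dealt with earlier, when speech is turned into text.

```python
# Hypothetical table mapping regional or slang wording to one canonical term.
SYNONYMS = {
    "tunes": "music",
    "jams": "music",
    "telly": "tv",
    "television": "tv",
}


def normalize(command: str) -> str:
    """Lowercase the command and swap known variants for canonical words."""
    words = command.lower().split()
    return " ".join(SYNONYMS.get(word, word) for word in words)


print(normalize("Play some tunes"))
print(normalize("Turn on the telly"))
```

A table like this only covers the variants someone thought to list, which hints at why real-world vocabularies are so hard to account for.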

        A smart speaker can be fun to use when commands are kept simple. Getting the weather forecast by asking a question is pretty convenient, while speaking to turn your lights on is a novelty. But when commands get more complicated, smart speakers can be downright annoying. It is all part of what makes smart home technology so interesting to use.
