Ten years ago, the term “voice technology” would have sounded foreign to most people. Voice in the warehouse had already broken into the market, but it wasn’t something the average person was aware of. Now, voice touches nearly every part of our lives. Need directions? Your GPS will tell you where to go. Want to know what the weather will be like today? Simply ask your Amazon Echo. Want to call your grandma? Just ask Siri to dial.
As voice has grown in popularity in people’s personal lives, it has also become more common in the warehouse. Today, millions of warehouses across the world are running picking, receiving, put-away, interleaving and more with voice. However, not all voice technology is created equal, and there are some major technological differences between the types of voice solutions available. When evaluating voice for the warehouse, it is important to understand these differences, as they are key to a successful voice implementation.
Speaker Dependent vs. Speaker Independent
With “Speaker Dependent” solutions, each user trains the system to recognize their individual language, dialect and speech patterns. This allows a “Speaker Dependent” solution to filter out environmental noise, making recognition of the user’s voice more accurate and robust. With “Speaker Independent” solutions, the user does not train the system to recognize their individual language, dialect or speech patterns. Instead, the system tries to match the user’s voice to generic, previously created voice patterns. As a result, “Speaker Independent” solutions misread the user’s voice more often and are less able to filter out environmental noise. They also perform poorly for users with accents and for users who speak different languages.
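To make the difference concrete, here is a toy sketch of the matching step, not how any particular voice product works. It assumes each spoken word has already been reduced to a small feature vector, and it picks the closest stored pattern. The names `GENERIC_PATTERNS`, `USER_PATTERNS` and `recognize`, and all of the numbers, are made up for illustration.

```python
from typing import Dict, Tuple

Pattern = Tuple[float, float]

# Hypothetical generic patterns, created in advance from "average" speakers.
GENERIC_PATTERNS: Dict[str, Pattern] = {"yes": (1.0, 0.0), "no": (0.0, 1.0)}

# Hypothetical patterns one user recorded during training.
# This user's accent shifts both words away from the generic ones.
USER_PATTERNS: Dict[str, Pattern] = {"yes": (0.4, 0.7), "no": (-0.3, 1.2)}


def recognize(features: Pattern, patterns: Dict[str, Pattern]) -> str:
    """Return the word whose stored pattern is closest to the spoken features."""
    def squared_distance(word: str) -> float:
        px, py = patterns[word]
        return (features[0] - px) ** 2 + (features[1] - py) ** 2

    return min(patterns, key=squared_distance)


# The same user says "yes"; the features sit close to their own trained pattern.
spoken_yes: Pattern = (0.45, 0.65)

speaker_dependent_result = recognize(spoken_yes, USER_PATTERNS)
speaker_independent_result = recognize(spoken_yes, GENERIC_PATTERNS)
```

With these made-up numbers, matching against the user’s own trained patterns returns “yes”, while matching against the generic patterns returns “no”. Real recognizers are far more sophisticated, but the idea is the same: the closer the stored patterns are to how a person actually speaks, the fewer misreads.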
Limited vs. Unlimited Vocabulary
Limited vocabulary systems provide users with a specific number of words (typically 60-90) that the system will recognize. This allows the system to easily understand the meaning of each command, and removes most errors caused by the system misunderstanding or misinterpreting the speaker. With an unlimited vocabulary system, the voice technology has to interpret whatever the speaker says and must account for slang, accents, and context, which leads to errors and frustration.
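A limited vocabulary can be pictured as a simple allow-list. The sketch below is only an illustration under assumptions: the word list is invented (a real system would define its own 60-90 words), and `interpret` is a hypothetical helper that works on text the recognizer has already produced.

```python
from typing import List, Optional

# Hypothetical command set, invented for this example.
VOCABULARY = {
    "ready", "next", "repeat", "skip", "short",
    "zero", "one", "two", "three", "four",
    "five", "six", "seven", "eight", "nine",
}


def interpret(utterance: str) -> Optional[List[str]]:
    """Return the spoken words if every one is in the vocabulary, else None."""
    words = utterance.lower().split()
    if not words or any(word not in VOCABULARY for word in words):
        # Anything outside the fixed list is rejected rather than guessed at.
        return None
    return words
```

For example, `interpret("four two")` returns `["four", "two"]`, while `interpret("umm ready I guess")` returns `None`, so the system can ask the user to repeat instead of acting on a guess.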
Purpose Built vs. General Purpose Recognizer
When working in a warehouse, there are forklifts beeping and whirring by, conveyor belts spinning, and various other noises that can muddle a recognizer’s ability to hear what the speaker is saying. Purpose built solutions are made for one job, which in this case is use within the four walls of the warehouse. This allows the voice recognition system to ignore loud blasts of noise and still interpret what is being said. A general purpose recognizer, however, is not designed to tune out this noise, so common warehouse sounds can be picked up and misinterpreted as commands.
With these factors in mind you should have a better understanding of what to look for in voice technology, and be more prepared to ask the questions that will help you choose the right direction for your operation!