Speech Recognition

Speech recognition is a lot like IVR, only callers get to speak selections rather than press corresponding numbers on their phone pads to get information.

Speech recognition gives callers without touch tone dialing the same access to information as those with touch tone service. It also helps the many callers who need glasses to dial: they no longer have to juggle their glasses and the phone to see the numbers they are pressing.

Although over-the-phone speech recognition still has a limited vocabulary, most systems are effective enough to allow callers to speak selections such as “sales,” “flight number 123,” “transfer cash” or “order baseball cap.”

Speech recognition technology is constantly improving. Vocabularies keep growing, which means you can program the system to understand more caller commands. Almost all systems now appear to support continuous speech.

Make sure the system you choose really does handle continuous speech. Otherwise callers will be forced to pause and wait for a beep after saying every word or number. Since it’s unnatural to speak this way, callers may be more likely to hang up or ask for a rep. There’s also a greater chance of misrecognition, since the system can have trouble telling speech from silence.

If you already own an IVR system and want to add speech recognition capabilities, you should check with your vendor. Many of the big manufacturers like Lucent, Syntellect and InterVoice have added speech recognition to their IVR systems.

This technology has made tremendous strides in the last few years. It promises to change the way customers interact with automated systems, broadening the range of telephony interactions.

There are two distinct kinds of speech recognition, known as speaker-dependent and speaker-independent. The two diverge wildly in the kinds of things they are good at, and the kinds of systems needed to make them run.

Call center apps necessarily focus on speaker-independent recognition, because many different people will call and the system cannot be trained on any one voice. A human receptionist can recognize a huge number of variations of the same basic input; there are countless ways to intone the word “hello.” What you want in a call center is a system that responds to the likely inputs: the most common words (yes, no, stop, help, operator and so on), the digits, and the letters of the alphabet.
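To make the idea of responding only to likely inputs concrete, here is a minimal sketch in Python. It assumes the recognizer has already turned the caller's speech into text. The function name interpret and the word lists are hypothetical, chosen for illustration rather than taken from any vendor's product.

```python
# Hypothetical vocabulary for a speaker-independent menu. The word lists
# are illustrative assumptions, not taken from any vendor's product.
COMMANDS = {"yes", "no", "stop", "help", "operator"}
DIGITS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7",
    "eight": "8", "nine": "9",
}


def interpret(transcript):
    """Map a recognizer's text output to a (kind, value) pair.

    Returns ("command", word) for a single known command word,
    ("digits", "123") for a run of spoken digits, or ("unknown", None)
    when the utterance falls outside the vocabulary.
    """
    words = transcript.lower().replace(",", " ").split()
    if not words:
        return ("unknown", None)
    if len(words) == 1 and words[0] in COMMANDS:
        return ("command", words[0])
    if all(w in DIGITS for w in words):
        return ("digits", "".join(DIGITS[w] for w in words))
    return ("unknown", None)
```

In a sketch like this, an "unknown" result is the point where a real application might reprompt the caller or transfer to an agent.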

Internationally, touch tone penetration is still very low, leaving a vast base of potential callers who cannot access IVR. These callers are expensive to process when they reach a call center, because they have to be held in queue until an agent is ready for them. That means high telecom charges from the longer-than-average wait, coupled with the cost of agent service rather than self-service.

On the downside, international call centers, particularly those that serve multiple countries, may field calls in multiple languages. If you use an IVR front end to have callers select their language, then by definition you don’t need speech rec. These are surmountable problems that have more to do with the operation of speech rec in practice than with the underlying technology.
