Recognizing caller speech

Within the context of interactive voice response (IVR) systems and applications, the term speech recognition, sometimes also called automatic speech recognition or advanced speech recognition (ASR), refers to the ability of an IVR system to recognize spoken responses from a caller and either convert the responses to text or use the results to initiate some system action. Several different forms of speech recognition exist on Avaya IR systems today.

These forms of ASR are good at what they do, and in many applications they are quite sufficient. They do, however, share some limitations: the number of words or phrases they can recognize is limited, and they cannot take grammatical sentence structure into account. While these ASR technologies can recognize specific words or phrases, even when the caller adds extraneous words or phrases, they cannot recognize what part the recognized speech plays in the overall statement. In other words, these ASR technologies are all designed to recognize specific words or phrases, but not to interpret what they recognize.

Natural language speech recognition (NLSR) takes the speech recognition process several steps further by providing a more natural conversational interface with IVR systems. Not only can NLSR be used to recognize particular words and phrases, it can also interpret and assign meaning to the speech it recognizes.

For example, under the more basic forms of ASR, a caller can respond only to specific prompts, such as "Say 'one' if you want information about..." or "Say 'yes' if this is correct." NLSR enables you to write applications that ask the caller more open-ended questions, such as a banking application that presents the caller with a list of options and asks "What would you like to do?" When the caller responds "I'd like to know the balance of my checking account, please," the system can recognize the kind of information the caller wants (the balance in a checking account) and can automatically direct the call to a new prompt that asks for the caller's checking account number. This technology provides a more natural way of interacting with callers.
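
The banking exchange above can be pictured as two steps: interpret the caller's request, then select the next prompt. The following Python sketch is only an illustration of that flow. It is not how the Avaya IR system or any recognition server works internally, it starts from text rather than audio, and the names ACTIONS, ACCOUNTS, NEXT_PROMPT, interpret, and next_prompt are all hypothetical.

```python
# Hypothetical vocabulary for a toy banking application (assumption, not from Avaya IR).
ACTIONS = {"balance": "get_balance", "transfer": "transfer_funds"}
ACCOUNTS = ("checking", "savings")

# Hypothetical mapping from an interpreted request to the next prompt to play.
NEXT_PROMPT = {
    ("get_balance", "checking"): "Please say your checking account number.",
    ("get_balance", "savings"): "Please say your savings account number.",
}

FALLBACK_PROMPT = "Sorry, what would you like to do?"


def interpret(utterance):
    """Return an (action, account) pair; either element is None if not found."""
    words = utterance.lower().replace(",", " ").replace(".", " ").split()
    action = next((ACTIONS[w] for w in words if w in ACTIONS), None)
    account = next((w for w in words if w in ACCOUNTS), None)
    return action, account


def next_prompt(utterance):
    """Choose the follow-up prompt for the interpreted request."""
    return NEXT_PROMPT.get(interpret(utterance), FALLBACK_PROMPT)


print(next_prompt("I'd like to know the balance of my checking account, please"))
```

A real NLSR grammar assigns meaning using far more than keyword lookup. The sketch only shows how an interpreted result, rather than a fixed "say one" response, can drive the call flow.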

NLSR can also take grammatical structure into account. This allows it, for instance, to recognize and deal appropriately with the differences between caller responses like the following:

"I would like to fly from Chicago to LAX."

"I need to get from LAX to Chicago."

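The two flight requests above contain the same city names, and only the sentence structure shows which is the origin and which is the destination. As a rough illustration, the Python sketch below uses a single regular expression as a stand-in for a grammar rule. The ROUTE pattern and the extract_route function are hypothetical, and an actual NLSR grammar would be far richer than this.

```python
import re

# Hypothetical toy rule: "from <origin> to <destination>" at the end of the utterance.
ROUTE = re.compile(
    r"\bfrom\s+(?P<origin>[\w\s]+?)\s+to\s+(?P<destination>[\w\s]+?)[.?!]?$",
    re.IGNORECASE,
)


def extract_route(utterance):
    """Return (origin, destination), or None if the rule does not match."""
    match = ROUTE.search(utterance.strip())
    if match is None:
        return None
    return match.group("origin"), match.group("destination")


print(extract_route("I would like to fly from Chicago to LAX."))
print(extract_route("I need to get from LAX to Chicago."))
```

Simply spotting the words "Chicago" and "LAX" could not tell these two requests apart. The position of each city relative to "from" and "to" is what carries the meaning.
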
NLSR is also capable of understanding natural numbers ("seventy-six" instead of "seven six"), natural dates ("July 26th" instead of "zero seven two six") and natural currency ("25 dollars" instead of "two five zero zero").
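
To make the difference between the two ways of saying a number concrete, the Python sketch below converts both "seven six" (digit by digit) and "seventy-six" (natural) to the same value. It is a simplified illustration limited to values from 0 to 99. The function names parse_digits and parse_natural are hypothetical, and the natural parser does not validate word order.

```python
UNITS = {"zero": 0, "one": 1, "two": 2, "three": 3, "four": 4,
         "five": 5, "six": 6, "seven": 7, "eight": 8, "nine": 9}
TEENS = {"ten": 10, "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
         "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18, "nineteen": 19}
TENS = {"twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
        "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90}


def parse_digits(utterance):
    """Digit-by-digit reading: each spoken word is one decimal digit."""
    value = 0
    for word in utterance.lower().split():
        value = value * 10 + UNITS[word]
    return value


def parse_natural(utterance):
    """Natural reading for 0 to 99: sum the tens, teens, and units words."""
    total = 0
    for word in utterance.lower().replace("-", " ").split():
        if word in TENS:
            total += TENS[word]
        elif word in TEENS:
            total += TEENS[word]
        elif word in UNITS:
            total += UNITS[word]
        else:
            raise ValueError(f"unrecognized number word: {word}")
    return total


print(parse_digits("seven six"), parse_natural("seventy-six"))
```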

Because NLSR is relatively complex, it requires larger vocabularies and grammars than the more basic forms of ASR. For this reason, a stand-alone recognition server is required to do the speech recognition. To support the NLSR feature, the Avaya IR system communicates with the recognition server through a proxy interface.

© 2002 Avaya Inc. All Rights Reserved.