Getting the most out of Text-to-Speech

Use the following guidelines to improve Text-to-Speech performance in your applications.

Help callers adjust to Text-to-Speech output

Your callers may need a moment to adjust to the computer voice of Text-to-Speech output before they can understand it well. To help them, have the application speak some less important information before it speaks the information callers need. This gives callers a chance to become familiar with the rhythm and intonation of Text-to-Speech.

Because the sound of Text-to-Speech may be unfamiliar to your callers, consider giving them the option of having the information repeated, or spelled, if necessary. Spelling is especially useful with names, as Text-to-Speech may not pronounce names as the caller would.

Use complete sentences

Text-to-Speech works primarily as a reading machine. It assumes that the text it reads is structured in standard sentences (with punctuation, capitalization, subject, object, and verb). To make Text-to-Speech output sound as natural as possible, use good grammar, complete sentences, and punctuation in the input text to be spoken.

What if the information you want to speak is not written in complete sentences? Because data fields cannot be punctuated, you may be able to control the output by adjusting the speaking rate and the pauses between the pieces of information.
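Where the application assembles the text itself before handing it to Text-to-Speech, another option is to wrap each field value in a short, punctuated sentence. The following is a minimal sketch of that idea, not a feature of the product: the function name `fields_to_sentences`, the "The ... is ..." wording, and the sample field labels are all assumptions made for illustration.

```python
def fields_to_sentences(fields):
    """Turn (label, value) pairs into short, punctuated sentences.

    Hypothetical helper: the sentence template is an illustrative
    assumption, not something Text-to-Speech requires.
    """
    sentences = []
    for label, value in fields:
        # Normalize the value and drop any trailing period so we add exactly one.
        value = str(value).strip().rstrip(".")
        if not value:
            continue  # skip empty fields rather than speaking a broken sentence
        sentences.append(f"The {label.strip().lower()} is {value}.")
    return " ".join(sentences)


if __name__ == "__main__":
    record = [("Name", "Pat Lee"), ("Balance", "42 dollars")]
    print(fields_to_sentences(record))
```

Each field ends with its own period, so the speech output pauses between fields without any change to the speaking rate.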

Pauses

You can control the pace of the Text-to-Speech output by inserting pauses. If you are speaking text that you control, the easiest way to do this is with punctuation within the words to be read. Remember to punctuate exactly as you would in a written sentence (for example, do not leave a space before a period or a comma).
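The spacing rule above can be applied automatically to text you control. This is a minimal sketch using only Python's standard `re` module; the function name `normalize_punctuation` is hypothetical, and the rule is deliberately simple, so text containing forms such as " .5" would need special handling.

```python
import re


def normalize_punctuation(text):
    """Remove whitespace that appears before common punctuation marks."""
    # "shipped ," becomes "shipped," and "Monday ." becomes "Monday."
    text = re.sub(r"\s+([.,;:!?])", r"\1", text)
    # Collapse runs of spaces or tabs left over from editing.
    text = re.sub(r"[ \t]{2,}", " ", text)
    return text.strip()


if __name__ == "__main__":
    print(normalize_punctuation("Your order shipped , and it arrives Monday ."))
```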

When speaking out a large block of text, you may hear a pause where you do not want one. First, check whether any stray punctuation is causing the pause. If not, you can insert a short recorded silence phrase before the sentence in which you heard the pause. This should eliminate the misplaced pause. If the text block is from a remote database, however, this may not be possible.
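Checking a large block of text for stray punctuation by eye is tedious, so a small script can point out likely candidates. The sketch below is an assumption about what "stray" might mean here: it flags punctuation that is not attached to a preceding word and runs of two or more punctuation marks. The name `find_stray_punctuation` is hypothetical, and legitimate forms such as an ellipsis are flagged too, so treat the result as a list of places to inspect.

```python
import re

# Punctuation preceded by whitespace (or at the start of the text),
# or two or more punctuation marks in a row.
STRAY_PATTERN = re.compile(r"(?<!\S)[.,;:!?]+|[.,;:!?]{2,}")


def find_stray_punctuation(text):
    """Return (position, marks) pairs for punctuation worth inspecting."""
    return [(m.start(), m.group(0)) for m in STRAY_PATTERN.finditer(text)]


if __name__ == "__main__":
    sample = "Your flight , number 12 departs at noon.. Thank you."
    for position, marks in find_stray_punctuation(sample):
        print(f"character {position}: {marks!r}")
```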

Eliminate typographical errors

Text-to-Speech pronounces exactly what is written, so typographical errors can cause mispronunciations. To make Text-to-Speech output as understandable as possible, read the ASCII text and listen to the spoken output for typographical errors when you test the application, and correct any that you find.

Remember Text-to-Speech pronunciation differences

Text-to-Speech relies on built-in pronunciation rules, but these cannot account for all exceptions. Therefore, it may mispronounce words, especially some names. If Text-to-Speech mispronounces a word, use phonetic spelling to correct the pronunciation. For example, Text-to-Speech pronounces the name "Bagge" as "baggy," but the name is correctly pronounced with a silent "e." You can change the spelling of the name to "bag" so that Text-to-Speech pronounces it correctly.
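If several words need phonetic spellings, it may be easier to keep them in one table and apply them just before the text is spoken. The following is a minimal sketch under that assumption; the table `RESPELLINGS` and the function `apply_respellings` are hypothetical names, and the only entry is the "Bagge" example from above.

```python
import re

# Hypothetical lookup table: written form -> phonetic spelling to speak.
RESPELLINGS = {"Bagge": "bag"}


def apply_respellings(text, respellings=RESPELLINGS):
    """Replace whole words found in the table with their phonetic spelling."""
    if not respellings:
        return text
    # \b keeps the match to whole words, so "Baggett" is left untouched.
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(word) for word in respellings) + r")\b"
    )
    return pattern.sub(lambda match: respellings[match.group(0)], text)


if __name__ == "__main__":
    print(apply_respellings("Please hold for Mr. Bagge."))
```

The match is case-sensitive, so only the exact written forms listed in the table are replaced.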

Another way to overcome mispronunciations or misunderstandings is to spell out some of the information, especially names. Design the application to speak the name and then spell it out. Alternatively, give callers the option of having a name spelled out.
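The speak-then-spell approach can be sketched as a function that builds the text to hand to Text-to-Speech. The name `speak_and_spell` and the phrase "That is spelled" are illustrative assumptions. The letters are separated by commas so that each one is followed by a short pause, as described under Pauses.

```python
def speak_and_spell(name):
    """Build text that says a name, then spells it letter by letter."""
    # Keep only letters; commas between them produce short pauses.
    letters = ", ".join(ch.upper() for ch in name if ch.isalpha())
    return f"{name}. That is spelled: {letters}."


if __name__ == "__main__":
    print(speak_and_spell("Bagge"))
```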