The Art and Business of Speech Recognition: Creating the Noble Voice

One way to create a detailed Design Specification is simply to make a table for each state that has been defined (that is, each box in the call flow diagram). By including the elements listed below (along with any other pertinent development-specific information) for any given state, the designer can make things much easier for the programmer.

  • The name of the state (and its state number).

  • The names of the state(s) that lead into it.

    For example, callers may enter the "main menu" state from the "welcome" state.

  • The names (and/or unique numbers) of the prompts.

    The prompt types (for example, initial, retry, timeout, and help) refer to the common prompts used in many speech-recognition systems. (More on these prompt types, and on writing them, can be found in Chapter 6.)

  • The text of the prompts to be played to the caller and any conditional statements about when to play them.

    For example, a particular prompt might be played only to expert or repeat callers, while a more verbose prompt would be played only to novice callers at that point in the call. Novice callers might hear, "Enter or say the phone number, being sure to include the area code," whereas expert callers hear, "What's the phone number?"

  • Any spoken words or phrases the system will need to recognize, and the touchtones that will be recognized (as a backup to the speech recognition).

    A typical system would have a list of commands, such as "Transfer funds," "Find a check," and so on. Also listed would be synonyms for these commands as well as the set of recognized touchtone equivalents.

    Command            Synonyms (if any)                Touchtone equivalent
    Transfer funds     Transfer; Funds transfer         1
    Find a check       None                             2
    Account Balance    Balances; Account information    3

  • The "go to" statements for each recognition.

    For example, when the system recognizes a banking caller who says "Transfer funds" or presses the appropriate touchtone key, it should "go to" a state named "5100 transfer funds, first state, get amount."

  • Any special prompts needed to confirm recognition of a caller.

  • Special notes about the state.

    For example, "Don't let the caller interrupt the prompt when playing the confirmation message." These may also be notes to the programmers explaining how to handle special cases, such as "Don't play the default apology prompt when the system makes a mistake. Instead, play prompt number 12005: 'Oops, let's try again.'"
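The elements listed above can also be captured as a small data structure, which some teams may find easier to hand to a programmer than a table. The following Python sketch is only one possible shape, not a prescribed format: the class names, field names, the state number 1000, the placeholder prompt text, and the go-to targets for "Find a check" and "Account Balance" are all hypothetical. Only the commands, synonyms, touchtones, and the "Transfer funds" go-to state come from the examples above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Command:
    name: str                 # canonical phrase, e.g. "Transfer funds"
    synonyms: List[str]       # other phrases treated as the same command
    touchtone: Optional[str]  # backup key, or None if there is none
    go_to: str                # state to enter when this command is recognized


@dataclass
class State:
    number: int
    name: str
    entered_from: List[str]
    prompts: Dict[str, str]   # prompt type -> prompt text
    commands: List[Command] = field(default_factory=list)
    notes: List[str] = field(default_factory=list)

    def next_state(self, caller_input: str) -> Optional[str]:
        """Return the go-to state for a spoken phrase or touchtone, else None."""
        heard = caller_input.strip().lower()
        for cmd in self.commands:
            phrases = [cmd.name] + cmd.synonyms
            if heard in (p.lower() for p in phrases) or heard == cmd.touchtone:
                return cmd.go_to
        return None


# Hypothetical main-menu state built from the sample command table.
main_menu = State(
    number=1000,                       # assumed state number
    name="main menu",
    entered_from=["welcome"],
    prompts={"initial": "(initial prompt text goes here)"},  # placeholder
    commands=[
        Command("Transfer funds", ["Transfer", "Funds transfer"], "1",
                "5100 transfer funds, first state, get amount"),
        Command("Find a check", [], "2",
                "5200 find a check"),            # hypothetical target
        Command("Account Balance", ["Balances", "Account information"], "3",
                "5300 account balance"),         # hypothetical target
    ],
    notes=["Don't let the caller interrupt the prompt when playing "
           "the confirmation message."],
)

print(main_menu.next_state("Funds transfer"))
print(main_menu.next_state("3"))
```

A real recognizer would return a recognition result rather than a plain string, so the exact-match lookup here stands in for whatever the speech platform provides.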

Figure 5.5 shows what a typical state might look like (along with some call-out boxes, explaining the elements of the state).

Figure 5.5. Sample State Table

This diagram illustrates a state called "2100_Finance_Forex_Menu," which a caller can enter only from the state called "2000_Finance_Menu." The Prompts section lists the prompts that can be played to the caller. For each prompt it gives the prompt type (for example, an initial prompt is played when a caller first enters the state, while a timeout prompt is played only if the system doesn't hear anything) and the name of the prompt (in this case the names are numbers in which the first four digits are the same as the number of the state and the last digit is unique). Next to the name of each prompt are the actual words of the prompt, as in prompt 21001, "For which currency would you like to hear the exchange rate?"
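The prompt-naming convention described here (four digits for the state number, followed by one unique digit) is simple enough to check mechanically. The sketch below assumes four-digit state numbers and a single trailing digit, as in the example; the function names are hypothetical and are not part of any specification format.

```python
def prompt_id(state_number: int, index: int) -> str:
    """Build a prompt name: the four-digit state number plus one unique digit."""
    if not 1000 <= state_number <= 9999:
        raise ValueError("state number must have exactly four digits")
    if not 0 <= index <= 9:
        raise ValueError("index must be a single digit")
    return f"{state_number}{index}"


def state_of(prompt_name: str) -> int:
    """Recover the state number from a five-digit prompt name."""
    if len(prompt_name) != 5 or not prompt_name.isdigit():
        raise ValueError("prompt name must be exactly five digits")
    return int(prompt_name[:4])


print(prompt_id(2100, 1))   # the sample prompt in state 2100
print(state_of("21001"))
```

A check like this makes it easy to spot a prompt that has been filed under the wrong state when the specification is revised.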

Below the Prompts section is the section that describes what the recognizer is listening for: either the currency amount or a command. The currency amount is written in brackets to indicate that a complete definition can be found in an appendix; the term would be too complex to define in this one section and may be used in several places in the application. Any synonyms for a command are listed along with it, as in "List all currencies" and "List them." The touchtone equivalents that can be used in place of spoken commands are written in the corresponding cells. Next to those cells is the action to take when the caller gives a recognized response. There is also a section for notes at the bottom (in this example it's called "Module Settings") that can be used to describe any other settings, such as those that are unique to the technology used.
