The Art and Business of Speech Recognition: Creating the Noble Voice

Getting Callers to Focus on the Essentials

It's important for a system to use precise language, but other design components can make it easier for callers to comprehend and use the system: in particular, the order in which ideas are presented, and how those ideas are presented.

Presenting Information Clearly and Usefully

If we were designing a system that provides a warning message, we would want the system to alert the caller first before playing the message. Here's an example of a message that gets played when a single stock-trading account is accessed by more than one person at the same time. The system needs to alert the callers that this activity could be the result of an intruder.

"Sorry, but the system is alerting me that there is another person accessing this account right now. I've alerted the system administrator, and for security protection, I'm disabling some functionality for this account (in case it isn't you on another phone), such as making a trade, reporting balances , and other things. However, you can still get real-time quotes and news updates."

By designing the system this way, we enable callers to decide whether they want to focus on the warning message and act upon it, or simply let it play so they can move on.

Presenting Information in a Meaningful Order

Some people apparently don't know the difference between essential and nonessential information. I'm sure we've all been at social gatherings where someone has trapped us in a corner with the promise of a fascinating anecdote, only to see it turn into a minute-by-minute, detailed account of his or her day. This forces us to try to filter out the unimportant data on the fly, which can be exhausting (and often fruitless). We feel like yelling, "Get to the point, or let me out of here!"

The same holds true for speech-recognition systems. Even if we think it's obvious what's important, we can't assume that callers will be able to filter those nuggets out of the rock pile. We can see how we modify our behavior in real life, as when I talk to my grandmother.

If I said to her:

"I walked past the yellow house on Main Street, and the first store on the right, which is a shoe store, has a pair of shoes in the window costing $45 that I think you'd really like."

she might give every detail of that account equal weight. She'd probably ask questions about the yellow house or its position on Main Street in order to clarify things in her mind, instead of focusing on the part I wanted her to focus on: the inexpensive shoes. Instead, I should say:

"Grandma, I saw an inexpensive pair of shoes I think you would really like! If you're interested, just walk to the shoe store on the right side of Main Street (close to the yellow house) and look in the window for shoes marked $45."

This construction uses the first sentence to set the context, and the second sentence to focus the details on how to achieve the goal. If Grandma didn't want a new pair of shoes, she could completely disregard the second sentence, knowing that it was only there to support the first.

Generally, the most important information should be presented first, and by important we mean information that is either the most critical or most descriptive of the context.

Here's an example of how a designer could make a mistake in a speech-recognition system.

"Flight 534 departing from Chicago O'Hare today, Thursday, July 5th from gate B9 in terminal 1, concourse B, is currently scheduled to depart at 8:45 P.M., on time."

In this example the most important information (the status and time) is buried deeply in the prompt. A better approach would be to guide the caller's focus to the most important information at or close to the beginning of the statement, hierarchically, so that if the information isn't immediately relevant to them they can ignore the rest of the statement.

"Flight 534 is scheduled to depart on time, at 8:45 P.M. from Chicago O'Hare today, Thursday, July 5th from terminal 1, concourse B, gate B9."

All callers need to know the status of the flight first and the departure time second. Callers who are frequent flyers probably know the airline's terminal number, and potentially even the concourse, and can simply read the gate information from the monitors. The information they need to know when calling (perhaps on their way to the airport) is whether the flight is on time. It's not good to bury important information deep inside the prompt, particularly if the plane is going to be delayed for four hours, in which event the gate information has a higher probability of changing.
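
To make the ordering concrete, here is a minimal sketch of how a prompt might be assembled so that the essentials come first. The FlightStatus record and the build_prompt helper are hypothetical names invented for illustration; they do not come from any real telephony platform, and a production system would probably play recorded audio segments rather than build one text string.

```python
from dataclasses import dataclass


@dataclass
class FlightStatus:
    """Hypothetical record; the field names are illustrative only."""
    number: str
    status: str          # e.g. "on time"
    departure_time: str  # e.g. "8:45 P.M."
    airport: str
    date: str
    terminal: str
    concourse: str
    gate: str


def build_prompt(f: FlightStatus) -> str:
    """Put status and time first, then location details from general to specific."""
    # Essentials: what every caller needs, stated at the start of the prompt.
    essentials = (
        f"Flight {f.number} is scheduled to depart {f.status}, "
        f"at {f.departure_time}"
    )
    # Supporting details: callers who already know these can stop listening.
    details = (
        f"from {f.airport} today, {f.date} "
        f"from terminal {f.terminal}, concourse {f.concourse}, gate {f.gate}."
    )
    return f"{essentials} {details}"


flight = FlightStatus(
    number="534",
    status="on time",
    departure_time="8:45 P.M.",
    airport="Chicago O'Hare",
    date="Thursday, July 5th",
    terminal="1",
    concourse="B",
    gate="B9",
)
prompt = build_prompt(flight)
print(prompt)
```

Keeping the essentials and the supporting details in separate pieces also makes it easier to reorder or drop the details later, for example when a long delay makes the gate information unreliable.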

These examples are a good way to understand some of the tricky elements of designing effective systems. However, it's necessary to get all the elements together to form the Design Specification from which the actual system will be produced.
