The Art and Business of Speech Recognition: Creating the Noble Voice

Congruence of Style

From the brainstorming and ideation a single concept will begin to form. The designer narrows the field of ideas, and eliminates bells and whistles if they don't all work as a cohesive unit. A solid concept ensures that the design ideas work well for the audience. This continuity will make sure that the application feels like a single, integrated concept, rather than several disparate ideas stuck together.

Building trust and loyalty requires congruence in both interaction style and language. If we were designing a large system where we thought one technique for navigating through a large list was appropriate in one context while a different technique was more appropriate in another, callers could become confused because they would expect that list navigation should have a standard behavior throughout the entire application.

A feeling of discontinuity can drive a wedge between companies and their customers. Let's say a designer decided to use very casual language for one part of a banking application, perhaps omitting transaction confirmation numbers to make the system feel smoother. If the designer then opted to use very formal instructions for another part of the application (say, using a stern tone in directing the caller to write down and save important information), it wouldn't feel right. It would be like having a casual conversation with a friend when, suddenly in mid-conversation, the friend started speaking in legal terms. Incidentally, it works the other way, too. Most people who are accustomed to having, say, their brokerage firm's speech-recognition system address them in a strict, formal tone would not appreciate it if the employees suddenly became casual and chummy.

Imagine that you were a surfer who went into a surf shop every week for several years, knew the owner well, and always paid for your surfboards and gear at the time of purchase. Suppose you visited the shop one day to buy a 50-cent bar of wax, but discovered that you had left your money at home. What if, instead of treating you as a valued customer, the owner became stern and flatly refused to give you the wax until you came back with the 50 cents? You wouldn't like that the style and tone of the interaction had changed, and the same holds true for how speech-recognition systems need to treat callers.

Consistency

Consistency, used correctly, ensures ease of use. The problem is, designs inevitably change when a new designer takes over or augments a design, or even when designs are created over a long period of time and a new designer inadvertently changes the way they handle certain things. Left unchecked, this can make a design seem "choppy" or disconnected.

Consistency can be embodied in many ways.

  • A consistent use of language to refer to similar ideas, or to allow callers to say the same commands in multiple places to elicit the same responses from the system (that is, if callers can say "Help" in one context to receive additional information, it would be inconsistent to require them to say "More information" in a different context).

  • A consistent feedback structure, in which the system tells callers that it didn't understand them, or that it didn't hear them, the same way each time.

  • A consistent use of audio effects can indicate that an action has been completed or that a caller has reached a particular point in an application.
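The first two points can be sketched in code. The following is a minimal illustration, not a real speech platform's API: the command table, the prompt wording, and the `interpret` function are all hypothetical names chosen for this example. The idea is simply that commands such as "Help" and the "didn't hear" and "didn't understand" messages are defined once and shared by every context.

```python
# Hypothetical command table and prompts; none of these names come from
# a real speech platform.
GLOBAL_COMMANDS = {
    "help": "HELP",
    "repeat": "REPEAT",
    "talk to a representative": "TRANSFER",
}

NO_INPUT_PROMPT = "Sorry, I didn't hear you."
NO_MATCH_PROMPT = "Sorry, I didn't understand."


def interpret(utterance, local_commands):
    """Map recognized text to an (action, prompt) pair.

    Global commands are checked first so that a word such as "help"
    means the same thing in every context.
    """
    text = utterance.strip().lower()
    if not text:
        # Silence gets the same message everywhere in the application.
        return ("REPROMPT", NO_INPUT_PROMPT)
    if text in GLOBAL_COMMANDS:
        return (GLOBAL_COMMANDS[text], None)
    if text in local_commands:
        return (local_commands[text], None)
    # Anything unrecognized gets the same message everywhere, too.
    return ("REPROMPT", NO_MATCH_PROMPT)
```

Because every context calls the same function, a caller who learns to say "Help" in one place can rely on it in all the others.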

Natural Flow

A good design needs to flow elegantly from one moment to the next so that callers can retain their sense of context and understand where they are in the call.

Let's say a design required a series of short questions. We wouldn't want to write each question to sound like a brand-new thought; rather, we want the questions to sound like the continuation of a conversation, both textually and in intonation. The following examples illustrate how to do this.

Instead of asking:

"On what date are you picking up the car?"

and then asking:

"At what time are you picking up the car?"

it would be much more natural and within the context to ask:

"On what date are you picking up the car?"

and then ask:

"And at what time?"

The second exchange would sound very conversational, and would seem (and be) faster than the first. Because the context of "picking up the car" has already been set, there is no need to repeat it in the subsequent question, which is subsumed under the same context: it asks for a further clarification of the moment that the person will pick up the car.
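For designers who generate prompt text from a script, the same pattern can be expressed as a small helper. This is only a sketch under simple assumptions: `build_prompts` is a hypothetical function, and it assumes every question in the series shares one context phrase that can be dropped after the first question.

```python
def build_prompts(questions, context):
    """Build a series of prompts that share one context.

    The first prompt states the context in full; each later prompt
    starts with "And" and leaves the context out.
    """
    prompts = []
    for index, question in enumerate(questions):
        if index == 0:
            # Capitalize the opening word and state the full context.
            prompts.append(f"{question[0].upper()}{question[1:]} {context}?")
        else:
            prompts.append(f"And {question}?")
    return prompts


prompts = build_prompts(
    ["on what date", "at what time"],
    "are you picking up the car",
)
# prompts == ["On what date are you picking up the car?", "And at what time?"]
```

The wording is only half of the effect. The recorded audio for the follow-up prompts still needs the intonation of a continued conversation.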

Instead of having the system say:

"Please enter or say your account number" (where the word "your" might refer to the "account number the person has access to," a common situation for brokers, who do this for many of their clients)

and then follow with:

"I'm accessing your information" (where the word "your" refers to the caller's information, rather than the information associated with the account number)

it would be better to have the system reply:

"I'm accessing the account information" (where using the word "the" gets around a potentially incorrect statement).

It's important to remember that novice callers aren't simply using the system; they're also inductively learning it at the same time. The more consistent lists of options are, the easier it is for callers to learn the system. The sequence of commands in a prompt should only be altered if there is a compelling reason to do so.

Instead of asking:

"Would you like to 'Book the flight,' 'Change the itinerary,' or 'Talk to a representative?'"

and then, in other contexts, relocating the "Talk to a representative" command by asking:

"Would you like to 'Make another reservation,' 'Talk to a representative,' or 'Review your account information?'"

it would be better to make sure that similar types of commands (particularly ones that spare users from frustration, such as "Talk to a representative") always appear in the same place in a list of commands. This way, users can quickly identify where they are in the prompt.
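One way to enforce this is to let a single routine assemble every list of commands. The sketch below is illustrative only: `build_menu_prompt` and `ESCAPE_COMMAND` are hypothetical names, and placing the escape command last is an assumption made for this example (the point is that its position never varies, not that it must be last).

```python
# Hypothetical constant: the command that rescues a frustrated caller.
ESCAPE_COMMAND = "Talk to a representative"


def build_menu_prompt(options):
    """Return a menu prompt with the escape command always in last place."""
    if not options:
        raise ValueError("a menu needs at least one option")
    # Keep the designer's order for everything except the escape command.
    ordered = [option for option in options if option != ESCAPE_COMMAND]
    if ESCAPE_COMMAND in options:
        ordered.append(ESCAPE_COMMAND)
    quoted = [f"'{option}'" for option in ordered]
    if len(quoted) == 1:
        listing = quoted[0]
    elif len(quoted) == 2:
        listing = f"{quoted[0]} or {quoted[1]}"
    else:
        listing = ", ".join(quoted[:-1]) + ", or " + quoted[-1]
    return f"Would you like to {listing}?"


menu = build_menu_prompt([
    "Make another reservation",
    "Talk to a representative",
    "Review your account information",
])
```

Here `menu` lists "Talk to a representative" after the other two commands, even though the designer supplied it in the middle of the list.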

Consistency and naturalness also make the designer's job easier. The same language can be cut and pasted from one place to another when it needs to be repeated.
