About Face 2.0(c) The Essentials of Interaction Design

One of the most noticeable attributes of text edit controls is how stupid they are. If an application calls for typing an address, for example, there are no Address controls, yet that is exactly what we need. Validation controls exist, true, but their capability to adapt to variable input, like a whole address, approaches nil. A real address control would be an example of an extraction control.

Extraction controls are a better approach to the problem of formatted data entry. Extraction controls parse the contents of a free-form text entry control according to some rules about the general class of input. For example, instead of having one field for street address, one for apartment/suite/mail stop, one for city, one for state, and one for zip code, you would create a single text entry control, several lines tall. The user keys in the entire desired address in the single field, just as he would on an envelope or in a Rolodex, and the control makes sense of the various parts of that address. Sounds good, huh? But how do we do it?

A normal text edit control has a method (or entry point, or value, depending on your language/coding model) to examine its contents. The contents of a normal zip code entry field are whatever the user enters. An extraction control has several other content examination methods in addition to the traditional one. They would include:

Not all these values would be filled, only those that are relevant, depending on what the user enters. The control would do its best to determine which parts of the entered text belong in each category. There are basically three levels of discrimination in this process. First, the control would return the text verbatim, as the user entered it. Second, each line of the address would be separated: street address line, second address line, city line. Third, each separate element of the address would be parsed into its appropriate category.
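The three levels can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names ExtractedAddress and extract_address are hypothetical, the sample address is made up, and the single pattern for the last line handles only a simple US-style "City, ST 12345" form. A real control would need many more rules.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Assumed pattern for a US-style last line: "City, ST 12345" or "City ST 12345-6789".
_CITY_LINE = re.compile(
    r"^(?P<city>.+?),?\s+(?P<state>[A-Za-z]{2})\s+(?P<zip>\d{5}(?:-\d{4})?)$"
)

@dataclass
class ExtractedAddress:
    raw: str                            # level 1: the text exactly as entered
    lines: List[str]                    # level 2: the separated, non-empty lines
    street: Optional[str] = None        # level 3: the parsed elements
    second_line: Optional[str] = None
    city: Optional[str] = None
    state: Optional[str] = None
    zip_code: Optional[str] = None

def extract_address(text: str) -> ExtractedAddress:
    """Fill in whichever values can be recognized; leave the rest as None."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    result = ExtractedAddress(raw=text, lines=lines)
    if not lines:
        return result
    match = _CITY_LINE.match(lines[-1])
    if match:
        result.city = match.group("city")
        result.state = match.group("state").upper()
        result.zip_code = match.group("zip")
        body = lines[:-1]
    else:
        body = lines  # no recognizable city line; treat everything as street lines
    if body:
        result.street = body[0]
    if len(body) > 1:
        result.second_line = body[1]
    return result

address = extract_address("123 Main Street\nSuite 4\nPalo Alto, CA 94301")
print(address.street, "|", address.city, "|", address.zip_code)
```

Note that the raw text is always kept, so nothing the user typed is lost even when the parse comes up empty.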

A control like this enables users to enter addresses the same way they manually prepare an envelope: by typing the address as a block. The computer does the work of separating the fields out for efficient categorizing in a database program. A program would then be able to, for example, sort the addresses by street name or by zip code, even though the address is entered in human-readable form.

Useful types of extraction controls include those for proper names, e-mail addresses, physical descriptions, and telephone numbers. An extraction control could easily pull a person's first name, last name, middle name, honorific, rank, and title from a single field so the user isn't forced to manually separate them at entry time.
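A proper-name control could start out as simply as the following sketch. Everything here is an assumption made for illustration: the function and field names are hypothetical, the word lists are deliberately tiny, and rank and title are left out entirely. It treats a trailing suffix such as PhD as its own value, which is one of several reasonable choices.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative word lists only; a real control would need far larger ones.
HONORIFICS = {"mr", "mrs", "ms", "dr", "prof"}
SUFFIXES = {"jr", "sr", "ii", "iii", "phd", "md"}

@dataclass
class ExtractedName:
    raw: str                         # the text exactly as entered
    honorific: Optional[str] = None
    first: Optional[str] = None
    middle: Optional[str] = None
    last: Optional[str] = None
    suffix: Optional[str] = None

def _key(token: str) -> str:
    """Normalize a token for lookup: drop periods and ignore case."""
    return token.replace(".", "").lower()

def extract_name(text: str) -> ExtractedName:
    tokens = text.replace(",", " ").split()
    result = ExtractedName(raw=text)
    if tokens and _key(tokens[0]) in HONORIFICS:
        result.honorific = tokens.pop(0)
    if tokens and _key(tokens[-1]) in SUFFIXES:
        result.suffix = tokens.pop()
    if tokens:
        result.first = tokens.pop(0)
    if tokens:
        result.last = tokens.pop()
    if tokens:
        # Whatever remains between first and last is treated as the middle name.
        result.middle = " ".join(tokens)
    return result

name = extract_name("Dr. Jane Q. Public, PhD")
print(name.honorific, name.first, name.middle, name.last, name.suffix)
```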

Yes, there will be an error rate, but it won't be high and it won't be significant. An address-parsing algorithm can easily pull apart the vast majority of addresses. If someone tried to deliberately enter garbage, the extraction control would probably fail to discriminate accurately, but then again, how many users deliberately enter garbage? An end user with a shrink-wrapped application who deliberately enters garbage into his own system certainly won't blame you for the problem.

When coded into a dialog box, a telephone-number extraction control, for example, would recognize phone numbers by applying a series of simple lexical and semantic rules. The outputs of the field would consist of the raw text as entered by the user, along with an array of possible phone, fax, cellular, and pager numbers. If the control is unable to discern these numbers from the contents, well, it can't; but in most cases where these numbers are discernible by humans they are also discernible by software. Let's take an example: Say that we key this text into a phone number extraction control:

415-366-2300w, Home:367-9824 (415) 367-9976 fax 508 2031 pager

There's some pretty torturous stuff here: inconsistent and missing symbols and varying labels. But can you figure out what we've typed? Sure you can. A program could, too! The first number is a well-formed number with area code, prefix, and body. It has a w appended to it that can reasonably be interpreted as being a work phone. The comma is just a separator. The next number is prefixed by the word Home: so its nature is clear. The absence of an area code is not much of a crisis. The program could easily assume it is a 415 number, the same as all the others. If it were different, it is likely that we would have entered it. The third number is trickier. Certainly, it is a well-formed number, but what is it? The word fax is ambiguous. It could be referring to the third number or the fourth number. The last word in the entry, pager, disambiguates the two because it must be referring to the fourth number, so fax must refer to the third. The lack of a hyphen in the fourth number should be no problem because the number is still a recognizable, well-formed phone number.
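Here is a rough Python sketch of rules that handle this particular entry. It is one possible set of rules, not a complete algorithm, and every name in it (LABELS, PhoneNumber, extract_phones) is hypothetical. It takes a shortcut on the fax question: a label without a colon is simply attached to the nearest preceding number that has no label yet, which reaches the same answer as the reasoning above without looking ahead to pager. Missing area codes are filled in with the most common area code in the entry and flagged as assumed.

```python
import re
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional

# Assumed label vocabulary; the single letters cover suffixes such as "2300w".
LABELS = {"w": "work", "work": "work", "h": "home", "home": "home",
          "fax": "fax", "pager": "pager", "cell": "cellular"}

# Either a phone number (optional area code, optional one-letter suffix)
# or a word that may be a label, optionally followed by a colon.
_TOKEN = re.compile(
    r"(?:\(?(?P<area>\d{3})\)?[\s-]*)?(?P<prefix>\d{3})[\s-]?(?P<body>\d{4})"
    r"(?:(?P<suffix>[A-Za-z])(?![A-Za-z]))?"
    r"|(?P<label>[A-Za-z]+)(?P<colon>:?)"
)

@dataclass
class PhoneNumber:
    area: Optional[str]
    prefix: str
    body: str
    kind: Optional[str] = None
    area_assumed: bool = False   # True when the area code was guessed

def extract_phones(text: str) -> List[PhoneNumber]:
    numbers: List[PhoneNumber] = []
    pending = None  # a label such as "Home:" waiting for the next number
    for m in _TOKEN.finditer(text):
        if m.group("prefix"):
            kind = LABELS.get((m.group("suffix") or "").lower()) or pending
            pending = None
            numbers.append(PhoneNumber(m.group("area"), m.group("prefix"),
                                       m.group("body"), kind))
            continue
        kind = LABELS.get(m.group("label").lower())
        if kind is None:
            continue  # not a word we recognize; ignore it
        if not m.group("colon") and numbers and numbers[-1].kind is None:
            numbers[-1].kind = kind   # trailing label, e.g. "... 9976 fax"
        else:
            pending = kind            # leading label, e.g. "Home:367-9824"
    # Assume a missing area code matches the most common one that was entered.
    known = Counter(n.area for n in numbers if n.area)
    if known:
        fallback = known.most_common(1)[0][0]
        for n in numbers:
            if n.area is None:
                n.area, n.area_assumed = fallback, True
    return numbers

entry = "415-366-2300w, Home:367-9824 (415) 367-9976 fax 508 2031 pager"
for n in extract_phones(entry):
    print(n.kind, n.area, n.prefix, n.body, "(assumed)" if n.area_assumed else "")
```

Flagging the guessed area codes matters: it lets the program show the user which parts it inferred, rather than silently presenting a guess as fact.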

If we wanted to really tax the control we could enter something more problematic, like this:

4558, 1-800-555-1212 25433 555-FLIX

Well, this would certainly put a strain on things, but it is not impossible. The first number, 4558, is not a recognizable phone number, but it is a recognizable fragment of a phone number. When you want to call someone within your company through a private PBX, you often just enter a four-digit number. If the PBX's prefix were 488 (which the program is likely to already know), the number from the outside would be 488-4558. The second number is a well-formed number; it's just more complete than many others. It includes the long distance prefix 1 and adds a five-digit extension. We guess that it is an extension because it is not delimited from the 800 number. If it contained only four digits, we might have trouble discriminating between it being an extension or another in-house number. The last number is, well, recognizable even though it isn't all digits, because its form is recognizable. Software might otherwise have difficulty determining that 555-FLIX is a phone number, except that we are talking about a field that is designed to process phone numbers, and that's a big hint.
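The rules used in this second example can also be roughed out in Python. As before, this is a sketch under assumptions rather than a finished parser: interpret and to_digits are hypothetical names, the PBX prefix is passed in by the caller, commas are treated as delimiters, and letters are converted with the standard telephone keypad layout. A four-digit run that directly follows a number is reported as ambiguous rather than guessed at.

```python
import re
from typing import List, Tuple

# Standard telephone keypad: ABC on 2, DEF on 3, ... WXYZ on 9.
KEYPAD = {
    letter: str(digit)
    for digit, letters in enumerate(
        ["ABC", "DEF", "GHI", "JKL", "MNO", "PQRS", "TUV", "WXYZ"], start=2)
    for letter in letters
}

def to_digits(text: str) -> str:
    """Translate keypad letters to digits, leaving digits untouched."""
    return "".join(KEYPAD.get(ch.upper(), ch) for ch in text)

# Optional long-distance prefix 1, then area code, prefix, and body.
_FULL = re.compile(r"^(?:1-)?(\d{3})-(\d{3})-([0-9A-Za-z]{4})$")
# Prefix and body only; the body may be spelled with letters.
_LOCAL = re.compile(r"^(\d{3})-([0-9A-Za-z]{4})$")

def interpret(text: str, pbx_prefix: str) -> List[Tuple[str, str]]:
    results: List[Tuple[str, str]] = []
    for group in text.split(","):        # a comma delimits unrelated entries
        after_number = False             # did a number directly precede this token?
        for token in group.split():
            full, local = _FULL.match(token), _LOCAL.match(token)
            if full:
                area, prefix, body = full.groups()
                results.append(("number", f"{area}-{prefix}-{to_digits(body)}"))
                after_number = True
            elif local:
                prefix, body = local.groups()
                results.append(("number", f"{prefix}-{to_digits(body)}"))
                after_number = True
            elif token.isdigit() and len(token) == 5 and after_number:
                results.append(("extension", token))
                after_number = False
            elif token.isdigit() and len(token) == 4:
                if after_number:
                    # Could be an extension or another in-house number.
                    results.append(("ambiguous", token))
                else:
                    results.append(("pbx", f"{pbx_prefix}-{token}"))
                after_number = False
            else:
                results.append(("unknown", token))
                after_number = False
    return results

for kind, value in interpret("4558, 1-800-555-1212 25433 555-FLIX", pbx_prefix="488"):
    print(kind, value)
```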

Many of you are probably having trouble swallowing the idea of extraction controls. They seem to fly in the face of our tradition of guaranteed data integrity. This isn't really the case, however, as we discussed at length in Chapter 17. Besides, there's nothing preventing you from allowing your users to see (and correct if necessary) the results of the extraction parsing before committing them to the database, which is the solution Microsoft has chosen.

Since the first edition of About Face was published, Microsoft has bravely implemented a few extraction controls just as described here. See for yourself in Figure 32-1.

Figure 32-1: The New Contacts dialog in Outlook includes some extraction controls; look at the Full Name and Address fields in the main dialog. If you want to check the results, you can click the Full Name or Address buttons (which double as field labels) to get the dialogs that are shown. Outlook has correctly parsed the name and address of one of the authors, but couldn't it all have been entered into a single field?
