Microsoft Agent

Microsoft Agent is a technology used to add interactive animated characters to Windows applications or Web pages. Microsoft Agent characters can speak and respond to user input via speech recognition and synthesis. Microsoft employs its Agent technology in applications such as Word, Excel and PowerPoint. Agents in these programs aid users in finding answers to questions and in understanding how the applications function.

The Microsoft Agent control provides programmers with access to four predefined characters: Genie (a genie), Merlin (a wizard), Peedy (a parrot) and Robby (a robot). Each character has a unique set of animations that programmers can use in their applications to illustrate different points and functions. For instance, the Peedy character-animation set includes different flying animations, which the programmer might use to move Peedy on the screen. Microsoft provides basic information on Agent technology at

www.microsoft.com/msagent

Microsoft Agent technology enables users to interact with applications and Web pages through speech, the most natural form of human communication. To understand speech, the control uses a speech-recognition engine, an application that translates vocal sound input from a microphone into language that the computer understands. The Microsoft Agent control also uses a text-to-speech engine, which generates characters' spoken responses. A text-to-speech engine is an application that translates typed words into audio sound that users hear through headphones or speakers connected to a computer. Microsoft provides speech-recognition and text-to-speech engines for several languages at

www.microsoft.com/msagent/downloads/user.asp

Programmers can even create their own animated characters with the help of the Microsoft Agent Character Editor and the Microsoft Linguistic Sound Editing Tool. These products are available free for download from

www.microsoft.com/msagent/downloads/developer.asp

This section introduces the basic capabilities of the Microsoft Agent control. For complete details on downloading this control, visit

www.microsoft.com/msagent/downloads/user.asp

The following example, Peedy's Pizza Palace, was developed by Microsoft to illustrate the capabilities of the Microsoft Agent control. Peedy's Pizza Palace is an online pizza shop where users can place their orders via voice input. The Peedy character interacts with users by helping them choose toppings and calculating the totals for their orders. You can view this example at

agent.microsoft.com/agent2/sdk/samples/html/peedypza.htm

To run the example, you must go to www.microsoft.com/msagent/downloads/user.asp and download and install the Peedy character file, a text-to-speech engine and a speech-recognition engine.

When the window opens, Peedy introduces himself (Fig. 17.28), and the words he speaks appear in a cartoon bubble above his head. Notice that Peedy's animations correspond to the words he speaks.

Figure 17.28. Peedy introducing himself when the window opens.

Programmers can synchronize character animations with speech output to illustrate a point or to convey a character's mood. For instance, Fig. 17.29 depicts Peedy's Pleased animation. The Peedy character-animation set includes eighty-five different animations, each of which is unique to the Peedy character.

Figure 17.29. Peedy's Pleased animation.

Look-and-Feel Observation 17.1

Agent characters remain on top of all active windows while a Microsoft Agent application is running. Their motions are not limited by the boundaries of the browser or application window.

Peedy also responds to input from the keyboard and mouse. Figure 17.30 shows what happens when a user clicks Peedy with the mouse pointer. Peedy jumps up, ruffles his feathers and exclaims, "Hey, that tickles!" or "Be careful with that pointer!" Users can relocate Peedy on the screen by dragging him with the mouse. However, even when the user moves Peedy to a different part of the screen, he continues to perform his preset animations and location changes.

Figure 17.30. Peedy's reaction when he is clicked.

Many location changes involve animations. For instance, Peedy can hop from one screen location to another, or he can fly (Fig. 17.31).

Figure 17.31. Peedy's flying animation.

Once Peedy completes the ordering instructions, a tool tip appears beneath him indicating that he is listening for a voice command (Fig. 17.32). You can enter the type of pizza to order either by speaking the style name into a microphone or by clicking the radio button corresponding to your choice.

Figure 17.32. Peedy waiting for speech input.

If you choose speech input, a box appears below Peedy displaying the words that Peedy "heard" (i.e., the words translated to the program by the speech-recognition engine). Once he recognizes your input, Peedy gives you a description of the selected pizza. Figure 17.33 shows what happens when you choose Seattle as the pizza style.

Figure 17.33. Peedy repeating a request for Seattle-style pizza.

Peedy then asks you to choose additional toppings. Again, you can either speak or use the mouse to make a selection. Checkboxes corresponding to toppings that come with the selected pizza style are checked for you. Figure 17.34 shows what happens when you choose anchovies as an additional topping. Peedy makes a wisecrack about your choice.

Figure 17.34. Peedy repeating a request for anchovies as an additional topping.

You can submit the order either by pressing the Place My Order button or by speaking "Place order" into the microphone. Peedy recounts the order while writing down the order items on his notepad (Fig. 17.35). He then calculates the figures on his calculator and reports the total price (Fig. 17.36).

Figure 17.35. Peedy recounting the order.

Figure 17.36. Peedy calculating the total.

 

Creating an Application That Uses Microsoft Agent

[Note: Before running this example, you must first download and install the Microsoft Agent control, a speech-recognition engine, a text-to-speech engine and the four character definitions from the Microsoft Agent Web site, as we discussed at the beginning of this section.]

The following example (Fig. 17.37) demonstrates how to build a simple application with the Microsoft Agent control. This application contains two drop-down lists from which the user can choose an Agent character and a character animation. When the user chooses from these lists, the chosen character appears and performs the selected animation. The application uses speech recognition and synthesis to control the character animations and speech; you can tell the character which animation to perform by pressing the Scroll Lock key, then speaking the animation name into a microphone.

Figure 17.37. Microsoft Agent demonstration.

  1  // Fig. 17.37: Agent.cs
  2  // Microsoft Agent demonstration.
  3  using System;
  4  using System.Collections;
  5  using System.Windows.Forms;
  6  using System.IO;
  7
  8  public partial class Agent : Form
  9  {
 10     // current agent object
 11     private AgentObjects.IAgentCtlCharacter speaker;
 12
 13     // default constructor
 14     public Agent()
 15     {
 16        InitializeComponent();
 17
 18        // initialize the characters
 19        try
 20        {
 21           // load characters into agent object
 22           mainAgent.Characters.Load( "Genie",
 23              @"C:\windows\msagent\chars\Genie.acs" );
 24           mainAgent.Characters.Load( "Merlin",
 25              @"C:\windows\msagent\chars\Merlin.acs" );
 26           mainAgent.Characters.Load( "Peedy",
 27              @"C:\windows\msagent\chars\Peedy.acs" );
 28           mainAgent.Characters.Load( "Robby",
 29              @"C:\windows\msagent\chars\Robby.acs" );
 30
 31           // set current character to Genie and show him
 32           speaker = mainAgent.Characters[ "Genie" ];
 33           GetAnimationNames(); // obtain an animation name list
 34           speaker.Show( 0 ); // display Genie
 35           characterCombo.SelectedText = "Genie";
 36        } // end try
 37        catch ( FileNotFoundException )
 38        {
 39           MessageBox.Show( "Invalid character location",
 40              "Error", MessageBoxButtons.OK, MessageBoxIcon.Error );
 41        } // end catch
 42     } // end constructor
 43
 44     // event handler for Speak Button
 45     private void speakButton_Click( object sender, EventArgs e )
 46     {
 47        // if textbox is empty, have the character ask
 48        // user to type the words into the TextBox; otherwise,
 49        // have the character say the words in the TextBox
 50        if ( speechTextBox.Text == "" )
 51           speaker.Speak(
 52              "Please, type the words you want me to speak", "" );
 53        else
 54           speaker.Speak( speechTextBox.Text, "" );
 55     } // end method speakButton_Click
 56
 57     // event handler for Agent control's ClickEvent
 58     private void mainAgent_ClickEvent(
 59        object sender, AxAgentObjects._AgentEvents_ClickEvent e )
 60     {
 61        speaker.Play( "Confused" );
 62        speaker.Speak( "Why are you poking me?", "" );
 63        speaker.Play( "RestPose" );
 64     } // end method mainAgent_ClickEvent
 65
 66     // ComboBox changed event, switch active agent character
 67     private void characterCombo_SelectedIndexChanged(
 68        object sender, EventArgs e )
 69     {
 70        ChangeCharacter( characterCombo.Text );
 71     } // end method characterCombo_SelectedIndexChanged
 72
 73     // utility method to change characters
 74     private void ChangeCharacter( string name )
 75     {
 76        speaker.StopAll( "Play" );
 77        speaker.Hide( 0 );
 78        speaker = mainAgent.Characters[ name ];
 79
 80        // regenerate animation name list
 81        GetAnimationNames();
 82        speaker.Show( 0 );
 83     } // end method ChangeCharacter
 84
 85     // get animation names and store in ArrayList
 86     private void GetAnimationNames()
 87     {
 88        // ensure thread safety
 89        lock ( this )
 90        {
 91           // get animation names
 92           IEnumerator enumerator = mainAgent.Characters[
 93              speaker.Name ].AnimationNames.GetEnumerator();
 94
 95           string voiceString;
 96
 97           // clear actionsCombo
 98           actionsCombo.Items.Clear();
 99           speaker.Commands.RemoveAll();
100
101           // copy enumeration to ArrayList
102           while ( enumerator.MoveNext() )
103           {
104              // remove underscores in speech string
105              voiceString = ( string ) enumerator.Current;
106              voiceString = voiceString.Replace( "_", "underscore" );
107
108              actionsCombo.Items.Add( enumerator.Current );
109
110              // add all animations as voice enabled commands
111              speaker.Commands.Add( ( string ) enumerator.Current,
112                 enumerator.Current, voiceString, true, false );
113           } // end while
114
115           // add custom command
116           speaker.Commands.Add( "MoveToMouse", "MoveToMouse",
117              "MoveToMouse", true, true );
118        } // end lock
119     } // end method GetAnimationNames
120
121     // user selects new action
122     private void actionsCombo_SelectedIndexChanged(
123        object sender, EventArgs e )
124     {
125        speaker.StopAll( "Play" );
126        speaker.Play( actionsCombo.Text );
127        speaker.Play( "RestPose" );
128     } // end method actionsCombo_SelectedIndexChanged
129
130     // event handler for Agent commands
131     private void mainAgent_Command(
132        object sender, AxAgentObjects._AgentEvents_CommandEvent e )
133     {
134        // get UserInput object
135        AgentObjects.IAgentCtlUserInput command =
136           ( AgentObjects.IAgentCtlUserInput ) e.userInput;
137
138        // change character if user speaks character name
139        if ( command.Voice == "Peedy" || command.Voice == "Robby" ||
140           command.Voice == "Merlin" || command.Voice == "Genie" )
141        {
142           ChangeCharacter( command.Voice );
143           return;
144        } // end if
145
146        // send agent to mouse
147        if ( command.Voice == "MoveToMouse" )
148        {
149           speaker.MoveTo( Convert.ToInt16( Cursor.Position.X - 60 ),
150              Convert.ToInt16( Cursor.Position.Y - 60 ), 5 );
151           return;
152        } // end if
153
154        // play new animation
155        speaker.StopAll( "Play" );
156        speaker.Play( command.Name );
157     } // end method mainAgent_Command
158  } // end class Agent

The example also allows you to switch to a new character by speaking its name and creates a custom command, MoveToMouse. In addition, when you click the Speak Button, the current character speaks any text that you have typed in the TextBox.

To use the Microsoft Agent control, you must add it to the Toolbox. Select Tools > Choose Toolbox Items... to display the Choose Toolbox Items dialog. In the dialog, select the COM Components tab, then scroll down and select the Microsoft Agent Control 2.0 option. When this option is selected properly, a small check mark appears in the box to the left of the option. Click OK to dismiss the dialog. The icon for the Microsoft Agent control now appears at the bottom of the Toolbox. Drag the Microsoft Agent Control 2.0 control onto your Form and name the object mainAgent.

In addition to the Microsoft Agent object mainAgent (of type AxAgent) that manages the characters, you also need a variable of type IAgentCtlCharacter to represent the current character. We create this variable, named speaker, in line 11.

When you execute this program, class Agent's constructor (lines 14-42) loads the character descriptions for the predefined animated characters (lines 22-29). If the specified location of the characters is incorrect, or if any character is missing, a FileNotFoundException is thrown. By default, the character descriptions are stored in C:\Windows\msagent\chars. If your system uses another name for the Windows directory, you'll need to modify the paths in lines 22-29.
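One way to avoid editing hard-coded paths is to build them from the Windows directory at run time. The following is a minimal sketch of that idea, not part of Fig. 17.37. The class CharacterPaths and its method Build are hypothetical names introduced here, and we assume that the windir environment variable names the Windows directory. The sketch only reports whether each character file exists; it does not touch the Agent control.

```csharp
using System;
using System.IO;

// Hypothetical helper (not part of Fig. 17.37): builds character-file paths
// relative to the Windows directory instead of hard-coding C:\windows.
public static class CharacterPaths
{
   // combine the Windows directory, the msagent\chars folder and the file name
   public static string Build( string windowsDirectory, string characterName )
   {
      return Path.Combine(
         windowsDirectory, "msagent", "chars", characterName + ".acs" );
   } // end method Build

   public static void Main()
   {
      // assumption: "windir" holds the Windows directory; otherwise
      // fall back to the default location that the text mentions
      string windowsDirectory =
         Environment.GetEnvironmentVariable( "windir" ) ?? @"C:\Windows";

      string[] names = { "Genie", "Merlin", "Peedy", "Robby" };

      // report which character files are present before trying to load them
      foreach ( string name in names )
      {
         string path = Build( windowsDirectory, name );
         Console.WriteLine( "{0}: {1}",
            path, File.Exists( path ) ? "found" : "missing" );
      } // end foreach
   } // end Main
} // end class CharacterPaths
```

In the application itself, the string returned by Build could be passed as the second argument in each call to Load in lines 22-29.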

Lines 32-34 set Genie as the default character, obtain all animation names via our utility method GetAnimationNames and call IAgentCtlCharacter method Show to display the character. We access characters through property Characters of mainAgent, which contains all characters that have been loaded. We use the indexer of the Characters property to specify the name of the character that we wish to load (Genie).

Responding to the Agent Control's ClickEvent

When a user clicks the character (i.e., pokes it with the mouse), event handler mainAgent_ClickEvent (lines 58-64) executes. First, speaker method Play plays an animation. This method accepts as an argument a string representing one of the predefined animations for the character (a list of animations for each character is available at the Microsoft Agent Web site; each character provides over 70 animations). In our example, the argument to Play is "Confused"; this animation is defined for all four characters, each of which expresses this emotion in a unique way. The character then speaks, "Why are you poking me?" via a call to method Speak. Finally, we play the RestPose animation, which returns the character to its neutral, resting pose.

Obtaining a Character's List of Animations and Defining Its Commands

The list of valid commands for a character is contained in property Commands of the IAgentCtlCharacter object (speaker, in this example). The commands for an Agent character can be viewed in the Commands pop-up window, which displays when the user right-clicks an Agent character (the last screenshot in Fig. 17.37). Method Add of property Commands adds a new command to the command list. Method Add takes three string arguments and two bool arguments. The first string argument identifies the name of the command, which we use to identify the command programmatically. The second string defines the command name as it appears in the Commands pop-up window. The third string defines the voice input that triggers the command. The first bool specifies whether the command is active, and the second bool indicates whether the command is visible in the Commands pop-up window. A command is triggered when the user selects the command from the Commands pop-up window or speaks the voice input into a microphone. Command logic is handled in the Command event handler of the AxAgent control (mainAgent, in this example). In addition, Agent defines several global commands that have predefined functions (for example, speaking a character name causes that character to appear).
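To make the roles of the five arguments concrete, the following sketch models a command entry as a plain C# class. It is an illustration only: CommandEntry and its members are hypothetical names, not types from the Agent library, and the two availability rules simply restate the description above (a command must be active to be used, and must also be visible to be chosen from the Commands pop-up window).

```csharp
using System;

// Hypothetical stand-in for one entry in a character's command list;
// the fields mirror the five arguments that the text describes for Add.
public sealed class CommandEntry
{
   public readonly string Name;    // identifies the command in code
   public readonly string Caption; // text shown in the Commands pop-up window
   public readonly string Voice;   // voice input that triggers the command
   public readonly bool Enabled;   // whether the command is active
   public readonly bool Visible;   // whether the pop-up window lists it

   public CommandEntry( string name, string caption, string voice,
      bool enabled, bool visible )
   {
      Name = name;
      Caption = caption;
      Voice = voice;
      Enabled = enabled;
      Visible = visible;
   } // end constructor

   // selectable from the pop-up window only if active and listed there
   public bool AvailableInPopup
   {
      get { return Enabled && Visible; }
   } // end property AvailableInPopup

   // triggerable by speech only if active and given a voice string
   public bool AvailableByVoice
   {
      get { return Enabled && !string.IsNullOrEmpty( Voice ); }
   } // end property AvailableByVoice
} // end class CommandEntry

public static class CommandEntryDemo
{
   public static void Main()
   {
      // same argument pattern as the animation commands in Fig. 17.37
      CommandEntry animation = new CommandEntry(
         "Confused", "Confused", "Confused", true, false );

      // same argument pattern as the custom MoveToMouse command
      CommandEntry custom = new CommandEntry(
         "MoveToMouse", "MoveToMouse", "MoveToMouse", true, true );

      Console.WriteLine( "{0}: popup={1}, voice={2}", animation.Name,
         animation.AvailableInPopup, animation.AvailableByVoice );
      Console.WriteLine( "{0}: popup={1}, voice={2}", custom.Name,
         custom.AvailableInPopup, custom.AvailableByVoice );
   } // end Main
} // end class CommandEntryDemo
```

Under this model, the animation entry can be reached only by voice, whereas the MoveToMouse entry can be reached by voice or from the pop-up window.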

Method GetAnimationNames (lines 86-119) fills the actionsCombo ComboBox with the current character's animation listing and defines the valid commands that can be used with the character. The method contains a lock block to prevent errors resulting from rapid character changes. The method uses an IEnumerator (lines 92-93) to obtain the current character's animations. Lines 98-99 clear the existing items in the ComboBox and the character's Commands property. Lines 102-113 iterate through all items in the animation-name enumerator. For each animation, line 105 assigns the animation name to string voiceString. Line 106 replaces any underscore characters (_) with the string "underscore"; this changes the string so that a user can pronounce and employ it as a command activator. Line 108 adds the animation's name to the actionsCombo ComboBox. The Add method of the Commands property (lines 111-112) adds a new command to the current character. In this example, we add every animation name as a command. Each call to Add receives the animation name as both the name of the command and the string that appears in the Commands pop-up window. The third argument is the voice command, and the last two arguments enable the command but indicate that it is not available via the Commands pop-up window. Thus, the command can be activated only by voice input. Lines 116-117 create a new command, named MoveToMouse, which is visible in the Commands pop-up window.
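The string transformation in line 106 can be tried in isolation. The sketch below applies the same Replace call to a few sample names; the class VoiceCommands and method ToVoiceString are hypothetical names introduced here, and the sample animation names are chosen for illustration (in the application, the names come from the character's AnimationNames property).

```csharp
using System;

public static class VoiceCommands
{
   // same transformation as line 106 of Fig. 17.37: make an
   // animation name pronounceable by spelling out underscores
   public static string ToVoiceString( string animationName )
   {
      return animationName.Replace( "_", "underscore" );
   } // end method ToVoiceString

   public static void Main()
   {
      // sample names for illustration only
      string[] animationNames = { "Confused", "RestPose", "Idle1_1" };

      // show each animation name next to the voice string derived from it
      foreach ( string name in animationNames )
         Console.WriteLine( "{0} -> {1}", name, ToVoiceString( name ) );
   } // end Main
} // end class VoiceCommands
```

Names without underscores pass through unchanged, whereas Idle1_1 becomes Idle1underscore1. The command name itself keeps its underscore; only the voice string changes.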

Responding to Selections from the actionsCombo ComboBox

After the GetAnimationNames method has been called, the user can select a value from the actionsCombo ComboBox. Event handler actionsCombo_SelectedIndexChanged (lines 122-128) stops any current animation, then plays the animation that the user selected from the ComboBox, followed by the RestPose animation.

Speaking the Text Typed by the User

You can also type text in the TextBox and click Speak. This causes event handler speakButton_Click (lines 45-55) to call speaker method Speak, supplying as an argument the text in speechTextBox. If the user clicks Speak without providing text, the character speaks, "Please, type the words you want me to speak".

Changing Characters

At any point in the program, the user can choose a different character from the characterCombo ComboBox. When this happens, the SelectedIndexChanged event handler for characterCombo (lines 67-71) executes. The event handler calls method ChangeCharacter (lines 74-83) with the text in characterCombo as an argument. Method ChangeCharacter stops any current animation, then calls the Hide method of speaker (line 77) to remove the current character from view. Line 78 assigns the newly selected character to speaker, line 81 generates the character's animation names and commands, and line 82 displays the character via a call to method Show.

Responding to Commands

Each time a user presses the Scroll Lock key and speaks into a microphone or selects a command from the Commands pop-up window, event handler mainAgent_Command (lines 131-157) is called. This method is passed an argument of type AxAgentObjects._AgentEvents_CommandEvent, which contains a single property, userInput. The userInput property returns an Object that can be converted to type AgentObjects.IAgentCtlUserInput. Lines 135-136 assign the userInput object to an IAgentCtlUserInput object named command, which is used to identify the command, so the program can respond appropriately. Lines 139-144 use method ChangeCharacter to change the current Agent character if the user speaks a character name. Microsoft Agent always shows a character when a user speaks its name; however, by controlling the character change, we can ensure that only one Agent character is displayed at a time. Lines 147-152 move the character to the current mouse location if the user invokes the MoveToMouse command. Agent method MoveTo takes x- and y-coordinate arguments and moves the character to the specified screen position, applying appropriate movement animations. For all other commands, we Play the command name as an animation in line 156.
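The order of the checks in mainAgent_Command can be summarized with a small stand-alone sketch. It replaces the Agent calls with descriptive strings so that the logic can be followed without the control installed. The class CommandDispatch and method Describe are hypothetical names, and the two string parameters stand in for the Voice and Name properties of the command object.

```csharp
using System;

public static class CommandDispatch
{
   // character names that the handler in Fig. 17.37 checks for
   private static readonly string[] characterNames =
      { "Peedy", "Robby", "Merlin", "Genie" };

   // Hypothetical helper: describe the action the handler would take,
   // given a command's voice text and its programmatic name.
   public static string Describe( string voice, string name )
   {
      // speaking a character name switches characters
      if ( Array.IndexOf( characterNames, voice ) >= 0 )
         return "ChangeCharacter: " + voice;

      // the custom command moves the character to the mouse pointer
      if ( voice == "MoveToMouse" )
         return "MoveTo: mouse position";

      // anything else is played as an animation, using the command name
      return "Play: " + name;
   } // end method Describe

   public static void Main()
   {
      Console.WriteLine( Describe( "Merlin", "Merlin" ) );
      Console.WriteLine( Describe( "MoveToMouse", "MoveToMouse" ) );
      Console.WriteLine( Describe( "Idle1underscore1", "Idle1_1" ) );
   } // end Main
} // end class CommandDispatch
```

The last call suggests why the handler plays the command's name and not its voice text: for an animation whose name contains an underscore, the voice text has been rewritten for pronunciation, so only the name still matches the animation.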
