A Practical Guide to Testing Object-Oriented Software

Specifying the Inspection

When a guided inspection is planned, the scope and depth of the material to be inspected should be specified. The earliest models, such as requirements and domain models, may be inspected in their entirety at a single session. Later models will usually be too large to allow this. In Realistic Models (below), we talk about ways of creating modular diagrams that can be grouped into different-sized pieces. Having modular models facilitates limiting an inspection to the work of a single group or even to a specific class hierarchy.

The scope of an inspection is defined by specifying a set of use cases, a set of packages, or abstract classes/interfaces. The scope determines starting points for scenarios, but other classes are pulled into scope as they are needed to support the scenarios.

The depth of the inspection is defined by specifying the layers in the aggregation hierarchies below which messages are not traced. Classes at the bottom layer simply return values, with no indication of how those values were computed.

Realistic Models

It is usually not possible, or even desirable, to capture all of the details of an industrial-strength program in a few comprehensive diagrams in a single model. There will need to be multiple class diagrams, state diagrams, and, of course, multitudes of sequence diagrams. In preparation for the guided inspection, the developers should organize the model to facilitate the review by creating additional diagrams that link existing ones or by revising diagrams to conform to the scope of the inspection.

One basic technique that makes the model more understandable is to layer the diagrams. This results in more individual diagrams, but each diagram is sufficiently modular to fit within the scope of a specific inspection. The diagrams are easier to create because they follow a pattern.

Figure 4.6 illustrates one type of layering for class diagrams in which classes are grouped into packages and those packages may be enclosed in another package. Additionally, we often show all of the specializations from an abstract class as one diagram (see Figure 4.7) and all of the aggregation relationships for a class in another diagram.

Figure 4.6. Class diagram layered into packages

Figure 4.7. Separating relationships

Figure 4.8 shows a technique for linking class diagrams. One team's work often uses the work of other teams. This can be shown by placing a class box from the other team on the edge of the team's diagram and showing the relationships between the classes. An inspection would be limited to the classes in the team's diagram. Messages to objects of these "boundary classes" would not be traced further; the return value, if any, would simply be noted.

Figure 4.8. Links between class diagrams

Figure 4.9 illustrates a layering for sequence diagrams. At one level, the diagram terminates at an interface or abstract class. A sequence diagram is then constructed for each class that implements the interface or specializes the abstract class.

Figure 4.9. Sequence diagram per interface implementation

Selecting Test Cases for the Inspection

There are usually many possible test cases that can be developed from any specific use case. Traditional testing approaches use techniques such as equivalence classes and logical paths through the program to select effective test cases. Test cases can be selected to ensure that specific types of coverage are achieved or to find specific types of defects. We use Orthogonal Defect Classification to help select test cases that are most likely to identify defects by covering the different categories of system actions that trigger defects. We use a use profile to select test cases that give confidence in the reliability of the product by identifying which parts of the program are used the most.

Orthogonal Defect Classification as a Test Case Selector

Orthogonal Defect Classification (ODC) [Chill92] is a scheme developed at IBM based on the analysis of a large amount of defect data. The activities that cause a defect to be detected are classified as "triggers." These are divided into groups based on when the triggers occur, such as during reviews and inspections. Figure 4.10 is a list of attributes that trigger defects during reviews and inspections. The guided inspection technique uses several of these triggers as a guide for selecting test cases. We will discuss several of these triggers as we proceed, but we address a few of them now.

  1. Design conformance is addressed by comparing the basis model to the model under test (MUT) as well as by comparing the MUT to the requirements. This comparison is a direct result of the test case execution.

  2. Concurrency is a trigger that will be visible in the design model, and scenarios can be generated that explicitly explore thread interactions. The UML activity diagram will be the primary source for symbolic execution.

  3. Lateral compatibility is activated by the trace of scenarios between objects on sequence diagrams.

Figure 4.10. ODC review and inspection triggers

By structuring the guided inspection process so that as many of these triggers as possible are encountered, you make the tests that guide the inspection more likely to "trigger" as many failures as possible.

Use Profiles as a Test Case Selector

A use profile (see Use Profiles, below) for a system is an ordering of the individual use cases based on a combination of the frequency and criticality values for the individual use cases. The traditional operational profile used for procedural systems is based strictly on frequency-of-use information. Combining the frequency and criticality ratings to order the use cases provides a more meaningful criterion for ensuring quality. For example, we might paint a logo in the lower right-hand corner of each window. This would be a relatively frequent event, but should it fail, the system would still be able to provide important functionality to the user. Conversely, attaching to the local database server would happen very seldom, but the success of that operation is critical to the success of numerous other functions. The number of test cases per use case is adjusted based on the position of the use case in the ranking.
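To make the ranking idea concrete, the short sketch below orders a handful of use cases by a combined frequency/criticality score. The use-case names, the numeric scales, and the product-based combination rule are illustrative assumptions rather than part of the technique itself; the only requirement is that both factors influence the ordering.

    # Hypothetical use cases: (name, relative frequency of use, criticality
    # assigned by the domain experts on an assumed 1-5 scale).
    use_cases = [
        ("Paint logo in window corner", 50, 1),
        ("Attach to local database server", 1, 5),
        ("Open customer record", 20, 4),
    ]

    # Order by a combined score so that a rarely used but critical use case
    # can outrank a frequently used but unimportant one.
    use_profile = sorted(use_cases, key=lambda uc: uc[1] * uc[2], reverse=True)

    for name, frequency, criticality in use_profile:
        print(name, "combined score =", frequency * criticality)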

Risk as a Test Case Selector

Some testing methods use risk as the basis for determining how much to test. This is useful during development when we are actively searching for defects. It is not appropriate after development when we are trying to achieve some measure of reliability. At that time, the use profile technique supports testing the application in the way that it will be used.

Our use case template captures the information needed for each of the techniques so that they can be used throughout the complete life cycle. We use the frequency/criticality information instead of the risk information for guided inspection because we are trying to capture the same perspective as the testing of the system after development. For situations in which the inspection is only covering a portion of the design, using the risk information may be equally relevant.

Technique Summary: Creating Test Cases from Use Cases

A test case consists of a set of preconditions, a stimulus (inputs), and the expected response. A use case contains a series of scenarios: the normal case, extensions, and exceptional cases. Each scenario includes the action taken by an actor and the required response from the system; these correspond to the basic parts of a test case. To construct a test case from a scenario, each part of the scenario is made more specific by giving exact values to all attributes and objects. This requires coordination between the use case diagram and the other diagrams. The "things" mentioned in the scenario should translate into some object or objects from the class diagram. Each of these objects should be in a specific state defined in the state diagram for its class. The actions in the use case will correspond to messages to the objects.

Each scenario can result in multiple test cases by selecting different values (that is, states) for the objects used in the use case. The expected result part of the test case is derived from the scenario portion of the use case and the specific values provided in the input scenario. The following is a use case scenario and the corresponding test case.

  • Subsystem use case: A movablePiece receives a tick() message. It must then check to determine whether it has collided with a stationaryPiece.

  • Test precondition: The puck is located within less than a tick of a brick and is headed for that brick.

  • Test input: The puck receives a tick() message.

  • Expected result: The puck has changed direction and the brick has changed its state from active to kaput, indicating that it has broken, and disappears.
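Rendered as executable test code, the scenario above might look like the following sketch. The class names (Puck, Brick), their attributes, and the simplified one-dimensional collision geometry are assumptions made for illustration; the actual Brickles design model would supply those details.

    import unittest

    class Brick:
        def __init__(self, position):
            self.position = position
            self.state = "active"      # becomes "kaput" when the brick is broken

        def hit(self):
            self.state = "kaput"

    class Puck:
        def __init__(self, position, velocity, bricks):
            self.position = position
            self.velocity = velocity   # signed speed along a single axis
            self.bricks = bricks

        def tick(self):
            # Advance one time increment, then check for a collision.
            self.position += self.velocity
            for brick in self.bricks:
                if brick.state == "active" and abs(self.position - brick.position) < 1:
                    brick.hit()
                    self.velocity = -self.velocity   # the puck changes direction

    class PuckBrickCollisionTest(unittest.TestCase):
        def test_puck_breaks_brick_and_changes_direction(self):
            # Precondition: the puck is within a tick of the brick and headed for it.
            brick = Brick(position=10)
            puck = Puck(position=9, velocity=1, bricks=[brick])

            puck.tick()                # input: the puck receives a tick message

            # Expected result: the puck changed direction and the brick is kaput.
            self.assertLess(puck.velocity, 0)
            self.assertEqual(brick.state, "kaput")

    if __name__ == "__main__":
        unittest.main()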

Creating Test Cases

Test cases for a guided inspection are scenarios that should be represented in the MUT. When the requirements model itself is being verified, the scenarios come from a team of domain experts who are not the ones producing the requirements. Later, we will see how this is done. For now, we will focus on test cases that are based on the system requirements.

The use case template that we use (see an abbreviated version in Figure 4.11) has three sources of scenarios. The Use Scenario is the "sunny-day" scenario, the path most often taken. The Alternative Paths section may list several scenarios that differ from the use scenario in a variety of ways but still represent valid executions. The Exceptional Paths section provides scenarios that result in error conditions.

Figure 4.11. An example of a use case

Completing Checklists

Prior to the interactive inspection session, the inspectors examine the models for certain syntactic information that can be evaluated solely from the information contained in the model. This portion of the technique is concerned not with the content but only with the form of the model. Figure 4.12 shows the checklist used during the design phase. The checklist is divided into two parts. One part addresses comparisons between the analysis model and the MUT. For example, the checklist reminds the inspector to check whether classes that have been deleted should have been deleted, given the differences between analysis and design information. The second part covers issues within the MUT itself. The checklist guides the inspector to consider whether the use of syntax correctly captures the information; for example, whether the navigability of the associations is correctly represented.

Figure 4.12. Design phase checklist

The Interactive Inspection Session

The testing portion of the guided inspection session is organized in one of two ways depending upon whether the model has been automated or not. If a prototype or other working model has been created, the session does not vary much from a typical code-testing session. The test cases provided by the testers are implemented, usually in some scripting language, and executed using the simulation facilities of the prototype of the model. These test cases must be more rigorously specified than the test cases that will be used in an interactive session with symbolic execution. The results of the execution are evaluated and the team determines whether the model passed the test or not.

If the model has not been prototyped, the testing session is an interactive session involving testers and developers. The developers cooperate to perform a symbolic execution that simulates the processing that will occur when actual code is available. That is, they walk the testers through the scenarios provided by the test cases.

The following additional roles are assigned to individuals in an interactive testing session. A person may take on more than one of these roles simultaneously.

  • Moderator The moderator controls the session and advances the execution through the scenario. The session is intended neither for debugging, which the developers will want to do, nor for expanding the requirements, which the domain experts will want to do. The moderator keeps the session moving over the intended material.

  • Recorder This person, usually a tester, makes annotations on the reference models as the team agrees that a fault has been found. The recorder makes certain that these faults are taken into consideration in the later parts of the scenario so that time is not wasted on redundant identification of the same fault. The recorder also maintains a list of issues that are not resolved during the testing session. These may not be faults; information may need to come from a team in another part of the project or from a team member who is absent during the inspection.

  • Drawer This person constructs a sequence diagram as a scenario is executed. A drawer concentrates on capturing all of the appropriate details such as returns from messages and state changes. The drawer may also annotate the sequence diagram with information between the message arrow and the return arrow.

Use Profiles

One technique for allocating testing resources determines which parts of the application will be utilized the most and then tests those parts the most. The principle here is "test the most used, most critical parts of the program over a wider range of inputs than the lesser used, least critical portions to ensure the greatest user satisfaction." A use profile is a ranking of the use cases based on the combined frequency/criticality values. This can be viewed as a double sort of the use cases based on the number of times that an end-user function (interface method) is used, or is anticipated to be used, in the actual operation of the program and the criticality of each of these uses. The criticality is a value assigned by the domain experts and recorded in each use case. The frequency information can be obtained in a couple of ways.

First, data can be collected from actual use, perhaps during usability testing, or during actual operation if we are testing a future version of the product. This results in a raw count profile. The count for each behavior is divided by the total number of invocations to produce a percentage. A second approach is to reason about the meanings and responsibilities of the system interface and then estimate the relative number of times each method will be used. The result is an ordering of the end-user methods rather than a precise frequency count. The estimated number of invocations for each behavior is divided by the total number of estimated invocations to provide a percentage. The percentage computed for each use determines the percentage of the test suite that should be devoted to that use.

As an example, the Exit function for Brickles will be successfully completed exactly once per invocation of the program, but the NewGame function may be used numerous times. It is conceivable that the Help function might not be used at all during a use of the system. This results in a profile that orders the functions as NewGame, Exit, and Help. We can assign weights that reflect the relative frequencies we expect. If on average we estimate that a player will play 10 games before exiting the system, the weights would be 10, 1, and 1. The NewGame function should then be exercised in 83.3% (10 out of 12) of the test cases, while the Help and Exit functions should each constitute about 8.3%.
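The arithmetic behind that allocation is easy to reproduce. The brief sketch below is a minimal illustration rather than part of the technique; it simply works the weights of 10, 1, and 1 through to test-suite percentages.

    # Relative-frequency weights taken from the text; the rounding is ours.
    weights = {"NewGame": 10, "Exit": 1, "Help": 1}
    total = sum(weights.values())                  # 12 invocations in all

    for use, weight in weights.items():
        print(f"{use}: {weight / total:.1%} of the test cases")
    # Prints NewGame: 83.3%, Exit: 8.3%, Help: 8.3%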

The guided inspection session can easily slip into an interactive design session. The participants, particularly the developers, will typically want to change the model during the testing session as problems are encountered. Resist this urge. It is the classic confusion between testing and debugging, and it diverts attention from finding other defects. The recorder captures the faults found by the inspection so that they can be addressed later. This keeps attention focused on the search for faults and prevents a "rush to judgment" about the precise cause of a defect. If a significant number of problems are found, end the session and let the developers work on the model.
