An Overview of the Testing Process
Overview
"Few things are harder to put up with than the annoyance of a good example."
— Mark Twain
For several years, our clients have told us that we should write a book about the processes and methods that we use to test software, so, with a lot of help, that's what we've done. Specifically, the processes we use are based upon a methodology called STEP™, which was created by Dr. Bill Hetzel and Dr. David Gelperin as a way to implement the original IEEE-829 Standard for Test Documentation.
A Brief History of Testing
STEP was originally developed out of a frustration that, although the IEEE standard did a good job of specifying what testing documents needed to be built, it didn't describe how to create them or how to develop the processes (planning, analysis, design, execution, etc.) needed to use them. The STEP methodology (and therefore this book) doesn't establish absolute rules that must be followed but rather describes guidelines that can and should be modified to meet the needs and expectations of the software engineers using them. Even as we write this book, thousands of our past and present clients and students are using their own version of the STEP methodology and its underlying processes to build and implement quality software.
However, before we launch into the ins and outs of STEP, it's instructive to review the state of software testing prior to the launch of STEP, during its creation, and today. A good starting point is to review the definitions of testing (shown in Table 1-1) published by the authors at each of these times.
Table 1-1: Definitions of Testing

| Year | Definition |
|---|---|
| 1979 | Testing is the process of executing a program or system with the intent of finding errors. |
| 1983 | Testing is any activity aimed at evaluating an attribute of a program or system. Testing is the measurement of software quality. |
| 2002 | Testing is a concurrent lifecycle process of engineering, using, and maintaining testware in order to measure and improve the quality of the software being tested. |
Key Point: "Innovate! Follow the standard and do it intelligently. That means including what you know needs to be included regardless of what the standard says. It means adding additional levels of organization that make sense." - IEEE Computer Society Software Engineering Standards Collection
In 1979, Glenford Myers explained, "Testing is the process of executing a program or system with the intent of finding errors," in his classic book, The Art of Software Testing. At the time Myers' book was written, his definition was probably the best available and mirrored the thoughts of the day. Simply stated, testing occurred at the end of the software development cycle and its main purpose was to find errors.
If we skip forward to 1983, we find that the definition of testing had changed to include an assessment of the quality of the software, rather than merely a process to find defects. In The Complete Guide to Software Testing, Bill Hetzel stated that, "Testing is any activity aimed at evaluating an attribute of a program or system. Testing is the measurement of software quality."
Myers' and Hetzel's definitions are still valid today because they each address a particular facet of software testing. But the problem with these definitions is their scope. To resolve this problem, we offer the following definition of testing, which will be used throughout this book:
Key Point: Philip Crosby's definition of quality is "conformance to requirements. Lack of conformance is lack of quality." Dr. Joseph M. Juran's definition of quality is "the presence of that which satisfies customers and users and the absence of that which dissatisfies."
Testing is a concurrent lifecycle process of engineering, using and maintaining testware in order to measure and improve the quality of the software being tested.
Notice that no direct mention was made of finding defects, although that's certainly still a valid goal of testing. Also note that our definition includes not only measuring, but also improving the quality of the software. This is known as preventive testing and will be a consistent theme throughout this book.
Preventive Testing
Preventive testing uses the philosophy that testing can actually improve the quality of the software being tested if it occurs early enough in the lifecycle. Specifically, preventive testing requires the creation of test cases to validate the requirements before the code is written. Suppose, for example, that the user of an Automated Teller Machine (ATM) specified the following requirement:
A valid user must be able to withdraw up to $200 or the maximum amount in the account.
Key Point: Preventive testing uses the philosophy that testing can actually improve the quality of the software being tested if it occurs early enough in the lifecycle.
We know that some of you are already thinking, "What a horrible requirement." But we also know that many of you are thinking, "Wouldn't it be nice to have such a good requirement?" And, some of you are even thinking, "So that's what a requirement looks like?" Whatever you think about our sample requirement and no matter how good your requirement specifications are, they're certain to have inaccuracies, ambiguities, and omissions. And problems in the requirements can be very expensive to fix, especially if they aren't discovered until after the code is written, because this may necessitate the rewriting of the code, design and/or requirements.
Preventive testing attempts to avoid this situation by employing a very simple notion: the process of writing the test cases to test a requirement (before the design or code is completed) can identify flaws in the requirements specification.
Now, let's get back to our ATM example. Table 1-2 briefly describes two (of many) test cases that you might write.
Table 1-2: Sample Test Cases

| Test Case | Description | Results |
|---|---|---|
| TC01 | Withdraw $200 from an account with $165 in it. | ??? |
| TC02 | Withdraw $168.46 from an account with $200 in it. | ??? |
Key Point: The process of writing the test cases to test a requirement (before the design or code is completed) can identify flaws in the requirements specification.
Should TC01 pass? It depends on how you interpret the "or" in the requirement: A valid user must be able to withdraw up to $200 or the maximum amount in the account. Some people will interpret it to mean that the ATM user can withdraw the lesser of the two values ($165), while other people will interpret it to mean they can withdraw the greater of the two values ($200). Congratulations, you've discovered an ambiguity in the requirements specifications that can lead to many problems down the road.
Key Point: Testware is any document or product created as part of the testing effort (e.g., test cases, test plans, etc.). Testware is to testing what software is to development.
Should TC02 pass? According to the specification, it should. But do you really think that the bank wants the ATM to dispense coins to the users? Well, maybe, but we doubt it. Some of you may be saying that no programmer would ever write the code to do this. Think again; this is a real example, and the programmer did indeed write the code to allow the withdrawal of odd amounts.
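To make the idea concrete, here is a minimal sketch of what TC01 and TC02 might look like if they were captured as executable tests before any ATM code exists. The withdraw() function, its signature, and the use of a pytest-style test harness are our own illustrative assumptions; they are not part of the original requirement.

```python
# Hypothetical interface under test -- no ATM code has been written yet.
def withdraw(balance: float, amount: float) -> float:
    """Dispense `amount` from an account holding `balance`; return the amount dispensed."""
    raise NotImplementedError("The code doesn't exist yet -- these tests come first.")


def test_tc01_withdraw_200_from_account_holding_165():
    # Writing the expected result forces a decision: does "up to $200 or the
    # maximum amount in the account" mean the lesser or the greater of the two?
    # Until the requirement is clarified, we cannot choose between these asserts.
    result = withdraw(balance=165.00, amount=200.00)
    assert result == 165.00  # lesser-of interpretation
    # assert result == 200.00  # greater-of interpretation (overdraws the account)


def test_tc02_withdraw_an_odd_amount():
    # The requirement as written allows this, but should an ATM dispense $168.46?
    result = withdraw(balance=200.00, amount=168.46)
    assert result == 168.46
```

Notice that it is the act of writing the assertions, not running them, that exposes both the ambiguity and the coin problem.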
By writing the test cases before the code was written, we were able to find some (in this case, obvious) problems. We found them early enough that it's a relatively simple and inexpensive job to correct them. An added benefit of creating the test cases before the code is that the test cases themselves help document the software. Think how much easier it would be to write code if instead of just having requirements specifications to base your code on, you could also use the test cases that were created to test the system as part of the system documentation.
Key Point: An added benefit of creating the test cases before the code is that the test cases themselves help document the software.
Where Are Most Companies Today?
After reading the example above, we hope that most of you will think that the philosophy of preventive testing is clearly sound. Preventive testing is certainly not a new idea, so everyone must be using it, right? Well, not exactly. Our experience at most of the organizations we visit each year is that software is still developed using some kind of sequential model where the requirements are built, then the design, then the code, and finally the testing begins. The most famous of the sequential models of software development is the Waterfall model shown in Figure 1-1.
Figure 1-1: Waterfall Model of Software Development
Key Point: Our experience at most of the organizations we visit each year is that software is still developed using some kind of sequential model where the requirements are built, then the design, then the code, and finally the testing begins.
Although at first it appears that once a phase is complete there's "no going back," this is not necessarily true. There are usually one or more returns to a previous phase from a current phase due to overlooked elements or surprises. The difficulty arises when you have to back up more than one phase, especially in later phases. The costs of rework, re-testing, re-documenting, etc. become very high and usually result in shortcuts and bypasses. As Steve McConnell explains in his book Rapid Development, "late changes in the Waterfall model are akin to salmon swimming upstream - it isn't impossible, just difficult."
When a sequential model like the Waterfall model is used for software development, testers should be especially concerned with the quality, completeness, and stability of the requirements. Failure to clarify and define requirements at the beginning of the project will likely result in the development of a software design and code that's not what the users wanted or needed. Worse, the discovery of these defects will be delayed until the end of the lifecycle (i.e., test execution).
There are actually a few advantages to the Waterfall model, the most obvious one being that the resources are largely focused on one activity at a time, and the next activity has the (hopefully) completed artifact from the previous stage to use as the basis for the next stage. However, as you will see, in addition to a few good features, the Waterfall model has many problems.
Sequential models are, in particular, difficult to use successfully from the testing viewpoint. You can see in Figure 1-1 that in the Waterfall model, testing is largely ignored until the end, and indeed that's exactly how it works today in many companies around the world. If testing occurs only (or largely) after the product has already been built, then the most that the testers can hope to accomplish is to find bugs in the finished product. (This would be like discovering you forgot to put the chocolate chips in the cookies until you were testing - eating - them. Sure, you could take a bite of the cookie and then throw down a couple of chocolate chips, but the effect is really not the same. You would have to settle for chocolate chipless cookies or start over.) If testing occurs only at the end, there's a lot of "starting over" going on.
Key Point: The Waterfall model is particularly difficult to use successfully from the testing viewpoint.
Another problem with the Waterfall model is that the testers will almost always find themselves on the critical path of delivery of the software. This is exacerbated because all too often the software is delivered to the testers late, and the schedule is cast in stone and cannot be changed. The result, of course, is that the window of opportunity for testing is constantly shrinking.
The STEP process described in this book can be used with any software development methodology (e.g., XP, RAD, Prototyping, Spiral, DSDM). If used with a sequential model of software development like the Waterfall model, many of the problems described earlier can be overcome (i.e., the use of the STEP testing methodology will transform a sequential model into an iterative model).
Why Is Testing So Difficult?
To the uninitiated, testing software seems like one of the easiest things imaginable. You try the software and either it works or it doesn't. But there has to be more to it than this or companies wouldn't spend 20, 30, 40, 50 percent or more of the software development budget on testing. So why is testing so difficult? We've already encountered some of the difficulties in testing: ambiguous and incorrect requirements, and tight time schedules. There are, unfortunately, many more difficulties in testing.
Case Study 1-1: Different testers may have different reasons why they think testing is difficult, but they all seem to agree that IT IS DIFFICULT!
Why Is Testing Difficult?
When we ask a group of testers the question, "Why is testing difficult?" we get fairly consistent (and lengthy) answers. The reply we received from our friend Clare when we asked her to give us a fresh perspective on an early version of this book sums up many of the difficulties in testing:
Sure! Not only can I give you a fresh perspective, but by being in the trenches every day, I can offer a reality check quite well. I am sadly well-versed in doing whatever it takes to test, which includes working with ridiculous time frames, bad to no requirements, testers who need to be trained in testing, politics in test responsibility, providing data in as neutral a way as possible when notifying development and marketing of the state of the product. You name it… I think I've experienced it all.
— Clare Matthews
Clare is not the only tester experiencing difficulties in testing, so let's get to work. We'll start by describing a high-level overview of STEP and where each of the facets of this methodology is covered in this book.
STEP Methodology
The Systematic Test and Evaluation Process (STEP) was first introduced in 1985 as part of the course material for the Systematic Software Testing seminar series. It has since been revised many times and field-tested through consulting engagements and the shared experience of many individuals and organizations. STEP is built upon the foundation of the IEEE Std. 829-1983 Standard for Software Test Documentation and was subsequently updated based on the latest version (IEEE Std. 829-1998) of this standard and the IEEE Std. 1008-1987 Standard for Software Unit Testing. While retaining compatibility with these standards, this methodology has grown in scope and now stands as one of the leading models for effective software testing throughout the industry.
Key Point: Much of this section was reprinted from the STEP Guide with permission from Software Quality Engineering.
Scope and Objectives of STEP
STEP covers the broad activity of software evaluation. Evaluation is defined as that sub-discipline of software engineering concerned with determining whether software products do what they are supposed to do. The major techniques employed in evaluation are analysis, review and test. STEP focuses on testing as the most complex of the three, but stresses overall coordination and planning of all aspects of evaluation as a key to success. It stresses the prevention potential of testing, with defect detection and demonstration of capability as secondary goals.
Key Point: Evaluation is defined as the sub-discipline of software engineering concerned with determining whether software products do what they are supposed to do.
Early views saw testing as a phase that occurred after software development, or "something that programmers did to get the bugs out of their programs." The more modern view sees testing as a process to be performed in parallel with the software development or maintenance effort (refer to Figure 1-2) incorporating the activities of planning (determining risks and selecting strategies); analysis (setting test objectives and requirements); design (specifying tests to be developed); implementation (constructing or acquiring the test procedures and cases); execution (running and rerunning the tests); and maintenance (saving and updating the tests as the software changes).
Figure 1-2: Views of Testing
This lifecycle perspective of testing represents a major change from just a few years ago, when many equated testing with executing tests. The contribution of planning, analyzing, and designing tests was under-recognized (and still is by many people), and testing was not seen as really starting until tests started running. Now we understand the evaluation power of test planning and analysis. These activities can be more powerful than test execution in defect prevention and timely detection. We also understand that an accurate interpretation of the situation when "all tests are running successfully" requires a clear understanding of the test design.
The lifecycle model for testing that has emerged borrows heavily from the methodology we've grown accustomed to for software. Considering that a test set is made up of data and procedures (which are often implemented as executable test programs), it should not come as a surprise that what it takes to build good software is also what it takes to build good testware!
Elements of STEP
STEP draws from the established foundation of software methodologies to provide a process model for software testing. The methodology consists of specified tasks (individual actions); work products (documentation and implemented tests); and roles (defined responsibilities associated with groups of tasks), as shown in Figure 1-3, packaged into a system with proven effectiveness for consistently achieving quality software.
Figure 1-3: Elements of STEP
The STEP methodology is not tool dependent and does not assume any particular test organization or staffing (such as independent test groups). It does assume a development (not a research) effort, where the requirements information for the product and the technical design information are comprehensible and available for use as inputs to testing. Even if the requirements and design are not specified, much of the STEP methodology can still be used and can, in fact, facilitate the analysis and specification of software requirements and design.
Key Point: Even if the requirements and design are not specified, much of the STEP methodology can still be used and can, in fact, facilitate the analysis and specification of requirements and design.
STEP Architecture
Figure 1-4 shows how STEP assumes that the total testing job is divided into levels during planning. A level represents a particular testing environment (e.g., unit testing usually refers to the level associated with program testing in a programmer's personal development library). Simple projects, such as minor enhancements, may consist of just one or two levels of testing (e.g., unit and acceptance). Complex projects, such as a new product development, may have more levels (e.g., unit, function, subsystem, system, acceptance, alpha, beta, etc.).
Figure 1-4: Levels of Test
STEP provides a model that can be used as a starting point in establishing a detailed test plan. All of the components of the model are intended to be tailored and revised, or extended to fit each particular test situation.
The three major phases in STEP that are employed at every level include: planning the strategy (selecting strategy and specifying levels and approach), acquiring the testware (specifying detailed test objectives, designing and implementing test sets), and measuring the behavior (executing the tests and evaluating the software and the process). The phases are further broken down into eight major activities, as shown in Table 1-3.
Table 1-3: STEP Activities & Their Locations in This Book
| Step | Activity | Covered In |
|---|---|---|
| Plan the Strategy | | |
| P1 | Establish the master test plan. | Chapters 2 and 3 |
| P2 | Develop the detailed test plans. | Chapter 4 |
| Acquire the Testware | | |
| A1 | Inventory the test objectives (requirements-based, design-based, and implementation-based). | Chapter 5 |
| A2 | Design the tests (architecture and environment, requirements-based, design-based, and implementation-based). | Chapter 5 |
| A3 | Implement the plans and designs. | Chapter 6 |
| Measure the Behavior | | |
| M1 | Execute the tests. | Chapter 7 |
| M2 | Check the adequacy of the test set. | Chapter 7 |
| M3 | Evaluate the software and testing process. | Chapter 11 |

NOTE: Chapters 8, 9, and 10 cover the testing organization, the software tester, and the test manager, respectively. Chapter 12 provides a review of critical testing processes.
Timing of STEP Activities
STEP specifies when the testing activities and tasks are to be performed, as well as what the tasks should be and their sequence, as shown in Figure 1-5. The timing emphasis is based on getting most of the test design work completed before the detailed design of the software. The trigger for beginning the test design work is an external, functional, or black box specification of the software component to be tested. For higher test levels (e.g., acceptance or system), the external specification is equivalent to the system requirements document. As soon as that document is available, work can (and should) begin on the design of the requirements-based tests.
Figure 1-5: Activity Timing at Various Levels of Test
The test design process continues as the software is being designed and additional tests based on the detailed design of the software are identified and added to the requirements-based tests. As the software design process proceeds, detailed design documents are produced for the various software components and modules comprising the system. These, in turn, serve as functional specifications for the component or module, and thus may be used to trigger the development of requirements-based tests at the component or module level. As the software project moves to the coding stage, a third increment of tests is designed based on the code and implementation details.
Key Point: The goal at each level is to complete the bulk of the test design work as soon as possible.
Test inventory and design activities at the various levels overlap. The goal at each level is to complete the bulk of the test design work as soon as possible. This helps to ensure that the requirements are "testable" and well thought out and that defects are discovered early in the process. This strategy supports an effective software review and inspection program.
Measurement phase activities are conducted by level. Units are executed first; then modules or functions are integrated; and finally, system and acceptance execution is performed. The sequential execution from small pieces to big pieces is a physical constraint that we must follow. A major contribution of the methodology is in pointing out that the planning and acquisition phases are not so constrained and, furthermore, that it's in our interest to reverse the order and begin to develop the high-level test sets first - even though we use them last!
The timing within a given test level is shown in Figure 1-6 and follows our natural expectation. Plans and objectives come first, then test design, then implementation, then finally execution and evaluation. Overlap of activities is possible.
Figure 1-6: Activity Timing Within a Test Level
Work Products of STEP
Another aspect of the STEP process model is the set of work products produced in each phase and activity. STEP uses the word "testware" to refer to the major testing products such as test plans and test specification documents and the implemented test procedures, test cases, and test data files. The word "testware" is intentionally analogous to software and, as suggested by Figure 1-7, is intended to reflect a parallel development process. As the software is designed, specified, and built, the testware is also designed, specified, and built.
Figure 1-7: Parallel, Mutually Supportive Development
These two broad classes of work products support each other. Testware development, by relying on software work products, supports the prevention and detection of software faults. Software development, by reviewing testware work products, supports the prevention and detection of testware faults.
STEP uses IEEE standard document templates as a recommended guideline for document structure and content. Figure 1-8 lists the documents that are included in this book.
IEEE Std. 829-1998 Standard for Software Test Documentation: Template for Test Documents

Contents

1. Test Plan: Used for the master test plan and level-specific test plans.
2. Test Design Specification: Used at each test level to specify the test set architecture and coverage traces.
3. Test Case Specification: Used as needed to describe test cases or automated scripts.
4. Test Procedure Specification: Used to specify the steps for executing a set of test cases.
5. Test Log: Used as needed to record the execution of test procedures.
6. Test Incident Report: Used to describe anomalies that occur during testing or in production. These anomalies may be in the requirements, design, code, documentation, or the test cases themselves. Incidents may later be classified as defects or enhancements.
7. Test Summary Report: Used to report completion of testing at a level or a major test objective within a level.

Figure 1-8: Template for Test Documents from IEEE Std. 829-1998. The templates for many IEEE documents are presented in this book, but we recommend that you purchase the complete guidelines from the IEEE at www.ieee.org
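As a rough illustration of how one of these documents (the Test Case Specification, item 3 above) might be captured, the sketch below models it as structured data. The field names loosely follow the headings of an IEEE Std. 829-style test case specification, but the layout and the example values are our own simplification for illustration, not the standard's normative format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TestCaseSpecification:
    """A simplified, illustrative record of an IEEE 829-style test case specification."""
    identifier: str                                # test case specification identifier
    test_items: List[str]                          # items/features exercised by this case
    input_specification: str                       # inputs needed to execute the case
    output_specification: str                      # expected outputs and behavior
    environmental_needs: str = ""                  # hardware, software, and facility needs
    special_procedural_requirements: str = ""      # constraints on how the case is run
    intercase_dependencies: List[str] = field(default_factory=list)  # cases that must run first


# Example drawn from the ATM discussion earlier in this chapter.
tc01 = TestCaseSpecification(
    identifier="TC01",
    test_items=["ATM cash withdrawal"],
    input_specification="Withdraw $200 from an account with $165 in it.",
    output_specification="Unresolved until the 'up to $200 or the maximum' ambiguity is clarified.",
)
```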
Implementations are the actual test procedures to be executed, along with their supporting test data, test files or test environments, and any supporting test code that is required.
Roles and Responsibilities in STEP
Roles and responsibilities for various testing activities are defined by STEP. The four major roles of manager, analyst, technician, and reviewer are listed in Table 1-4.
Table 1-4: Roles and Responsibilities in STEP

| Role | Description of Responsibilities |
|---|---|
| Manager | Communicate, plan, and coordinate. |
| Analyst | Plan, inventory, design, and evaluate. |
| Technician | Implement, execute, and check. |
| Reviewer | Examine and evaluate. |
These roles are analogous to their counterpart roles in software development. The test manager is responsible for providing overall test direction and coordination, and communicating key information to all interested parties. The test analyst is responsible for detailed planning, inventorying of test objectives and coverage areas, test designs and specifications, and test review and evaluation. The test technician is responsible for implementation of test procedures and test sets according to the designs provided by the analyst, for test execution and checking of results for termination criteria, and for test logging and problem reporting. The test reviewer provides review and oversight over all steps and work products in the process.
The STEP methodology does not require that these roles be filled by different individuals. On small projects, it's possible that one person may wear all four hats: manager, analyst, technician, and reviewer. On larger projects and as a test specialty becomes more refined in an organization, the roles will tend to be assigned to different individuals and test specialty career paths will develop.
Key Point: On smaller projects, it's possible that one person may wear all four hats: manager, analyst, technician, and reviewer.
Summary of STEP
STEP has been introduced through Software Quality Engineering's (SQE) Systematic Software Testing classes to hundreds of organizations. It's a proven methodology offering significant potential for improving software quality in most companies.
Key differences between STEP and prevalent industry practices are highlighted in Table 1-5. First is the overall goal of the testing activity. STEP is prevention oriented, with a primary focus on finding requirements and design defects through early development of test designs. This results in the second major difference of when major testing activities are begun (e.g., planning timing and activity timing). In STEP, test planning begins during software requirements definition, and testware design occurs in parallel with software design and before coding. Prevalent practice is for planning to begin in parallel with coding and test development to be done after coding.
Table 1-5: Key Differences Between STEP and Prevalent Industry Practices

| Methodology | Focus | Planning Timing | Acquisition Timing | Coverage | Visibility |
|---|---|---|---|---|---|
| STEP | Prevention & Risk Management | Begins During Requirements Definition | Begins During Requirements Definition | Known (Relative to Inventories) | Fully Documented & Evaluated |
| Prevalent Industry Practice | Detection & Demonstration | Begins After Software Design | Begins After Software Design (or Code) | Largely Unknown | Largely Undocumented with Little or No Evaluation |
Another major difference between STEP and prevalent industry practices is the creation of a group of test cases with known coverage (i.e., mapping test cases to inventories of requirements, design, and code). Finally, using the IEEE documents provides full documentation (i.e., visibility) of testing activities.
Key Point: In STEP, test planning begins during software requirements definition and testware design occurs in parallel with software design and before coding.
STEP also requires the careful and systematic development of requirements-based and design-based coverage inventories, and it requires that the resulting test designs be calibrated to these inventories. The result is that in STEP, the test coverage is known and measured (at least with respect to the listed inventories). Prevalent practice largely ignores the issue of coverage measurement and often results in ad hoc or unknown coverage.
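As a rough sketch of what calibration might look like in practice, the example below maps a handful of hypothetical test case designs to a hypothetical requirements inventory and reports which inventory items remain uncovered. The inventory items, test case IDs, and the calibrate() function are illustrative assumptions of ours, not artifacts defined by STEP itself.

```python
# Hypothetical requirements inventory (e.g., drawn from the ATM specification).
inventory = {
    "R1": "A valid user can withdraw up to $200 or the maximum amount in the account.",
    "R2": "Withdrawals are dispensed in whole-dollar amounts only.",
    "R3": "An invalid PIN is rejected.",
}

# Each designed test case lists the inventory items it is intended to exercise.
test_designs = {
    "TC01": ["R1"],
    "TC02": ["R1", "R2"],
}


def calibrate(inventory, test_designs):
    """Return (covered, uncovered) inventory items for the current test designs."""
    covered = {item for items in test_designs.values() for item in items}
    return covered, set(inventory) - covered


covered, uncovered = calibrate(inventory, test_designs)
print(f"Coverage: {len(covered)} of {len(inventory)} inventory items")
print("Not yet covered:", sorted(uncovered))  # R3 has no test designed for it yet
```

However simple the bookkeeping, the point is that coverage becomes a measured, reportable quantity rather than a guess.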
A final major difference lies in the visibility of the full testing process. Every activity in STEP leads to visible work products. From plans, to inventories, to test designs, to test specs, to test sets, to test reports, the process is visible and controlled. Industry practice provides much less visibility, with little or no systematic evaluation of intermediate products.
These differences are significant and not necessarily easy to put into practice. However, the benefits are equally significant and well worth the difficulty and investment.
Key Point: Calibration is the term used to describe the measurement of coverage of test cases against an inventory of requirements and design attributes.