Software Testing Fundamentals: Methods and Metrics

Let's consider the traditional definition of quality assurance. The following definition is taken from the British Standard, BS 4778:

Quality Assurance: All those planned and systematic actions necessary to provide adequate confidence that a product or service will satisfy given requirements for quality.

Testers and managers need to be sure that all activities of the test effort are adequate and properly executed. The body of knowledge, or set of methods and practices used to accomplish these goals, is quality assurance. Quality assurance is responsible for ensuring the quality of the product. Software testing is one of the tools used to ascertain the quality of software. In many organizations, the testers are also responsible for quality assurance-that is, ensuring the quality of the software. In the United States, few software development companies have full-time staff devoted to quality assurance. The reason for this lack of dedicated staff is simple. In most cases, traditional formal quality assurance is not a cost-effective way to add value to the product.

A 1995 report by Capers Jones, "Software Quality for 1995: What Works and What Doesn't," for Software Productivity Research, gives the performance of the four most common defect removal practices in the industry today: formal design inspections, formal code inspections, formal quality assurance, and formal testing. The efficiency of bug removal for these methods used individually is as follows:

Formal design inspections: 45%-68%
Formal software testing: 37%-60%
Formal quality assurance: 32%-55%
No formal methods at all: 30%-50%

When taken in combination:

Formal design inspections and formal code inspections: 70%-90%

The best combination:

Formal design inspections, formal quality assurance, and formal testing: 77%-95%

When used alone, formal quality assurance performs only about five percentage points better than no formal methods at all. It is not possible to isolate its relative worth when it is used in combination with other methods. However, considering the following problems, it can be argued that the contribution of formal quality assurance is minimal.
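The arithmetic behind these combinations is worth making explicit. Here is a minimal sketch in Python, under the simplifying assumption (mine, not from the Jones report) that each practice removes a fixed fraction of whatever defects survive the practices applied before it, using the midpoints of the reported ranges as illustrative inputs:

```python
# Cumulative defect-removal efficiency of practices applied in sequence.
# Assumption (not from the Jones report): each practice removes a fixed
# fraction of the defects that survive the practices run before it.

def combined_efficiency(*efficiencies: float) -> float:
    """Overall fraction of defects removed by chaining the given practices."""
    surviving = 1.0
    for e in efficiencies:
        surviving *= 1.0 - e  # defects that slip past this practice
    return 1.0 - surviving

# Midpoints of the ranges Jones reports (illustrative values only).
design_inspections = 0.565  # 45%-68%
formal_testing = 0.485      # 37%-60%
formal_qa = 0.435           # 32%-55%

print(f"testing alone:    {formal_testing:.0%}")
best = combined_efficiency(design_inspections, formal_qa, formal_testing)
print(f"best combination: {best:.0%}")
# The combination comes out near 87%, inside the 77%-95% range Jones
# reports for design inspections + formal QA + formal testing.
```

Under this independence assumption, chaining three individually mediocre practices still yields strong overall removal, which is consistent with the combined ranges Jones reports.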

Traditional Definitions of Quality That Are Not Applicable

Quality assurance defines quality as "the totality of features or characteristics of a product or service that bear on its ability to satisfy stated or implied needs." The definitions in British Standard 4778 and in ISO 8402, from the International Organization for Standardization (ISO), cite "fitness for purpose" and "conformance with requirements."

Quality is not a thing; it is the measure of a thing. Quality is a metric. The thing that quality measures is excellence. How much excellence does a thing possess? Excellence is the fact or condition of excelling; of superiority; surpassing goodness or merit.

The problem is that the methods put forward by the experts of the 1980s for achieving quality didn't work in the real market-driven world of the 1990s and probably won't be particularly useful for most commercial software makers in the coming decade. For example, Philip B. Crosby is probably best remembered for his book, Quality Is Free (Mentor Books, 1992). In it he describes in nontechnical terms his methods for installing, maintaining, and measuring a comprehensive quality improvement program in your business. The major emphasis is on doing things right the first time. Crosby maintains that this quality is free and that what costs dearly is the rework that you must do when you don't do it right at the get-go.

According to Mr. Crosby's teachings:

Quality is defined as conformance to requirements.
The system for causing quality is prevention, not appraisal.
The performance standard is zero defects.
The measurement of quality is the price of nonconformance.

These concepts are most certainly laudable, but they require a very high level of discipline and maturity to carry out. The fact is that this set of concepts doesn't fit the commercial software development process, because the assumptions it is based on are inaccurate in today's development environment. This is especially true in an environment where no one has ever gone before, and so no one knows what "right the first time" means.

Metaphorically speaking, the folks writing the definitions of quality and the procedures for achieving it were all from some major department store, but the market demand was going toward volume discount pricing. At the time of this writing, Wal-Mart is the dominant player in this field. Wal-Mart developed its own definitions for quality and invented its own methods for achieving it. It did its own market research and tailored its services to meet the actual (real) needs of that market. It didn't just leave it to the designers to guess. The other major point of distinction is Wal-Mart's overwhelming commitment to customer satisfaction. This sets it apart from most commercial software makers. Notice that there is nothing about customer satisfaction in Mr. Crosby's points. By the way, Wal-Mart is bigger than Microsoft.

Fact: 

If all you have is a hammer, then everything looks like a nail.

Get the right tool for the job. Overplanning and underplanning the product are two of the main failings in software development efforts today. While a safety-critical or high-reliability effort will fail if it is underplanned, in today's market, it will also fail if it falls into the trap of overplanning-trying to build too good a product for the technology environment and the market. The entrepreneurs are more concerned with planning to make money. They are not going to be bogged down by cumbersome quality assurance procedures that might give them only a marginal improvement.

So, on one end of the spectrum, we have the PC-based commercial software developers who have successfully marketed all manner of semifunctional and sometimes reliable products, and on the other end, we have the high-reliability and safety-critical software developers who must always provide reliable, functioning products. Over the years, consumers have come to expect the price and rapid release schedule of the entrepreneurial commercial software systems. The real problem started when they began to demand the same pricing and release/update schedule from the high-reliability folks. Mature companies like Boeing and Honeywell have faced a terrible challenge to their existence because they must maintain best-practice quality assurance and compete with the shrink-wrappers at the same time.

Some sobering thoughts ... I found it a truly terrifying experience when I realized that the software monitoring system I was testing on the Microsoft Windows platform would be monitoring critical systems in a nuclear power plant. This was the same operating system that would let my fellow testers lock up the entire air defense system network of a small but strategic country by moving the mouse back and forth too fast on an operator console. These are only a couple of examples of the types of compromises software developers and the market are making these days.

Some Faulty Assumptions

Formal quality assurance principles are based on a number of precepts that are not a good fit for the realities of commercial software development today. The following six precepts are among the most prevalent-and erroneous-in the field today.

Fallacy 1: Quality Requirements Dictate the Schedule

The Facts:

Traditional development models cannot keep up with the demand for consumer software products or the rapidly changing technology that supports them. Today's rich development environment and ready consumer market have sparked the imagination of an enormous number of entrepreneurs. Consequently, this market is incredibly competitive and volatile. Product delivery schedules are often based on a first-to-market strategy. This strategy is well expressed in this 1997 quote from Roger Sherman, director of testing at Microsoft Corporation:

Schedule is often thought to be the enemy of quality, but at Microsoft it is considered to be part of the quality of the product.

(Microsoft studied its market and made its own definitions of quality based on the needs of that market.) Most software developed in RAD/Agile shops has a life expectancy of 3 to 12 months. The technology it services-PCs, digitizers, fax/modems, video systems, and so on-generally turns over every 12 months. The maximum desirable life expectancy of a current hardware/software system in the commercial domain is between 18 and 24 months. In contrast, traditional quality assurance principles are geared for products with a design life expectancy measured in decades.

Fallacy 2: Quality = Reliability

This equation is interpreted as "zero defects is a requirement for a high-quality product."

The Facts:

The commercial software market (with a few exceptions) is not willing to pay for a zero-defect product or a 100 percent reliable product.

Users don't care about faults that don't ever become bugs, and users will forgive most bugs if they can work around them, especially if the features are great and if the price is right. For example, in many business network environments in 1994 and 1995, users religiously saved their work before trying to print it. The reason: About one in four print jobs submitted to a certain type of printer using a particular software printer driver would lock up the user's workstation and result in the loss of any unsaved work. Even though many thousands of users were affected, the bug was tolerated for many months because the effects could be limited to simply rebooting the user's workstation occasionally.

Safety-critical and mission-critical applications are the notable exceptions to this fact. Consumers are willing to pay for reliability when the consequences of a failure are potentially lethal. However, the makers of these critical software systems are faced with the same market pressures from competition and constantly changing technology as the consumer software makers.

Fallacy 3: Users Know What They Want

The Facts:

Users generally cannot describe what they want in enough detail to guide design. For example, if you asked several banking customers if they would like to be able to pay their bills online, many would say yes. But that response does not help the designer determine what type of bills customers will want to pay or how much they will use any particular type of payment feature. Consequently, in a well-funded development project, it is common to see every conceivable feature being implemented.

I once ported a client/server application to the Web that produced 250 different reports on demand. When I researched the actual customer usage statistics to determine which reports were the most requested, and therefore the most important to implement first, I discovered that only 30 of the 250 reports had ever been requested. Yet each one had been implemented to satisfy a customer request.
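The usage research described above amounts to a simple tally. A minimal sketch in Python, assuming a hypothetical request log with one report ID per line (the file name and log format are illustrative; the original project's logging is not described):

```python
# Rank reports by how often users actually requested them.
# Hypothetical input: a text log with one requested report ID per line.
from collections import Counter

def rank_reports(log_path: str) -> list[tuple[str, int]]:
    """Return (report_id, request_count) pairs, most-requested first."""
    with open(log_path) as log:
        counts = Counter(line.strip() for line in log if line.strip())
    return counts.most_common()

if __name__ == "__main__":
    usage = rank_reports("report_requests.log")  # hypothetical file name
    print(f"{len(usage)} distinct reports were ever requested")
    for report_id, count in usage[:10]:  # the ten most-requested reports
        print(f"{report_id}: {count} requests")
```

A ranking like this tells you both which features to port first and which features no one will miss.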

Fallacy 4: The Requirements Will Be Correct

This fallacy assumes that designers can produce what the users want the first time, without actually building the product or going through trial-and-error cycles.

The Facts:

Requirements are rarely correct at the outset. They are discovered and corrected through exactly the build-and-react cycles that this fallacy assumes away.

Fallacy 5: Users Will Accept a Boring Product if the Features and Reliability Are Good

The Facts:

They will not. Given the choice, users consistently pick the more engaging product, even when the boring one is more reliable. For example, let's consider color and graphics. DOS, with its simple black-and-green appearance on screen, was very reliable compared to the first several Windows operating systems, yet it became extinct all the same.

Color printers have come to dominate the printer world in only a few short years. The cost to purchase one may be low, but the life expectancy is short. The cost of ownership is high (color ink is very expensive), yet they have become the status quo, successfully supplanting the tried-and-true, fast, reliable, and economical black-and-white laser printer.

In the United States, third generation (3G) cell phones don't yet have 3G networks to support them; yet because of their brilliant color displays, their ability to use picture screen savers, and their ability to play tunes, they are outselling excellent 2G cell phones that offer superior feature sets and work reliably in today's cellular networks.

Fallacy 6: Product Maturity Is Required

The Facts:

Maturity offers a product little protection in today's market. The very mature premier high-end digital video creation software system has been supplanted by two new editing systems that provide about 10 percent of its features at 10 percent of the price. In addition, the new systems can be purchased and downloaded over the Internet, whereas the premier system cannot. We are also seeing this trend in large system software. The typical scenario involves dropping a massive, entrenched, expensive client/server system and replacing it with a lightweight, Web-based, database-driven application.

This relates also to Fallacy 3: Users know what they want. When analysis is performed on the current system, a frequent discovery is that the customers are paying for lots of features they are not using. Once the correct feature set has been determined, it can often be implemented quickly in a new Web-based application-where it can run very inexpensively.

Feature maturity is a far more important consideration than product maturity. As I have already pointed out, most consumers have realized that the "latest release" of software is not necessarily more reliable than the previous release. So product maturity is most often a myth. A mature product or system is typically overburdened by feature bloat.
