Software Testing Fundamentals: Methods and Metrics

Let's consider the traditional definition of quality assurance. The following definition is taken from the British Standard, BS 4778:

Quality Assurance: All those planned and systematic actions necessary to provide adequate confidence that a product or service will satisfy given requirements for quality.

Testers and managers need to be sure that all activities of the test effort are adequate and properly executed. The body of knowledge, or set of methods and practices used to accomplish these goals, is quality assurance. Quality assurance is responsible for ensuring the quality of the product. Software testing is one of the tools used to ascertain the quality of software. In many organizations, the testers are also responsible for quality assurance-that is, ensuring the quality of the software. In the United States, few software development companies have full-time staff devoted to quality assurance. The reason for this lack of dedicated staff is simple. In most cases, traditional formal quality assurance is not a cost-effective way to add value to the product.

A 1995 report by Capers Jones, "Software Quality for 1995: What Works and What Doesn't," for Software Productivity Research, gives the performance of the four most common defect removal practices in the industry today: formal design inspections, formal code inspections, formal quality assurance, and formal testing. The efficiency of bug removal for these methods used individually is as follows:

Formal design inspections: 45%-68%
Formal software testing: 37%-60%
Formal quality assurance: 32%-55%
No formal methods at all: 30%-50%

When taken in combination:

Formal design inspections and formal code inspections: 70%-90%

The best combination:

Formal design inspections, formal quality assurance, and formal testing: 77%-95%

When used alone, formal quality assurance performs only about five percentage points better than no formal methods at all. It is not possible to isolate its relative worth when it is used in combination with other methods. However, considering the following problems, it can be argued that the contribution of formal quality assurance is minimal.
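The arithmetic behind these combinations is worth making explicit. Here is a minimal sketch in Python, under the simplifying assumption (mine, not from the Jones report) that each practice removes a fixed fraction of whatever defects survive the practices applied before it, using the midpoints of the reported ranges as illustrative inputs:

```python
# Cumulative defect-removal efficiency of practices applied in sequence.
# Assumption (not from the Jones report): each practice removes a fixed
# fraction of the defects that survive the practices run before it.

def combined_efficiency(*efficiencies: float) -> float:
    """Overall fraction of defects removed by chaining the given practices."""
    surviving = 1.0
    for e in efficiencies:
        surviving *= 1.0 - e  # defects that slip past this practice
    return 1.0 - surviving

# Midpoints of the ranges Jones reports (illustrative values only).
design_inspections = 0.565  # 45%-68%
formal_testing = 0.485      # 37%-60%
formal_qa = 0.435           # 32%-55%

print(f"testing alone:    {formal_testing:.0%}")
best = combined_efficiency(design_inspections, formal_qa, formal_testing)
print(f"best combination: {best:.0%}")
# The combination comes out near 87%, inside the 77%-95% range Jones
# reports for design inspections + formal QA + formal testing.
```

Under this independence assumption, chaining three individually mediocre practices still yields strong overall removal, which is consistent with the combined ranges Jones reports.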

Traditional Definitions of Quality That Are Not Applicable

Quality assurance defines quality as "the totality of features or characteristics of a product or service that bear on its ability to satisfy stated or implied needs." The definitions in British Standard 4778 and in ISO 8402, from the International Organization for Standardization (ISO), cite "fitness for purpose" and "conformance with requirements."

Quality is not a thing; it is the measure of a thing. Quality is a metric. The thing that quality measures is excellence. How much excellence does a thing possess? Excellence is the fact or condition of excelling; of superiority; surpassing goodness or merit.

The problem is that the methods put forward by the experts of the 1980s for achieving quality didn't work in the real market-driven world of the 1990s and probably won't be particularly useful for most commercial software makers in the coming decade. For example, Philip B. Crosby is probably best remembered for his book, Quality Is Free (Mentor Books, 1992). In it he describes in nontechnical terms his methods for installing, maintaining, and measuring a comprehensive quality improvement program in your business. The major emphasis is on doing things right the first time. Crosby maintains that this quality is free and that what costs dearly is the rework that you must do when you don't do it right at the get-go.

According to Mr. Crosby's teachings:

Quality is defined as conformance to requirements.
The system for causing quality is prevention, not appraisal.
The performance standard is zero defects.
The measurement of quality is the price of nonconformance.

These concepts are most certainly laudable, but they require a very high level of discipline and maturity to carry out. The fact is that this set of concepts doesn't fit the commercial software development process, because the assumptions it is based on are inaccurate in today's development environment. This is especially true in an environment where no one has ever gone before, and so no one knows what "right the first time" means.

Metaphorically speaking, the folks writing the definitions of quality and the procedures for achieving it were all from some major department store, but the market demand was going toward volume discount pricing. At the time of this writing, Wal-Mart is the dominant player in this field. Wal-Mart developed its own definitions for quality and invented its own methods for achieving it. It did its own market research and tailored its services to meet the actual (real) needs of that market. It didn't just leave it to the designers to guess. The other major point of distinction is Wal-Mart's overwhelming commitment to customer satisfaction. This sets it apart from most commercial software makers. Notice that there is nothing about customer satisfaction in Mr. Crosby's points. By the way, Wal-Mart is bigger than Microsoft.

Fact: 

If all you have is a hammer, then everything looks like a nail.

Get the right tool for the job. Overplanning and underplanning the product are two of the main failings in software development efforts today. While a safety-critical or high-reliability effort will fail if it is underplanned, in today's market, it will also fail if it falls into the trap of overplanning-trying to build too good a product for the technology environment and the market. The entrepreneurs are more concerned with planning to make money. They are not going to be bogged down by cumbersome quality assurance procedures that might give them only a marginal improvement.

So, on one end of the spectrum, we have the PC-based commercial software developers who have successfully marketed all manner of semifunctional and sometimes reliable products, and on the other end, we have the high-reliability and safety-critical software developers who must always provide reliable, functioning products. Over the years, consumers have come to expect the price and rapid release schedule of the entrepreneurial commercial software systems. The real problem started when they began to demand the same pricing and release/update schedule from the high-reliability folks. Mature companies like Boeing and Honeywell have faced a terrible challenge to their existence because they must maintain best-practice quality assurance and compete with the shrink-wrappers at the same time.

Some sobering thoughts ... I found it a truly terrifying experience when I realized that the software monitoring system I was testing on the Microsoft Windows platform would be monitoring critical systems in a nuclear power plant. This was the same operating system that would let my fellow testers lock up the entire air defense system network of a small but strategic country by moving the mouse back and forth too fast on an operator console. These are only a couple of examples of the types of compromises software developers and the market are making these days.

Some Faulty Assumptions

Formal quality assurance principles are based on a number of precepts that are not a good fit for the realities of commercial software development today. The following six precepts are among the most prevalent-and erroneous-in the field today.

Fallacy 1: Quality Requirements Dictate the Schedule

The Facts:

Traditional development models cannot keep up with the demand for consumer software products or the rapidly changing technology that supports them. Today's rich development environment and ready consumer market have sparked the imagination of an enormous number of entrepreneurs. Consequently, this market is incredibly competitive and volatile. Product delivery schedules are often based on a first-to-market strategy. This strategy is well expressed in this 1997 quote from Roger Sherman, director of testing at Microsoft Corporation:

Schedule is often thought to be the enemy of quality, but at Microsoft it is considered to be part of the quality of the product.

(Microsoft studied its market and made its own definitions of quality based on the needs of that market.) Most software developed in RAD/Agile shops has a life expectancy of 3 to 12 months. The technology it services-PCs, digitizers, fax/modems, video systems, and so on-generally turns over every 12 months. The maximum desirable life expectancy of a current hardware/software system in the commercial domain is between 18 and 24 months. In contrast, traditional quality assurance principles are geared for products with a design life expectancy measured in decades.

Fallacy 2: Quality = Reliability

This equation is interpreted as "zero defects is a requirement for a high-quality product."

The Facts:

The commercial software market (with a few exceptions) is not willing to pay for a zero-defect product or a 100 percent reliable product.

Users don't care about faults that don't ever become bugs, and users will forgive most bugs if they can work around them, especially if the features are great and if the price is right. For example, in many business network environments in 1994 and 1995, users religiously saved their work before trying to print it. The reason: About one in four print jobs submitted to a certain type of printer using a particular software printer driver would lock up the user's workstation and result in the loss of any unsaved work. Even though many thousands of users were affected, the bug was tolerated for many months because the effects could be limited to simply rebooting the user's workstation occasionally.

Safety-critical and mission-critical applications are the notable exceptions to this fact. Consumers are willing to pay for reliability when the consequences of a failure are potentially lethal. However, the makers of these critical software systems are faced with the same market pressures from competition and constantly changing technology as the consumer software makers.

Fallacy 3: Users Know What They Want

The Facts:

Users generally cannot describe what they want in enough detail to guide design. For example, if you asked several banking customers if they would like to be able to pay their bills online, many would say yes. But that response does not help the designer determine what type of bills customers will want to pay or how much they will use any particular type of payment feature. Consequently, in a well-funded development project, it is common to see every conceivable feature being implemented.

I once ported a client/server application to the Web that produced 250 different reports on demand. When I researched the actual customer usage statistics to determine which reports were the most requested, and therefore the most important to implement first, I discovered that only 30 of the 250 reports had ever been requested. Yet each one had been implemented to satisfy a customer request.
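The usage research described above amounts to a simple tally. A minimal sketch in Python, assuming a hypothetical request log with one report ID per line (the file name and log format are illustrative; the original project's logging is not described):

```python
# Rank reports by how often users actually requested them.
# Hypothetical input: a text log with one requested report ID per line.
from collections import Counter

def rank_reports(log_path: str) -> list[tuple[str, int]]:
    """Return (report_id, request_count) pairs, most-requested first."""
    with open(log_path) as log:
        counts = Counter(line.strip() for line in log if line.strip())
    return counts.most_common()

if __name__ == "__main__":
    usage = rank_reports("report_requests.log")  # hypothetical file name
    print(f"{len(usage)} distinct reports were ever requested")
    for report_id, count in usage[:10]:  # the ten most-requested reports
        print(f"{report_id}: {count} requests")
```

A ranking like this tells you both which features to port first and which features no one will miss.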

Fallacy 4: The Requirements Will Be Correct

This fallacy assumes that designers can produce what the users want the first time, without actually building the product or going through trial-and-error cycles.

The Facts:

Requirements are rarely correct at the outset. They are discovered and corrected through exactly the build-and-react cycles that this fallacy assumes away.

Fallacy 5: Users Will Accept a Boring Product if the Features and Reliability Are Good

The Facts:

They will not. Given the choice, users consistently pick the more engaging product, even when the boring one is more reliable. For example, let's consider color and graphics. DOS, with its simple black-and-green appearance on screen, was very reliable compared to the first several Windows operating systems, yet it became extinct all the same.

Color printers have come to dominate the printer world in only a few short years. The cost to purchase one may be low, but the life expectancy is short. The cost of ownership is high (color ink is very expensive), yet they have become the status quo, successfully supplanting the tried-and-true, fast, reliable, and economical black-and-white laser printer.

In the United States, third generation (3G) cell phones don't yet have 3G networks to support them; yet because of their brilliant color displays, their ability to use picture screen savers, and their ability to play tunes, they are outselling excellent 2G cell phones that offer superior feature sets and work reliably in today's cellular networks.

Fallacy 6: Product Maturity Is Required

The Facts:

Maturity offers a product little protection in today's market. The very mature premier high-end digital video creation software system has been supplanted by two new editing systems that provide about 10 percent of its features at 10 percent of the price. In addition, the new systems can be purchased and downloaded over the Internet, whereas the premier system cannot. We are also seeing this trend in large system software. The typical scenario involves dropping a massive, entrenched, expensive client/server system and replacing it with a lightweight, Web-based, database-driven application.

This relates also to Fallacy 3: Users know what they want. When analysis is performed on the current system, a frequent discovery is that the customers are paying for lots of features they are not using. Once the correct feature set has been determined, it can often be implemented quickly in a new Web-based application-where it can run very inexpensively.

Feature maturity is a far more important consideration than product maturity. As I have already pointed out, most consumers have realized that the "latest release" of software is not necessarily more reliable than the previous release. So product maturity is most often a myth. A mature product or system is typically overburdened by feature bloat.
