Quality and Quality Management Metrics
In procedural programming, quality is measured by defects per thousand LOC (KLOC), defects per function point, mean time to failure, and many other metrics and models, such as those discussed in several previous chapters. The OO counterpart of defects per KLOC and defects per function point is defects per class. In our search for empirical data on OO defect rates, we found that data about OO quality is even scarcer than productivity data. Table 12.7 shows the data that we tracked for some of the projects discussed in this chapter.
Testing defect rates for these projects ranged from 0.21 to 0.82 defects per class and from 2.6 to 8.2 defects per KLOC (new and changed code). In our long history of defect tracking, defect rates during testing, while the products were under development, ranged from about 4 to 9 defects per KLOC for procedural programming. The defect rates of these OO projects therefore compare favorably with our history. After one year in the field, the defect rates of these products ranged from 0.01 to 0.05 defects per class and from 0.05 to 0.78 defects per KLOC. Again, these figures, except the defects/KLOC for Project B, compare well with our history.
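To make the two rate definitions concrete, the short sketch below computes defects per class and defects per KLOC from hypothetical counts; the figures and variable names are ours for illustration and are not taken from Table 12.7.

```python
# Hypothetical counts for illustration only; not data from Table 12.7.
testing_defects = 62      # defects found during testing
num_classes = 225         # classes in the OO system
kloc_new_changed = 20.0   # thousand lines of new and changed code

defects_per_class = testing_defects / num_classes
defects_per_kloc = testing_defects / kloc_new_changed

print(f"Defects/Class: {defects_per_class:.2f}")   # 0.28
print(f"Defects/KLOC:  {defects_per_kloc:.2f}")    # 3.10
```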
With regard to quality management, the OO design and complexity metrics can be used to flag classes with potential problems for special attention, as is the practice at the NASA SATC. Researchers appear to have started focusing on the empirical validation of the proposed metrics and on relating those metrics to managerial variables. This is certainly the right direction for strengthening the practical value of OO metrics. In terms of metrics and models for in-process quality management while a project is under development, we contend that most of the metrics discussed in this book are relevant to OO projects, for example, defect removal effectiveness, the inspection self-assessment checklist (Table 9.1), the software reliability growth models (Chapters 8 and 9), and the many metrics for testing (Chapter 10). Based on our experience, the metrics for testing apply equally well to OO projects. We recommend the following (a small tracking sketch follows the list):
Table 12.7. Testing Defect Rate and Field Defect Rates for Some OO Projects
| | Project B (C++) | Project C (C++) | Project D (Smalltalk) | Project F (Smalltalk) |
|---|---|---|---|---|
| Testing Defect Rate | | | | |
| Defects/Class | 0.21 | 0.82 | 0.27 | 0.69 |
| Defects/KLOC | 3.1 | 8.2 | 2.6 | 5.9 |
| Field Defect Rate (1 Year After Delivery) | | | | |
| Defects/Class | 0.05 | na | 0.04 | 0.01 |
| Defects/KLOC | 0.78 | na | 0.41 | 0.05 |
- Test progress S curve
- Testing defect arrivals over time
- Testing defect backlog over time
- Number of critical problems over time
- Number of system crashes and hangs over time as a measure of system stability
- The effort/outcome paradigm for interpreting in-process metrics and for in-process quality management
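As one illustration of these recommendations, the following sketch computes the testing defect backlog over time from weekly arrival and closure counts; the weekly figures are invented for illustration and are not taken from any of the projects discussed here.

```python
# Illustrative sketch: testing defect backlog over time, computed from
# weekly defect arrivals and closures. The weekly counts are invented;
# they are not data from any project in this chapter.
arrivals = [5, 9, 14, 18, 15, 11, 7, 4, 2, 1]   # defects opened per week
closures = [2, 6, 10, 15, 16, 14, 10, 6, 3, 2]  # defects closed per week

open_defects = 0
for week, (opened, closed) in enumerate(zip(arrivals, closures), start=1):
    open_defects += opened - closed
    print(f"Week {week:2d}: arrivals={opened:2d} closures={closed:2d} "
          f"backlog={open_defects:2d}")

# A backlog that keeps rising late in the test cycle is a warning sign
# under the effort/outcome interpretation discussed in earlier chapters.
```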
Furthermore, when some of the simple OO metrics discussed in this chapter are put into the context of in-process tracking and analysis (for example, trend charts), they can be very useful in-process metrics for project and quality management. We illustrate this point with the example of a small project.
Project MMD was a small, independent project developed in Smalltalk with OO methodology and an iterative development process over a period of 19 weeks (Hanks, 1998). The software provided functions to drive multimedia devices (e.g., audio and video equipment) and contained 40 classes with about 3,200 lines of code. The team consisted of four members: two developers for analysis, design, and coding, and two testers for testing, tracking, and the other tasks required for the product to be ready to ship. With good in-process tracking and a clear understanding of roles and responsibilities, the team conducted weekly status meetings to keep the project moving. Four major iterations were completed during the development process. Figure 12.3 shows the trends of several design and code metrics over time; they all met the threshold values recommended by Lorenz (1993). Figure 12.4 shows the number of classes and the number of classes discarded over time (i.e., metric number 11 in Table 12.1). The trend charts reflect the several iterations and the fact that an iterative development process was used.
Figure 12.3. Trends of Several OO Metrics
Figure 12.4. Class Statistics over Time
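To show how such simple metrics can be tracked in-process, here is a minimal sketch that flags classes exceeding threshold values of the kind recommended by Lorenz (1993); the class names, metric values, and thresholds are illustrative assumptions, not Project MMD's data.

```python
# A minimal sketch of flagging classes against threshold values of the
# kind recommended by Lorenz (1993). The thresholds and the per-class
# numbers below are illustrative assumptions, not the project's data.
THRESHOLDS = {
    "methods_per_class": 20,
    "avg_method_loc": 8,       # Smalltalk-sized methods
    "instance_variables": 6,
}

classes = {
    "AudioDevice":  {"methods_per_class": 14, "avg_method_loc": 6, "instance_variables": 4},
    "VideoDevice":  {"methods_per_class": 23, "avg_method_loc": 9, "instance_variables": 5},
    "MediaSession": {"methods_per_class": 11, "avg_method_loc": 5, "instance_variables": 7},
}

for name, metrics in classes.items():
    violations = [m for m, v in metrics.items() if v > THRESHOLDS[m]]
    if violations:
        print(f"{name}: review recommended ({', '.join(violations)})")
    else:
        print(f"{name}: within thresholds")
```

Run weekly, this kind of check produces the trend data plotted in a chart like Figure 12.3.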
Figure 12.5 shows the relationship between defect arrivals and testing time, with a fitted curve based on the delayed S-shaped reliability growth model. This curve fitting confirms the applicability of reliability growth models to data from OO projects. In fact, we contend that the fit may be even better than for data from procedural software projects because, in an OO environment with its class structure, the more difficult bugs tend to be detected and "flushed out" earlier in the testing process.
Figure 12.5. OO Testing Defect Arrivals Follow the Pattern of a Software Reliability Growth Model
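As a sketch of the kind of curve fitting behind Figure 12.5, the example below fits the delayed S-shaped growth model m(t) = a(1 - (1 + bt)e^(-bt)) to cumulative weekly defect arrivals using scipy's curve_fit; the weekly counts are invented and are not Project MMD's data.

```python
# Sketch: fit the delayed S-shaped reliability growth model,
#   m(t) = a * (1 - (1 + b*t) * exp(-b*t)),
# to cumulative testing defect arrivals. The weekly counts below are
# invented for illustration; they are not Project MMD's data.
import numpy as np
from scipy.optimize import curve_fit

def delayed_s(t, a, b):
    return a * (1.0 - (1.0 + b * t) * np.exp(-b * t))

weeks = np.arange(1, 11)
cumulative_defects = np.array([2, 7, 15, 26, 37, 46, 52, 56, 58, 59])

params, _ = curve_fit(delayed_s, weeks, cumulative_defects, p0=[60.0, 0.5])
a_hat, b_hat = params
print(f"Estimated total defects a = {a_hat:.1f}, shape parameter b = {b_hat:.2f}")
```

The estimated parameter a projects the total defect content, which can be compared with defects found to date as testing progresses.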
Finally, if we use this project as another data point for productivity estimates, with 40 classes and 76 person-weeks (4 x 19 weeks) and assuming 4.33 person-weeks per person-month (PM), we get 2.3 classes per PM. This number falls between the numbers for the framework-related projects (1.2 and 1.9 classes per PM) and the mature systems software projects (about 4 classes per PM).
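The arithmetic behind that estimate, as a quick check:

```python
# Productivity estimate for Project MMD (figures from the text above).
classes = 40
person_weeks = 4 * 19                # 4 team members x 19 weeks = 76
person_months = person_weeks / 4.33  # about 17.6 PM
print(f"{classes / person_months:.1f} classes per PM")  # about 2.3
```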