Defect Removal Effectiveness and Process Maturity Level
Based on a special study commissioned by the Department of Defense, Jones (Software Productivity Research, 1994; Jones, 2000) estimates the defect removal effectiveness for organizations at different levels of the development process capability maturity model (CMM):
Level 1: 85%
Level 2: 89%
Level 3: 91%
Level 4: 93%
Level 5: 95%
These values can be used as comparison baselines for organizations to evaluate their relative capability with regard to this important parameter.
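To make the comparison concrete, a removal effectiveness percentage can be read as the fraction of all defects introduced during development that is removed before shipment; the remainder escapes to the field. The following minimal sketch, using a hypothetical total of 1,000 defects, shows what the baselines above imply for field escapes:

    total_defects = 1000   # hypothetical total defects introduced during development

    for level, effectiveness in {1: 0.85, 2: 0.89, 3: 0.91, 4: 0.93, 5: 0.95}.items():
        escaped = total_defects * (1 - effectiveness)   # defects escaping to the field
        print(f"CMM level {level}: about {escaped:.0f} field defects per 1,000 introduced")

Under this reading, moving from level 1 to level 5 cuts field escapes from roughly 150 to 50 per 1,000 defects introduced.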
In a discussion of quantitative process management (a process area of Capability Maturity Model Integration, CMMI, level 4) and process capability baselines, Curtis (2002) shows the estimated baselines for defect removal effectiveness by phase of defect insertion (or defect origin, in our terminology). The cumulative percentages of defects removed up through acceptance test (the last phase before the product is shipped), by phase of insertion, for CMMI level 4 are shown in Table 6.4. Based on historical and recent data from three software engineering organizations at General Dynamics Decision Systems, Diaz and King (2002) report the following phase containment effectiveness by CMM level:
Level 2: 25.5%
Level 3: 41.5%
Level 4: 62.3%
Level 5: 87.3%
Table 6.4. Cumulative Percentages of Defects Removed by Phase for CMMI Level 4
Phase Inserted: Cumulative % of Defects Removed Through Acceptance Test
Requirements: 94%
Top-level design: 95%
Detailed design: 96%
Code and unit test: 94%
Integration test: 75%
System test: 70%
Acceptance test: 70%
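To make the relationship between these two kinds of figures concrete, the sketch below uses a small, entirely hypothetical matrix of defect counts by phase of origin and phase of removal (it is not derived from Table 6.4 or from the studies cited above, and the phase list is simplified). Phase containment effectiveness is computed as the fraction of defects removed in the same phase in which they were inserted; the cumulative percentage removed before shipment is the fraction of each origin phase's defects removed in any in-process phase.

    # Hypothetical counts of defects by phase of origin (keys) and phase of removal (nested keys).
    # "field" counts defects that escaped all in-process removal activities.
    defects = {
        "requirements": {"requirements": 60, "design": 20, "code": 5, "test": 10, "field": 5},
        "design":       {"design": 50, "code": 25, "test": 20, "field": 5},
        "code":         {"code": 70, "test": 25, "field": 5},
    }

    for origin, removed_by_phase in defects.items():
        total = sum(removed_by_phase.values())
        contained = removed_by_phase.get(origin, 0)   # removed in the same phase in which inserted
        escaped = removed_by_phase.get("field", 0)    # reached the field
        containment = contained / total               # phase containment effectiveness
        cumulative = (total - escaped) / total        # removed through the last pre-ship phase
        print(f"{origin}: containment {containment:.0%}, "
              f"cumulative removal before ship {cumulative:.0%}")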
It is not clear how many key phases there are in the development process for these projects, or how much containment effectiveness varies across phases. It appears that these statistics represent the average effectiveness of peer reviews and testing for a number of projects at each maturity level. Therefore, these statistics could perhaps be roughly interpreted as overall inspection effectiveness or overall test effectiveness.
According to Jones (2000), in general, most forms of testing are less than 30% efficient. The cumulative efficiency of a sequence of test stages, however, can top 80%.
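These two figures are compatible if each test stage is assumed to remove a fraction of the defects that remain when it starts; under that simplifying assumption, a sequence of individually modest stages compounds to a high cumulative efficiency. A rough sketch with illustrative stage efficiencies (not Jones's data):

    stage_efficiencies = [0.30, 0.30, 0.25, 0.30, 0.25]   # hypothetical per-stage removal fractions

    escape_fraction = 1.0
    for efficiency in stage_efficiencies:
        escape_fraction *= (1.0 - efficiency)   # defects that survive this stage

    cumulative_efficiency = 1.0 - escape_fraction
    print(f"Cumulative efficiency of the test sequence: {cumulative_efficiency:.0%}")   # about 81%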
These findings show a certain level of consistency with one another and with the example in Figure 6.4. The Figure 6.4 example is based on a real-life project. No process maturity assessment was conducted for the project, but the process was mature and quantitatively managed. Based on the key process practices and the excellent field quality results, the project was likely at level 4 or level 5 of a process maturity scale.
More empirical studies and findings on this subject will surely produce useful knowledge. For example, test effectiveness and inspection effectiveness by process maturity, the characteristics of distributions at each maturity level, and variations across types of software are all areas for which reliable benchmark baselines are needed.
Recommendations for Small Organizations
Defect removal effectiveness is a direct indicator of the capability of a software development process to remove defects before the software is shipped. It is one of the few process indicators, perhaps the only one, that bears a direct correlation with the quality of the software's field performance. In this chapter we examine several aspects of defect removal effectiveness, including (1) overall effectiveness, (2) inspection effectiveness, (3) test effectiveness, (4) phase-specific effectiveness, and (5) the role of defect removal effectiveness in quality planning. For small organizations starting a metrics program with constrained resources, I recommend the following approach:
1. Start with the overall defect removal effectiveness indicator and the test effectiveness indicator (a calculation sketch follows this list).
2. Assess the stability or variation of these indicators across projects and implement systematic actions to improve them.
3. Compare with industry baselines (such as those discussed in section 6.5) to determine the organization's process maturity level with regard to this parameter. Because defect removal effectiveness is a relative percentage measure, comparisons across teams and organizations are possible.
4. Begin examining inspection effectiveness as well, and loop back to step 2 for continuous improvement.
5. If a tracking system is established, gather data on defect origins (in addition to where defects were found) and start implementing phase-specific effectiveness metrics to guide specific improvements.
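As a starting point for step 1, both indicators can be computed from counts that most defect-tracking systems already capture. The sketch below uses hypothetical counts and assumes test effectiveness is defined as defects removed by testing divided by the defects present when testing starts (those removed by testing plus those found later, including in the field):

    found_before_test = 400   # hypothetical: defects removed by reviews and inspections
    found_in_test = 500       # hypothetical: defects removed by all test stages combined
    found_in_field = 100      # hypothetical: defects reported by customers after shipment

    # Overall defect removal effectiveness: defects removed before shipment over all defects.
    overall_effectiveness = (found_before_test + found_in_test) / (
        found_before_test + found_in_test + found_in_field)

    # Test effectiveness: defects removed by testing over defects present at the start of testing.
    test_effectiveness = found_in_test / (found_in_test + found_in_field)

    print(f"Overall defect removal effectiveness: {overall_effectiveness:.0%}")   # 90%
    print(f"Test effectiveness: {test_effectiveness:.0%}")                        # about 83%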
The combination of where-found and defect-origin data is useful in many situations, not just for calculating phase effectiveness. For example, defect cause analysis that includes phase of origin, conducted during the testing phase of a project, often provides clues for further actions before the product is shipped. However, tracking both where-found and phase-origin data for all defects does require additional resources.
If resources are severely constrained and data tracking for the front end of the process (e.g., requirements, design reviews, and code inspections) is not available, then use the test effectiveness metric. In terms of improvement actions, however, a focus on the entire development process is important and strongly recommended. Indeed, even in organizations with more mature metrics programs, tracking and data for the front end of the development process are normally less rigorous than for the testing phases. For small organizations starting a metrics program with minimal resources and the intent to keep the number of metrics to a minimum, using the test effectiveness indicator for measurement while maintaining an overall process focus on requirements, design, coding, and testing is a good strategy. With or without metrics, the importance of a strong focus on requirements, design, and reviews cannot be overstated. Good requirements-gathering and analysis techniques can significantly reduce the volume of requirements changes and secondary requirements.
Good design and programming practices, with effective defect prevention and removal, are crucial to the stability and predictability of the back end of the development process. Projects that bypass design and code inspections may seem to be ahead of schedule, but only until testing begins. When testing begins, a deluge of unexpected errors can bring the project to a standstill, and the development team can become locked into a cycle of finding bugs, attempting to fix them, and retesting that can stretch out for months.