Implementation
Implementation of the Rayleigh model is not difficult. If the defect data (defect counts or defect rates) are reliable, the model parameters can be derived from the data by computer programs (available in many statistical software packages) that use statistical functions. After the model is defined, estimation of end-product reliability can be achieved by substitution of data values into the model.
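For reference, the form of the model that the program in Figure 7.5 works with can be written out as follows. This is read off the program's MODEL statement and its chi-square and latent-error steps; K is the total defect rate, t_m is the time at which the curve peaks (R in the program), and T is the end of the last development phase (T = 6 in the program). The symbol names f and F are chosen here for illustration.

$$
f(t) = K \,\frac{t}{t_m^{2}}\, e^{-t^{2}/(2 t_m^{2})},
\qquad
F(t) = K \left( 1 - e^{-t^{2}/(2 t_m^{2})} \right)
$$

$$
\text{latent-error rate} \;=\; K - F(T) \;=\; K\, e^{-T^{2}/(2 t_m^{2})}
$$

The expected defect rate for a phase that runs from t - 1 to t is F(t) - F(t - 1), which is the quantity the program compares with the observed rates in its goodness-of-fit step. The program also adds a site-specific adjustment factor to the latent-error rate.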
Figure 7.5 shows a simple example of implementation of the Rayleigh model in SAS, which uses the nonlinear regression procedure (PROC NLIN). Of the several methods available for nonlinear regression, we chose the DUD method for its simplicity and efficiency (Ralston and Jennrich, 1978). DUD is a derivative-free algorithm for nonlinear least squares. It competes favorably with even the best derivative-based algorithms when evaluated on a number of standard test problems.
Figure 7.5 An SAS Program for the Rayleigh Model
/*****************************************************************/
/* SAS program for estimating software latent-error rate based */
/* on the Rayleigh model using defect removal data during */
/* development */
/* */
/* ------------------------------------------------------------- */
/* */
/* Assumes: A 6-phase development process: High-level design (I0), */
/* Low-level design (I1), coding (I2), Unit test (UT), */
/* Component test (CT), and System test (ST). */
/* */
/* Program does: */
/* 1) estimate Rayleigh model parameters */
/* 2) plot graph of Rayleigh curve versus actual defect rate */
/* on a GDDM79 terminal screen (e.g., 3279G) */
/* 3) perform chi-square goodness-of-fit test, indicate */
/* whether the model is adequate or not */
/* 4) derive latent error estimate */
/* */
/* User input required: */
/* A: input defect rates and time equivalents of */
/* the six development phases */
/* B: initial values for iteration */
/* C: defect rates */
/* D: adjustment factor specific to product/development */
/* site */
/*****************************************************************/
TITLE1 'RAYLEIGH MODEL - DEFECT REMOVAL PATTERN';
OPTIONS label center missing=0 number linesize=95;

/*****************************************************************/
/* Set label value for graph */
/*****************************************************************/
proc format;
  value jx 0='I0'
           1='I1'
           2='I2'
           3='UT'
           4='CT'
           5='ST'
           6='GA'
           7=' '
  ;

/*****************************************************************/
/* Now we get input data */
/*****************************************************************/
data temp;
/*---------------------------------------------------------------*/
/* INPUT A: */
/* In the INPUT statement below, Y is the defect removal rate */
/* per KLOC, T is the time equivalent for the development */
/* phases: 0.5 for I0, 1.5 for I1, 2.5 for I2, 3.5 for UT, */
/* 4.5 for CT, and 5.5 for ST. */
/* Input data follows the CARDS statement. */
/*---------------------------------------------------------------*/
  INPUT Y T;
  CARDS;
9.2 0.5
11.9 1.5
16.7 2.5
5.1 3.5
4.2 4.5
2.4 5.5
;

/*****************************************************************/
/* Now we estimate the parameters of the Rayleigh distribution */
/*****************************************************************/
proc NLIN method=dud outest=out1;
/*---------------------------------------------------------------*/
/* INPUT B: */
/* The non-linear regression procedure requires initial input */
/* for the K and R parameters in the PARMS statement. K is */
/* the defect rate/KLOC for the entire development process, R is */
/* the peak of the Rayleigh curve. NLIN takes these initial */
/* values and the input data above, goes through an iteration */
/* procedure, and comes up with the final estimates of K and R. */
/* Once K and R are determined, we can specify the entire */
/* Rayleigh curve, and subsequently estimate the latent-error */
/* rate. */
/*---------------------------------------------------------------*/
  PARMS K=49.50 to 52 by 0.1
        R=1.75 to 2.00 by 0.01;
  *bounds K<=50.50,r>=1.75;
  model y=(1/R**2)*t*K*exp((-1/(2*r**2))*t**2);

data out1; set out1;
  if _TYPE_ = 'FINAL';
proc print data=out1;

/*****************************************************************/
/* Now we prepare to plot the graph */
/*****************************************************************/
/*---------------------------------------------------------------*/
/* Specify the entire Rayleigh curve based on the estimated */
/* parameters */
/*---------------------------------------------------------------*/
data out2; set out1;
  B=1/(2*R**2);
  do I=1 to 140;
    J=I/20;
    RAY=exp(-B*(J-0.05)**2) - exp(-B*J**2);
    DEF=ray*K*20;
    output;
  end;
  label DEF='DEFECT RATE';

/*---------------------------------------------------------------*/
/* INPUT C: */
/* Prepare for the histograms in the graph, values on the right */
/* hand side of the assignment statements are the actual */
/* defect removal rates--same as those for the INPUT statement */
/*---------------------------------------------------------------*/
data out2; set out2;
  if 0<=J<1 then DEF1=9.2;
  if 1<=J<2 then DEF1=11.9;
  if 2<=J<3 then DEF1=16.7;
  if 3<=J<4 then DEF1=5.1;
  if 4<=J<5 then DEF1=4.2;
  if 5<=J<=6 then DEF1=2.4;
  label J='DEVELOPMENT PHASES';

/*****************************************************************/
/* Now we plot the graph on a GDDM79 terminal screen (e.g., 3279G) */
/* The graph can be saved and plotted out through graphics */
/* interface such as APGS */
/*****************************************************************/
goptions device=GDDM79;
* GOPTIONS DEVICE=GDDMfam4 GDDMNICK=p3820 GDDMTOKEN=img240x
  HSIZE=8 VSIZE=11;
* OPTIONS DEVADDR=(.,.,GRAPHPTR);
proc gplot data=out2;
  plot DEF*J DEF1*J/overlay vaxis=0 to 25 by 5 vminor=0 fr
       hminor=0;
  symbol1 i=joint v=none c=red;
  symbol2 i=needle v=none c=green;
  format j jx.;

/*****************************************************************/
/* Now we compute the chi-square goodness-of-fit test */
/* Note that the CDF should be used instead of */
/* the PDF. The degree of freedom is */
/* n-1-#parameters, in this case, n-1-2 */
/*****************************************************************/
data out1; set out1;
  DO i=1 to 6;
    OUTPUT;
  END;
  keep K R;
data temp2; merge out1 temp;
  T=T + 0.5;
  T_1 = T-1;
  b=1/(R*R*2);
  E_rate = K*(exp(-b*T_1*T_1) - exp(-b*T*T));
  CHI_sq = (y - E_rate)**2 / E_rate;
proc sort data=temp2; by T;
data temp2; set temp2; by T;
  if T=1 then T_chisq = 0;
  T_chisq + CHI_sq;
proc sort data=temp2; by K T;
data temp3; set temp2; by K T;
  if LAST.K;
  df = T-1-2;
  p = 1 - PROBCHI(T_chisq, df);
  IF p>0.05 then
    RESULT='Chi-square test indicates that model is adequate.   ';
  ELSE
    RESULT='Chi-square test indicates that model is inadequate. ';
  keep T_chisq df p RESULT;
proc print data=temp3;

/*****************************************************************/
/* INPUT D - the value of ADJUST */
/* Now we estimate the latent-error rate. The Rayleigh model */
/* is known to under-estimate. */
/* To have good predictive validity, it */
/* is important to use an adjustment factor based on the */
/* prior experience of your product. */
/*****************************************************************/
data temp4; set temp2; by K T;
  if LAST.K;
  ADJUST = 0.15;
  E_rate = K*exp(-b*T*T);
  Latent = E_rate + ADJUST;
  label Latent = 'Latent Error Rate per KCSI';
  keep Latent;
proc print data=temp4 label;
RUN;
CMS FILEDEF * CLEAR ;
ENDSAS;
The SAS program estimates model parameters, produces a graph of the fitted model versus actual data points on a GDDM79 graphic terminal screen (as shown in Figure 7.2), performs a chi-square goodness-of-fit test, and derives an estimate of the latent-error rate. The probability (p value) of the chi-square test is also provided. If the test result indicates that the fitted model does not adequately describe the observed data (p ≤ .05), a statement to that effect is issued in the output. If proper graphic support is available, the colored graph on the terminal screen can be saved as a file and plotted via graphic plotting devices.
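The goodness-of-fit and latent-error steps of the program can be sketched outside SAS. The following Python sketch mirrors the arithmetic in Figure 7.5 under stated assumptions: the function names are invented for illustration, the values K = 50.0 and R = 1.8 are placeholders taken from inside the PARMS ranges rather than fitted estimates, and the closed-form tail probability applies only to 3 degrees of freedom (6 phases - 1 - 2 parameters).

```python
import math

def expected_phase_rates(k, r, n_phases=6):
    """Expected defect rate per phase, as differences of the Rayleigh CDF."""
    b = 1.0 / (2.0 * r * r)
    return [k * (math.exp(-b * (t - 1) ** 2) - math.exp(-b * t ** 2))
            for t in range(1, n_phases + 1)]

def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over the phases."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi_square_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))

def latent_error_rate(k, r, end_time=6.0, adjust=0.15):
    """Rayleigh tail beyond end_time plus the site-specific adjustment factor."""
    b = 1.0 / (2.0 * r * r)
    return k * math.exp(-b * end_time ** 2) + adjust

# Observed defect removal rates per KLOC from the CARDS data in Figure 7.5.
observed = [9.2, 11.9, 16.7, 5.1, 4.2, 2.4]

# Placeholder parameter values (assumption), NOT the estimates NLIN would return.
k_hat, r_hat = 50.0, 1.8

expected = expected_phase_rates(k_hat, r_hat)
statistic = chi_square_statistic(observed, expected)
p_value = chi_square_sf_df3(statistic)
verdict = "adequate" if p_value > 0.05 else "inadequate"

print(f"chi-square = {statistic:.3f}, p = {p_value:.4f}, model is {verdict}")
print(f"latent-error rate = {latent_error_rate(k_hat, r_hat):.3f} per KLOC")
```

As in the SAS program, the comparison uses CDF differences rather than the density, and the model is reported as adequate only when p exceeds .05.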
In the program of Figure 7.5, R represents t_m as discussed earlier. The program implements the model on a six-phase development process. Because the Rayleigh model is a function of time (as are other reliability models), input data have to be in terms of defect data by time. The following time equivalent values for the development phases are used in the program:
I0: 0.5
I1: 1.5
I2: 2.5
UT: 3.5
CT: 4.5
ST: 5.5
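With these time equivalents, each phase's defect rate becomes a (t, y) point that a least-squares routine can fit. The sketch below is a minimal illustration in Python, assuming a plain grid search over the same ranges as the PARMS statement in place of the DUD iteration that PROC NLIN performs. The function names are invented here, and because the search never leaves the starting grid, its result need not coincide with the final NLIN estimates.

```python
import math

# Time equivalents for I0, I1, I2, UT, CT, ST and the defect rates per KLOC
# from the CARDS data in Figure 7.5.
TIMES = [0.5, 1.5, 2.5, 3.5, 4.5, 5.5]
RATES = [9.2, 11.9, 16.7, 5.1, 4.2, 2.4]

def rayleigh_rate(t, k, r):
    """Rayleigh density scaled by K, as in the program's MODEL statement."""
    return k * t / (r * r) * math.exp(-t * t / (2.0 * r * r))

def sse(k, r):
    """Sum of squared residuals between observed and modeled rates."""
    return sum((y - rayleigh_rate(t, k, r)) ** 2 for t, y in zip(TIMES, RATES))

def frange(start, stop, step):
    """Inclusive range of floats, rounded to suppress accumulation error."""
    count = int(round((stop - start) / step))
    return [round(start + i * step, 10) for i in range(count + 1)]

def grid_search(k_values, r_values):
    """Return (sse, k, r) for the grid point with the smallest SSE."""
    return min((sse(k, r), k, r) for k in k_values for r in r_values)

# Same ranges as PARMS K=49.50 to 52 by 0.1 and R=1.75 to 2.00 by 0.01.
best_sse, best_k, best_r = grid_search(frange(49.5, 52.0, 0.1),
                                       frange(1.75, 2.00, 0.01))

print(f"best grid point: K = {best_k}, R = {best_r}, SSE = {best_sse:.3f}")
```

A grid search is used only because it needs no external library; a derivative-free or gradient-based optimizer would normally refine the estimate beyond the grid.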
Implementations of the Rayleigh model are available in industry. One example is the Software LIfe-cycle Model tool (SLIM) developed by Quantitative Software Management, Inc., of McLean, Virginia. SLIM is a software product designed to help software managers estimate the time, effort, and cost required to build medium and large software systems. It embodies the software life-cycle model developed by Putnam (Putnam and Myers, 1992), using validated data from many projects in the industry. Although the main purpose of the tool is life-cycle project management, estimating the number of software defects is one of its important elements. Central to the SLIM tool are two management indicators. The first is the productivity index (PI), a "big picture" measure of the total development capability of the organization. The second is the manpower buildup index (MBI), a measure of the staff buildup rate, which is influenced by scheduling pressure, task concurrency, and resource constraints. The inputs to SLIM include software size (lines of source code, function points, modules, or uncertainty), process productivity (methods, skills, complexity, and tools), and management constraints (maximum people, maximum budget, maximum schedule, and required reliability). The outputs from SLIM include the staffing curve, the cumulative cost curve over time, the probability of project success over time, the reliability curve, and the number of defects in the product, along with other metrics. In SLIM the X-axis for the Rayleigh model is in terms of months from the start of the project.
As a result of Gaffney's work (1984), in 1985 the IBM Federal Systems Division at Gaithersburg, Maryland, developed a PC program called the Software Error Estimation Reporter (STEER). The STEER program implements a discrete version of the Rayleigh model by matching the input data with a set of 11 stored Rayleigh patterns and a number of user patterns. The stored Rayleigh patterns are expressed in terms of percent distribution of defects for the six development phases mentioned earlier. The matching algorithm involves taking logarithmic transformation of the input data and the stored Rayleigh patterns, calculating the separation index between the input data and each stored pattern, and choosing the stored pattern with the lowest separation index as the best-fit pattern.
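The matching idea can be illustrated with a short Python sketch. Everything specific in it is an assumption: the text does not define STEER's separation index, so a sum of squared differences of log percentages stands in for it; the stored patterns are generated from the Rayleigh CDF and normalized over the six phases rather than taken from STEER; and the function names are invented for illustration.

```python
import math

def rayleigh_pattern(t_m, n_phases=6):
    """Percent distribution of defects over the phases for a given t_m.

    Assumption: percentages are normalized over the six phases only.
    """
    b = 1.0 / (2.0 * t_m * t_m)
    shares = [math.exp(-b * (i - 1) ** 2) - math.exp(-b * i ** 2)
              for i in range(1, n_phases + 1)]
    total = sum(shares)
    return [100.0 * s / total for s in shares]

def to_percent(rates):
    """Convert per-phase defect rates to a percent distribution."""
    total = sum(rates)
    return [100.0 * x / total for x in rates]

def separation_index(observed_pct, pattern_pct):
    """Hypothetical index: sum of squared differences of log percentages."""
    return sum((math.log(o) - math.log(p)) ** 2
               for o, p in zip(observed_pct, pattern_pct))

def best_match(observed_rates, patterns):
    """Return (pattern key, index) for the pattern with the lowest index."""
    observed_pct = to_percent(observed_rates)
    scores = {key: separation_index(observed_pct, pct)
              for key, pct in patterns.items()}
    best_key = min(scores, key=scores.get)
    return best_key, scores[best_key]

# Illustrative stand-ins for stored patterns, keyed by t_m.
stored = {t_m: rayleigh_pattern(t_m) for t_m in (1.0, 1.5, 2.0, 2.5, 3.0)}

# Defect removal rates per KLOC from the example in Figure 7.5.
key, index = best_match([9.2, 11.9, 16.7, 5.1, 4.2, 2.4], stored)
print(f"best-fit pattern: t_m = {key}, separation index = {index:.3f}")
```

Whatever index is used, the procedure always returns some best-fit pattern, so the size of the index still has to be judged separately.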
Several questions arise about the STEER approach. First, the matching algorithm differs from statistical estimation methodologies, which derive estimates of model parameters directly from the input data points using proven procedures. Second, it always produces a best-match pattern even when none of the stored patterns is statistically adequate to describe the input data; there is no indication of how small the separation index must be to indicate a good fit. Third, the stored Rayleigh patterns are far apart: they range from 1.00 to 3.00 in terms of t_m, in large increments of 0.25. Therefore, they are not sensitive enough for estimating the latent-error rate, which is usually a very small number.
There are, however, ways around the last two problems. First, use the separation index conservatively; be skeptical of the results if the index exceeds 1.00. Second, use the program iteratively: After selecting the best-match pattern (for instance, the one with t_m = 1.75), calculate a series of slightly different Rayleigh patterns that center at the best-match pattern (for instance, patterns ranging from t_m = 1.50 to t_m = 2.00, with an increment of 0.05 or 0.01), and use them as user patterns to match with the input data again. The outcome will surely be a better "best match."
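Generating the finer set of user patterns is a small computation. The Python sketch below shows one way to do it, assuming each pattern is the percent distribution implied by the Rayleigh CDF over six unit-length phases and normalized over those phases; how STEER itself expects user patterns to be entered is not covered here, and the function names are invented for illustration.

```python
import math

def rayleigh_percentages(t_m, n_phases=6):
    """Percent of defects removed in each phase for a Rayleigh curve peaking at t_m."""
    b = 1.0 / (2.0 * t_m * t_m)
    shares = [math.exp(-b * (i - 1) ** 2) - math.exp(-b * i ** 2)
              for i in range(1, n_phases + 1)]
    total = sum(shares)
    return [100.0 * s / total for s in shares]

def refined_patterns(center, half_width=0.25, step=0.05):
    """Patterns for t_m from center - half_width to center + half_width."""
    count = int(round(2.0 * half_width / step))
    values = [round(center - half_width + i * step, 10) for i in range(count + 1)]
    return {t_m: rayleigh_percentages(t_m) for t_m in values}

# Patterns centered on a best match of t_m = 1.75, from 1.50 to 2.00 by 0.05.
patterns = refined_patterns(1.75)
for t_m, pct in patterns.items():
    print(f"t_m = {t_m:.2f}: " + "  ".join(f"{p:5.1f}" for p in pct))
```

Calling `refined_patterns(1.75, step=0.01)` produces the denser series with an increment of 0.01.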
When used properly, the first two potential weak points of STEER can become its strong points. In other words, STEER plays down the role of formal parameter estimation and relies heavily on matching with existing patterns. If the feature of self-entered user patterns is used well (e.g., use defect patterns of projects from the same development organizations that have characteristics similar to those of the project for which estimation of defects is sought), then empirical validity is established. From our experience in software reliability projection, the most important factor in achieving predictive validity, regardless of the model being used, is to establish empirical validity with historical data.
Table 7.2 shows the defect removal patterns of a number of projects, the defect rates observed during the first year in the field, the life-of-product (LOP; four years) projection based on the first-year data, and the projected total latent defect rate (life-of-product) from STEER. The data show that the STEER projections are very close to the LOP projections based on one year of actual data. One can also observe that the defect removal patterns and the resulting field defects lend support to the basic assumptions of the Rayleigh model as discussed earlier. Specifically, more front-loaded defect patterns lead to lower field defect rates and vice versa.
Table 7.2. Defect Removal Patterns and STEER Projections
Defects per KLOC

| Project | LOC | Language | High-Level Design | Low-Level Design | Code | Unit Test | Integration Test | System Test | First-Year Field Defect | LOP Field Defect | STEER Estimate |
|---|---|---|---|---|---|---|---|---|---|---|---|
| A | 680K | Jovial | 4 | - | 13 | 5 | 4 | 2 | 0.3 | 0.6 | 0.6 |
| B | 30K | PL/1 | 2 | 7 | 14 | 9 | 7 | - | 3.0 | 6.0 | 6.0 |
| C | 70K | BAL | 6 | 25 | 6 | 3 | 2 | 0.5 | 0.2 | 0.4 | 0.3 |
| D | 1700K | Jovial | 4 | 10 | 15 | 4 | 3 | 3 | 0.4 | 0.8 | 0.9 |
| E | 290K | ADA | 4 | 8 | 13 | - | 8 | 0.1 | 0.3 | 0.6 | 0.7 |
| F | 70K | - | 1 | 2 | 4 | 6 | 5 | 0.9 | 1.1 | 2.2 | 2.1 |
| G | 540K | ADA | 2 | 5 | 12 | 12 | 4 | 1.8 | 0.6 | 1.2 | 1.1 |
| H | 700K | ADA | 6 | 7 | 14 | 3 | 1 | 0.4 | 0.2 | 0.4 | 0.4 |
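The closeness of the STEER estimates to the LOP projections can be checked directly from the last two columns of Table 7.2. The short Python sketch below does that arithmetic; the variable names are chosen here for illustration, and the values are copied from the table.

```python
# (LOP field defect rate, STEER estimate) per KLOC for each project in Table 7.2.
projections = {
    "A": (0.6, 0.6),
    "B": (6.0, 6.0),
    "C": (0.4, 0.3),
    "D": (0.8, 0.9),
    "E": (0.6, 0.7),
    "F": (2.2, 2.1),
    "G": (1.2, 1.1),
    "H": (0.4, 0.4),
}

# Difference STEER - LOP, rounded to the table's one-decimal precision.
differences = {name: round(steer - lop, 1)
               for name, (lop, steer) in projections.items()}

max_abs_difference = max(abs(d) for d in differences.values())
exact_matches = [name for name, d in differences.items() if d == 0]

for name, d in differences.items():
    print(f"Project {name}: STEER - LOP = {d:+.1f}")
print(f"largest absolute difference: {max_abs_difference:.1f} per KLOC")
print(f"projects with identical values: {', '.join(exact_matches)}")
```

For the eight projects in the table, no STEER estimate differs from the LOP projection by more than 0.1 defects per KLOC.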