Lean Sigma: A Practitioner's Guide
31. Normality Test
Overview
The Normality Test is applied to a sample of data to assess whether the population from which the sample was drawn is plausibly normally distributed. The result is a p-value: the probability of seeing a departure from normality at least as large as the one in the sample if the population really were normal. Normality matters because the majority of statistical tests that examine the means and variances of samples assume it. For example, if data are skewed (and thus non-normal), then the mean is probably not the best measure of center and a median-based test is probably better. The longer tail on the right of the example curve in Figure 7.31.1 drags the mean to the right, whereas the median stays roughly where it was.
Figure 7.31.1. The effect of Normality on measures of center.
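As a rough illustration of the effect described for Figure 7.31.1, the sketch below uses made-up task times (not data from the book) and Python's standard `statistics` module. It shows how a few large values in the right tail pull the mean upward while the median barely moves.

```python
from statistics import mean, median

# Hypothetical, roughly symmetric task times in minutes (illustrative only).
symmetric = [8, 9, 9, 10, 10, 10, 11, 11, 12]

# The same sample with three unusually long times added to the right tail.
skewed = symmetric + [20, 25, 30]


def center(data):
    """Return (mean, median) for a list of numbers."""
    return mean(data), median(data)


sym_mean, sym_median = center(symmetric)
skw_mean, skw_median = center(skewed)

print(f"symmetric: mean={sym_mean:.2f}, median={sym_median:.2f}")
print(f"skewed:    mean={skw_mean:.2f}, median={skw_median:.2f}")
```

With these invented numbers the mean moves from 10 to 13.75, while the median moves only from 10 to 10.5.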
Roadmap
The hypotheses for a Normality Test are:
H0: The data are normal
Ha: The data are not normal
The test is applied to a column of data (the sample) and the results obtained.
Interpreting the Output
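There are a number of Normality Tests; those listed in Minitab, for example, include (with simple descriptions in English):
Anderson-Darling: compares the sample's cumulative distribution with that of a normal distribution, giving extra weight to the tails.
Ryan-Joiner: measures how strongly the sorted data correlate with the values expected from a normal distribution (similar to Shapiro-Wilk).
Kolmogorov-Smirnov: looks at the largest gap between the sample's cumulative distribution and that of a normal distribution.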
To be candid, in the world of Process Improvement the choice of test rarely makes a dramatic difference to the conclusion. It is advisable to stick with one, and because the default in Minitab is Anderson-Darling, I personally tend to run with that one. Each test returns a test statistic, but the thing to be most interested in is the p-value: the likelihood that a level of non-normality as large as that seen in the sample data could have occurred purely by random chance even if the population were normally distributed. Output from a Normality Test is shown in Figure 7.31.2.
Figure 7.31.2. Normality Test (Anderson-Darling) results for a sample of Bob's time to perform a task (output from Minitab v14).
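To make the p-value idea concrete, the following is a minimal sketch of an Anderson-Darling normality check written with only the Python standard library. It is not Minitab's implementation. The function name `anderson_darling_normal` is hypothetical, the small-sample adjustment and p-value formulas are the commonly cited D'Agostino and Stephens approximations, and both data sets are synthetic, so Minitab's figures for the same data may differ slightly.

```python
import math
from statistics import NormalDist, mean, stdev


def anderson_darling_normal(data):
    """Return (adjusted A-squared, approximate p-value) for H0: data are normal.

    The mean and standard deviation are estimated from the sample. The
    p-value uses the D'Agostino and Stephens approximation (an assumption
    of this sketch, not something taken from the book).
    """
    x = sorted(data)
    n = len(x)
    fitted = NormalDist(mean(x), stdev(x))
    cdf = [fitted.cdf(v) for v in x]

    # A^2 = -n - (1/n) * sum((2i - 1) * [ln F(x_i) + ln(1 - F(x_(n+1-i)))])
    # written with a zero-based index, so (2i - 1) becomes (2i + 1).
    total = sum(
        (2 * i + 1) * (math.log(cdf[i]) + math.log(1.0 - cdf[n - 1 - i]))
        for i in range(n)
    )
    a2 = -n - total / n

    # Small-sample adjustment used when mean and sigma are estimated.
    a2_adj = a2 * (1 + 0.75 / n + 2.25 / n ** 2)

    if a2_adj >= 0.6:
        p = math.exp(1.2937 - 5.709 * a2_adj + 0.0186 * a2_adj ** 2)
    elif a2_adj > 0.34:
        p = math.exp(0.9177 - 4.279 * a2_adj - 1.38 * a2_adj ** 2)
    elif a2_adj > 0.2:
        p = 1 - math.exp(-8.318 + 42.796 * a2_adj - 59.938 * a2_adj ** 2)
    else:
        p = 1 - math.exp(-13.436 + 101.14 * a2_adj - 223.73 * a2_adj ** 2)
    return a2_adj, p


std_normal = NormalDist()
n = 30
# Evenly spaced normal quantiles: an idealized, almost perfectly normal sample.
near_normal = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
# Exponentiating gives a strongly right-skewed (lognormal-shaped) sample.
right_skewed = [math.exp(z) for z in near_normal]

a2_normal, p_normal = anderson_darling_normal(near_normal)
a2_skewed, p_skewed = anderson_darling_normal(right_skewed)
print(f"near-normal:  A2*={a2_normal:.3f}, p={p_normal:.3f}")
print(f"right-skewed: A2*={a2_skewed:.3f}, p={p_skewed:.5f}")
```

The near-normal sample should return a large p-value and the right-skewed sample a very small one, which is the pattern the test is designed to pick up.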
The vertical scale on the graph is non-linear and the horizontal axis is a linear scale, similar to normal probability paper. If the data were perfectly normally distributed, then the points would lie exactly on the line (and the p-value, in this case, would theoretically be 1.0). From the example results:
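The p-value should be interpreted in the usual way: if it is less than 0.05, reject the null hypothesis and conclude that the data are not normal; if it is 0.05 or greater, there is not enough evidence to reject normality.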
Therefore, for Bob's data, shown in Figure 7.31.2, the p-value of 0.149 is clearly greater than 0.05, so the conclusion should be that Bob's data can be treated as normal.
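The decision rule is simple enough to express in a few lines. The sketch below uses a hypothetical helper name, `interpret_normality`, with the conventional 0.05 threshold as its default, and applies it to the p-value of 0.149 reported for Bob's data.

```python
def interpret_normality(p_value, alpha=0.05):
    """Apply the usual decision rule to a normality-test p-value.

    H0 is the null hypothesis that the data are normal; alpha is the
    significance threshold (0.05 by convention).
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie between 0 and 1")
    if p_value < alpha:
        return "reject H0: the data are not normal"
    return "fail to reject H0: treat the data as normal"


# p-value reported for Bob's data in Figure 7.31.2.
bob_conclusion = interpret_normality(0.149)
print(bob_conclusion)
```

Strictly speaking, a p-value above the threshold does not prove normality; it only means the sample gives no strong evidence against it.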