Identifying and Verifying Causes

Overview

Purpose of these tools

To increase the chances that you can identify the true root causes of problems, which can then be targeted for improvement.

The tools in this chapter fall into two very different categories:

  1. Tools for identifying potential causes (starts below) are techniques for sparking creative thinking about the causes of observed problems. The emphasis is on thinking broadly about what's going on in your process.
  2. Tools for verifying potential causes (starts on p. 149) are at the opposite end of the spectrum. Here the emphasis is on rigorous data analysis or specific statistical tests used to verify whether a cause-and-effect relationship exists and how strong it is.

A. Identifying potential causes

Purpose of these tools

To help you consider a wide range of potential causes when trying to find explanations for patterns in your data.

They will help you…

Be sure to check the tools in part B to validate the suspected Xs.

Deciding which tool to use

This guide covers four tools used to identify potential causes: Pareto charts, 5 Whys, cause-and-effect diagrams, and the C&E matrix.

Pareto charts

Highlights

To create a Pareto chart…

  1. Collect data on different types or categories of problems.
  2. Tabulate the scores. Determine the total number of problems observed and/or the total impact. Also determine the counts or impact for each category.

    • If there are a lot of small or infrequent problems, consider adding them together into an "other" category
  3. Sort the problems by frequency or by level of impact.
  4. Draw a vertical axis and divide it into increments, with the scale running up to the total number you observed.

    • In the example here, the total number of problems was 42, so the vertical axis on the left goes to 42
    • People often mistakenly make the vertical axis only as tall as the tallest bar, which can overemphasize the importance of the tall bars and lead to false conclusions
  5. Draw bars for each category, starting with the largest and working down.

    • The "other" category always goes last even if it is not the shortest bar
  6. OPTIONAL: Add in the cumulative percentage line. (Convert the raw counts to percentages of the total, then draw a vertical axis on the right that represents percentage. Plot a point above the first bar at the percentage represented by that bar, then another above the second bar representing the combined percentage, and so on. Connect the points.)
  7. Interpret the results (see next page).
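If you build Pareto charts in software rather than by hand, the following sketch shows the same construction steps in Python with matplotlib. The category names and counts are hypothetical illustration data (chosen to total 42, matching the example above), not a real data set.

    import matplotlib.pyplot as plt

    categories = ["Late", "Damaged", "Wrong item", "Billing", "Other"]
    counts = [18, 12, 6, 4, 2]   # sorted largest to smallest; "Other" goes last
    total = sum(counts)          # 42 in this illustration

    fig, ax = plt.subplots()
    ax.bar(categories, counts)
    ax.set_ylim(0, total)        # axis runs to the total, not just the tallest bar
    ax.set_ylabel("Count")

    # Optional: cumulative percentage line on a secondary axis
    cum_pct = [100 * sum(counts[:i + 1]) / total for i in range(len(counts))]
    ax2 = ax.twinx()
    ax2.plot(categories, cum_pct, marker="o", color="gray")
    ax2.set_ylim(0, 100)
    ax2.set_ylabel("Cumulative %")

    plt.show()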

Interpreting a Pareto chart

  1. Clear Pareto effect

    • This pattern shows that just a few categories of the problem account for the most occurrences or impact
    • Focus your improvement efforts on those categories

    Just a few categories account for ~80% of the count or impact

  2. No Pareto effect

    • This pattern shows that no cause you've identified is more important than any other

      • If working with counts or percentages, convert to an "impact" Pareto by calculating impacts such as "cost to fix" or "time to fix"
      • A pattern often shows up in impact that is not apparent by count or percentage alone

      Though some bars are taller than others, it takes a lot of categories to account for ~80% of the count or impact

  3. Revisit your fishbone diagram or list of potential causes, then…

    • Ask which factors could be contributing to all of the potential causes you've identified
    • Think about other stratification factors you may not have considered; collect additional data if necessary and create another Pareto based on the new stratification factor

  Tip 
  • The most frequent problems may not have the biggest impact in terms of quality, time, or costs. When possible, construct two Pareto charts on a set of data: one that uses count or frequency data and another that looks at impact (time required to fix the problem, dollar impact, etc.). You may end up targeting both the most frequent problems and the ones with the biggest impact.

    Category A errors happen a lot but don't take long to fix. Category D errors are rare, but very expensive in terms of time.

5 Whys

Highlights

To use 5 Whys…

  1. Select any cause (from a cause-and-effect diagram, or a tall bar on a Pareto chart). Make sure everyone has a common understanding of what that cause means. ("Why 1")
  2. Ask "Why does this outcome occur?" (Why 2)
  3. Select one of the reasons for Why 2 and ask "Why does that occur?" (Why 3)
  4. Continue in this way until you feel you've reached a potential root cause.

  Tips 
  • There's nothing sacred about the number 5. Sometimes you may reach a root cause after two or three whys, sometimes you may have to go more than five layers down.
  • Stop whenever you've reached a potential cause that the team can act on.

    • Ex: "Why are we late in delivery?" … Because the copier jams…"Why does the copier jam?" … Because of high humidity in the copier room … "Why does high humidity cause jams?" … Because the paper absorbs moisture and sticks together.

      (If you can't do anything about paper that absorbs moisture, go back to solving the problem of high humidity in the copier room—"What can we do to control or reduce humidity in the copier room"?)

Cause and effect diagrams (fishbone or Ishikawa diagrams)

Purpose

When to use cause and effect diagrams

How to create and use a cause and effect diagram

  1. Name the problem or effect of interest. Be as specific as possible.

    • Write the problem at the head of a fishbone "skeleton"
  2. Decide the major categories for causes and create the basic diagram on a flip chart or whiteboard.

    • Typical categories include the 6 Ms: manpower (personnel), machines, materials, methods, measurements, and Mother Nature (or environment)
  3. Brainstorm for more detailed causes and create the diagram.

    • Option 1: Work through each category, brainstorming potential causes and asking "why" each major cause happens. (See 5 Whys, p. 145).
    • Option 2: Do silent or open brainstorming (people come up with ideas in any order).
    • Write suggestions onto self-stick notes and arrange in the fishbone format, placing each idea under the appropriate categories.
  4. Review the diagram for completeness.

    • Eliminate causes that do not apply
    • Brainstorm for more ideas in categories that contain fewer items (this will help you avoid the "groupthink" effect that can sometimes limit creativity)
  5. Discuss the final diagram. Identify causes you think are most critical for follow-up investigation.

    • OK to rely on people's instincts or experience (you still need to collect data before taking action).
    • Mark the causes you plan to investigate. (This will help you keep track of team decisions and explain them to your sponsor or other advisors.)
  6. Develop plans for confirming that the potential causes are actual causes. DO NOT GENERATE ACTION PLANS until you've verified the cause.

C&E Matrix (Cause-and-Effect Matrix)

Purpose

To identify the few key process input variables that must be addressed to improve the key process output variable(s).

When to use a C&E matrix

   

Example (coffee-making process):

                                      Process Outputs
                          Temp of Coffee    Taste    Strength
Importance                       8            10         6

Process Steps /
Process Inputs            Correlation of Input to Output     Total
Clean Carafe                                   3         1      36
Fill Carafe with Water                         9         9     144
Pour Water into Maker                          1         1      16
Place Filter in Maker                          3         1      36

(Blank cells = no correlation.)

How to create a C&E matrix

  1. Identify key customer requirements (outputs) from the process map or Voice of the Customer (VOC) studies. (This should be a relatively small number, say 5 or fewer outputs.) List the outputs across the top of a matrix.
  2. Assign a priority score to each output according to importance to the customer.

    • Usually on a 1 to 10 scale, with 10 being most important
    • If available, review existing customer surveys or other customer data to make sure your scores reflect customer needs and priorities
  3. Identify all process steps and key inputs from the process map. List down the side of the matrix.
  4. Rate each input against each output based on the strength of their relationship:

    Blank = no correlation

    1 = remote correlation

    3 = moderate correlation

    9 = strong correlation

      Tip 

    At least 50% to 60% of the cells should be blank. If you have too many filled-in cells, you are likely forcing relationships that don't exist.

  5. Cross-multiply correlation scores with priority scores and add across for each input.

    Ex: Clean carafe = (3*10) + (1 * 6) = 30 + 6 = 36

  6. Create a Pareto chart and focus on the input variables with the highest total scores. Especially focus on those where there are acknowledged performance gaps (shortfalls).
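The scoring arithmetic is easy to automate. Here is a minimal sketch in Python, using the coffee example from the matrix above (blank cells are simply omitted, which treats them as zero):

    importance = {"Temp of Coffee": 8, "Taste": 10, "Strength": 6}

    # Correlation scores for each process input (blank cells omitted = 0)
    scores = {
        "Clean Carafe":           {"Taste": 3, "Strength": 1},
        "Fill Carafe with Water": {"Taste": 9, "Strength": 9},
        "Pour Water into Maker":  {"Taste": 1, "Strength": 1},
        "Place Filter in Maker":  {"Taste": 3, "Strength": 1},
    }

    for step, row in scores.items():
        total = sum(importance[output] * corr for output, corr in row.items())
        print(step, total)   # e.g., Clean Carafe 36, Fill Carafe with Water 144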

B. Confirming causal effects and results

Purpose of these tools

To confirm whether a potential cause contributes to the problem. The tools in this section will help you confirm a cause-and-effect relationship and quantify the magnitude of the effect.

Deciding between these tools

Often in the early stages of improvement, the problems are so obvious or dramatic that you don't need sophisticated tools to verify the impact. In such cases, try confirming the effect by creating stratified data plots (p. 150) or scatter plots (p. 154) of cause variables vs. the outcome of interest, or by testing quick fixes/obvious solutions (seeing what happens if you remove or change the potential cause, p. 152).

However, there are times when more rigor, precision, or sophistication is needed. The options are:

Stratified data charts

Highlights

To use stratified data charts…

  1. Before collecting data, identify factors that you think may affect the impact or frequency of problems

    • Typical factors include: work shift, supplier, time of day, type of customer, type of order. See stratification factors, p. 75, for details.
  2. Collect the stratification information at the same time as you collect the basic data
  3. During analysis, visually distinguish the "strata" or categories on the chart (see examples)

Option 1: Create a different chart for each stratum

   

Time (in mins)   Facility A   Facility B   Facility C
0-9              xxx          x            xx
10-19            xxxxx        xxxx         xxxxx
20-29            xxxx         xxxx         xxxxxxx
30-39            xxxxxx       xxxxx        xxxxxxxx
40-49            xxxx         xxxxxxx      xxxx
50-59            xxxx         xxxxxx       xx
60-69            xx           xxxx         x
70-79            x            xx           x

These stratified dot plots show the differences in delivery times in three locations. You'd need to use hypothesis testing to find out if the differences are statistically significant.

Option 2: Color code or use symbols for different strata

This chart uses symbols to show performance differences between people from different work teams. Training seems to have paid off for Team D (all its top performers are in the upper right corner); Team C has high performers who received little training (they are in the lower right corner).
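Either option is straightforward with plotting software. A sketch of Option 1 in Python with matplotlib, using hypothetical delivery times for three facilities (not the data tabulated above):

    import matplotlib.pyplot as plt

    times = {
        "Facility A": [5, 12, 15, 22, 31, 33, 35, 41, 52, 63],
        "Facility B": [8, 14, 25, 28, 36, 44, 47, 51, 55, 66],
        "Facility C": [3, 11, 16, 21, 24, 27, 32, 34, 38, 45],
    }

    # Option 1: one histogram per stratum, on shared axes for easy comparison
    fig, axes = plt.subplots(1, 3, sharex=True, sharey=True)
    for ax, (name, data) in zip(axes, times.items()):
        ax.hist(data, bins=range(0, 90, 10))
        ax.set_title(name)
        ax.set_xlabel("Time (mins)")
    plt.show()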

Testing quick fixes or obvious solutions

Purpose

Why test quick fixes

When to test quick fixes

How to test quick fixes

  1. Confirm the potential cause you want to experiment with, and document the expected impact on the process output.
  2. Develop a plan for the experiment.

    • What change you will make
    • What data you will be measuring to evaluate the effect on the outcome
    • Who will collect data
    • How long the experiment will be run
    • Who will be involved (which team members, process staff, work areas, types of work items, etc.)
    • How you can make sure that the disruption to the workplace is minimal and that customers will not feel any effects from the experiment
  3. Present your plan to the process owner and get approval for conducting the experiment.
  4. Train data collectors. Alert process staff of the impending experiment; get their involvement if possible.
  5. Conduct the experiment and gather data.
  6. Analyze results and develop a plan for the next steps.

    • Did you conduct the experiment as planned?
    • Did making the process change have the desired impact on the outcome? Were problems reduced or eliminated?
    • If the problem was reduced, make plans for trying the changes on a larger scale (see pilot testing, p. 273)

  Tips 
  •   Note 

    Testing quick fixes is similar to doing a pilot test EXCEPT the purpose is to confirm a cause-and-effect relationship. You are not proposing a solution per se—you're doing a quick test to see if you've found a contributing cause. If the test shows an effect, continue with your regular procedures for planning and testing full-scale implementation.

  •   Caution 

    Do not confuse this testing with the kind of unplanned changes that often occur in the workplace. You need to approach quick fixes with an experimental mindset: predicting what changes you expect to see, planning specifically what changes to make, knowing what data you will collect to measure the effect, and so on.

  • Before the experiment, imagine that you have the results in hand and determine what type of analysis will be needed (confirm that you will get the type of data you need for the analysis).

Scatter plots

Highlights

To use scatter plots…

  1. Collect paired data

    To create a scatter plot, you must have two measurements for each observation point or item

    • Ex: in the chart above, the team needed to know both the call length and the broker's experience to determine where each point should go on the plot
  2. Determine appropriate measures and increments for the axes on the plot

    • Mark units for the suspected cause (input) on the horizontal X-axis
    • Mark the units for the output (Y) on the vertical Y-axis
  3. Plot the points on the chart
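A minimal sketch of these steps in Python with matplotlib; the paired broker-experience and call-length values here are hypothetical:

    import matplotlib.pyplot as plt

    experience_years = [1, 2, 3, 4, 5, 6, 7, 8]    # suspected cause (X)
    call_length_min = [12, 11, 9, 9, 7, 6, 6, 5]   # output (Y)

    plt.scatter(experience_years, call_length_min)
    plt.xlabel("Broker experience (years)")   # input on the horizontal axis
    plt.ylabel("Call length (minutes)")       # output on the vertical axis
    plt.show()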

Interpreting scatter plot patterns

No pattern. Data points are scattered randomly in the chart.

Positive correlation (line slopes from bottom left to top right). Larger values of one variable are associated with larger values of the other variable.

Negative correlation (line slopes from upper left down to lower right). Larger values of one variable are associated with smaller values of the other variable.

Complex patterns. These often occur when there is some other factor at work that interacts with one of the factors. Multiple regression or design of experiments can help you discover the source of these patterns.

  Tips 
  • Use your SIPOC diagram (p. 38) to identify Xs and Ys.
  • By convention, scatter plots are used to compare an independent (X) variable (placed on the horizontal axis) and a dependent (Y) variable (on the vertical axis). But sometimes you may want to compare two input variables (Xs) or two output variables (Ys) to each other. In these cases, it doesn't matter which variable goes on the horizontal and which on the vertical axis.

Hypothesis testing overview

Highlights

Hypothesis testing terms and concepts

Uses for hypothesis testing

Assumptions of hypothesis tests

Confidence intervals

Calculating confidence intervals

The formulas for calculating confidence intervals are not included in this book because most people get them automatically from statistical software. What you may want to know is that the Z (normal) distribution is used when the standard deviation is known. Since that is rarely the case, more often the intervals are calculated from what's called a t-distribution. The t-distribution "relaxes" or "expands" the confidence intervals to allow for the uncertainty associated with having to use an estimate of the standard deviation. (So a 95% confidence interval calculated with an unknown standard deviation will be wider than one where the standard deviation is known.)
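For reference, here is how a t-based confidence interval looks in code. This is a sketch in Python with scipy; the sample values are hypothetical:

    import numpy as np
    from scipy import stats

    sample = np.array([9.8, 10.1, 10.4, 9.7, 10.0, 10.3, 9.9, 10.2])
    mean = sample.mean()
    sem = stats.sem(sample)   # standard error of the mean

    # t-distribution with n-1 degrees of freedom, since sigma is estimated
    ci = stats.t.interval(0.95, df=len(sample) - 1, loc=mean, scale=sem)
    print(ci)   # (lower, upper) bounds of the 95% confidence interval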

Type I and Type II errors, Confidence, Power, and p values

Type I Error: Alpha (α) Risk or Producer risk

Type II Error: Beta (β) Risk or Consumer Risk

Balancing Alpha and Beta risks

p values

Confidence intervals and sample size

There is a direct relationship between sample size and confidence: for a given confidence level, larger samples produce narrower (more precise) confidence intervals.

t test Overview

Highlights

t Distribution

1-Sample t test

An automobile manufacturer has a target length for camshafts of 599.5 mm, with an allowable range of ±2.5 mm (597.0 mm to 602.0 mm). Here are data on the lengths of camshafts from Supplier 2:

mean = 600.23

std. dev. = 1.87

95% CI for mean is 599.86 to 600.60

The null hypothesis in plain English: the mean length of camshafts from Supplier 2 equals the target value. Printouts from Minitab showing the results of this hypothesis test are shown on the next page.

One-Sample T: Supp2

Test of mu = 599.5 vs not = 599.5

Variable    N     Mean    StDev   SE Mean   95% CI                  T      P
Supp2     100  600.230    1.874     0.187   (599.858, 600.602)   3.90  0.000

Confidence Intervals, Hypothesis Tests and Power

Results

Clues that we should reject the null hypothesis (which, for practical purposes, means the same as concluding that camshafts from Supplier 2 are not on target):

  1. On the histogram, the circle marking the target mean value is outside the confidence interval for the mean from the data
  2. The p-value is 0.00 (which is less than the alpha of 0.05)
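The same test is available outside Minitab. A sketch in Python with scipy; since the 100 individual measurements are not reproduced here, the code stands in simulated data with the reported mean and standard deviation:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    supp2 = rng.normal(600.23, 1.87, size=100)   # stand-in for the real lengths

    t_stat, p_value = stats.ttest_1samp(supp2, popmean=599.5)
    print(t_stat, p_value)   # p < 0.05 -> reject the null: the mean is off target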

2-Sample t test

Highlights

Using a 2-sample t test

2-Sample t test example

The same automobile manufacturer has data on another supplier and wants to compare the two:

The null hypothesis in plain English: the mean length of camshafts from Supplier 1 is the same as the mean length of camshafts from Supplier 2. Here is the printout from Minitab along with a boxplot:

Two-Sample T-Test and CI: Supp1, Supp2

Two-sample T for Supp1 vs Supp2

         N     Mean   StDev   SE Mean
Supp1  100  599.548   0.619     0.062
Supp2  100   600.23    1.87      0.19

Difference = mu (Supp1) − mu (Supp2)
Estimate for difference: −0.682000
95% CI for difference: (−1.072751, −0.291249)
T-Test of difference = 0 (vs not =): T-Value = −3.46  P-Value = 0.001  DF = 120

Confidence Intervals, Hypothesis Tests and Power

Results

There are two indicators in these results that we have to reject the null hypothesis (which, in practice, means concluding that the two suppliers are statistically different):

(Given the spread of values displayed on this boxplot, you may also want to test for equal variances.)
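A sketch of the same comparison in Python with scipy (simulated stand-in data again, since the raw measurements are not shown). Note that equal_var=False requests the unequal-variance (Welch) form of the test, which is consistent with the DF = 120 in the Minitab output; a pooled test on two samples of 100 would have DF = 198:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    supp1 = rng.normal(599.548, 0.619, size=100)   # stand-ins for the data
    supp2 = rng.normal(600.23, 1.87, size=100)

    t_stat, p_value = stats.ttest_ind(supp1, supp2, equal_var=False)
    print(t_stat, p_value)   # small p -> the supplier means differ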

Overview of correlation

Highlights

The price of automobiles shows a negative correlation to gas mileage (meaning as price goes up, mileage goes down). But higher prices do not CAUSE lower mileage, nor does lower mileage cause higher car prices.

Correlation statistics (coefficients)

Regression analysis and other types of hypothesis tests generate correlation coefficients that indicate the strength of the relationship between the two variables you are studying. These coefficients are used to determine whether the relationship is statistically significant (translation: whether you can conclude that the observed relationships are not merely happening by chance). For example:

Interpreting correlation coefficients

Regression overview

Highlights

Regression Analysis is used in conjunction with correlation calculations and scatter plots to predict future performance based on past results.

Overview of regression analysis

  1. Plan data collection

    • What inputs or potential causes will you study?

      • Also called predictor variables or independent variables
      • Best if the variables are continuous, but they can be count or categorical
    • What output variable(s) are key?

      • Also called response or dependent variables
      • Best if the variables are continuous, but they can be count or categorical
    • How can you get data? How much data do you need?
  2. Perform analysis and eliminate unimportant variables

    • Collect the data and generate a regression equation:

      • Which input variables have the biggest effect on the response variable?
      • What factor or combination of factors best predicts the output?
    • Remember to perform residuals analysis (p. 195) to check if you can properly interpret the results
  3. Select and refine model

    • Delete unimportant factors from the model.
    • You should end up with 2 or 3 factors still in the model
  4. Validate model

    Collect new data to see how well the model is able to predict actual performance

Simple linear regression

Highlights

Interpreting simple regression numbers

  Caution 

Be sure to perform residuals analysis (p. 195) as part of your work to verify the validity of the regression. If the residuals show unusual patterns, you cannot trust the results.

The graph shown on the previous page was generated to depict how the number of pizza deliveries affected how long customers had to wait. The form of the simple regression equation is:

Y = b0 + b1X

The actual data showed:

Wait time ≈ 32 + 0.58 × (number of deliveries in queue)

This means that, on average, customers have to wait about 32 minutes even when there are no deliveries in queue, and that (within the range of the study) each new delivery in queue adds just over half a minute (0.58 min) to the waiting time. The company can use this equation to predict wait time for customers. For example, if there are 30 deliveries in queue, the predicted wait time would be:

32 + (0.58 × 30) ≈ 49 minutes
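A sketch of fitting such an equation in Python with scipy; the delivery counts and wait times below are hypothetical, not the pizza study's data:

    import numpy as np
    from scipy import stats

    deliveries = np.array([5, 10, 15, 20, 25, 30, 35, 40])   # X, hypothetical
    wait_min = np.array([35, 38, 41, 44, 46, 50, 52, 55])    # Y, hypothetical

    fit = stats.linregress(deliveries, wait_min)
    print(fit.intercept, fit.slope)        # estimates of b0 and b1
    print(fit.intercept + fit.slope * 30)  # predicted wait with 30 in queue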

Multiple regression

Highlights

Interpreting multiple regression results

Below is the Minitab session output. The prediction equation is interpreted the same way as for simple regression (p. 168).

The regression equation is
Delivery Time = 30.5 + 0.343 Total Pizzas + 0.113 Defects − 0.010 Incorrect Order

Predictor           Coef   SE Coef       T      P
Constant         30.4663    0.7932   38.41  0.000
Total Pizzas     0.34256    0.0340   10.06  0.000
Defects          0.11307    0.0412    2.75  0.012
Incorrect Order  −0.0097    0.2133   −0.05  0.964

S = 1.102   R-Sq = 94.8%   R-Sq(adj) = 94.1%

The statistics in this output mean the following:

R-squared is the amount of variation that is explained by the model. This model explains 94.8% of the variability in Pizza Delivery Time.

R-squared(adj) is the amount of variation that is explained by the model adjusted for the number of terms in the model and the size of the sample (more factors and smaller sample sizes increase uncertainty). In Multiple regression, you will use R-Sq(adj) as the amount of variation explained by the model.

S is the estimate of the standard deviation about the regression model. We want S to be as small as possible.

The p-values reflect a hypothesis test on each term in the model:

H0: No correlation
Ha: Correlation

If p < 0.05, then the term is significant (there is a correlation).

If a p-value is greater than 0.10, the term is removed from the model. A practitioner might leave the term in the model if the p-value falls in the gray region between these two probability levels.
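A sketch of the same kind of fit in Python with statsmodels; the data frame holds simulated stand-in values, and the column names simply mirror the example:

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    rng = np.random.default_rng(3)
    df = pd.DataFrame({
        "total_pizzas": rng.integers(1, 40, size=30),
        "defects": rng.integers(0, 10, size=30),
        "incorrect_order": rng.integers(0, 3, size=30),
    })
    df["delivery_time"] = (30.5 + 0.34 * df["total_pizzas"]
                           + 0.11 * df["defects"] + rng.normal(0, 1, size=30))

    X = sm.add_constant(df[["total_pizzas", "defects", "incorrect_order"]])
    model = sm.OLS(df["delivery_time"], X).fit()
    print(model.summary())   # coefficients, p-values, S, R-Sq, R-Sq(adj)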

Output charts: Matrix plot and correlation matrix

These observations are confirmed by the correlation matrix (below), which shows the relationship between each pair of factors (Total Pizzas, Defects, Incorrect Order, and Delivery Time, tested pairwise).

 

                 Total Pizzas   Defects   Incorrect Order
Defects                 0.769
                        0.000

Incorrect Order         0.082     0.051
                        0.695     0.807

Delivery Time           0.964     0.829           −0.057
                        0.000     0.000            0.787

In each pair of numbers, the top number is the correlation coefficient and the bottom number is its p-value.

  Caution 
  1. Relative importance of predictors cannot be determined from the size of their coefficients:

    • The coefficients are scale-dependent—they depend on the units and increments in the original data

      Ex: If Factor A has a coefficient of 5.0 and Factor B has a coefficient of 50, that does not mean that Factor B has ten times the impact of Factor A

    • The coefficients are influenced by correlation among the input variables
  2. At times, some of the Xs will be correlated with each other. This condition is known as multicollinearity, which causes:

    • Estimates of the coefficients to be unstable with inflated P-values
    • Difficulty isolating the effects of each X
    • Coefficients to vary widely depending on which Xs are included in the model

Use a metric called the Variance Inflation Factor (VIF) to check for multicollinearity:

VIF_i = 1 / (1 − r_i²)

  • r_i² is the r² value from regressing X_i against the other Xs
  • A large r_i² suggests that a variable is redundant

Rule of Thumb:

  • r_i² > 0.9 is a cause for concern (VIF > 10; a high degree of collinearity)
  • 0.8 < r_i² < 0.9 corresponds to 5 < VIF < 10; indicates a moderate degree of collinearity

If two predictor variables show multicollinearity, you need to remove one of them from the model.
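Statistical packages compute VIF directly. A sketch in Python with statsmodels, using simulated predictors in which x3 is deliberately made nearly redundant with x1:

    import numpy as np
    import pandas as pd
    from statsmodels.stats.outliers_influence import variance_inflation_factor
    from statsmodels.tools import add_constant

    rng = np.random.default_rng(4)
    X = pd.DataFrame({"x1": rng.normal(size=50), "x2": rng.normal(size=50)})
    X["x3"] = 0.9 * X["x1"] + rng.normal(0, 0.1, size=50)  # nearly redundant

    Xc = add_constant(X)   # include the intercept, as in the regression itself
    for i, name in enumerate(Xc.columns):
        if name != "const":
            print(name, variance_inflation_factor(Xc.values, i))
    # VIF > 10 flags a high degree of collinearity (expect x1 and x3 here)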

  Tips 
  • Use a measurement selection matrix (p. 74) to help identify the multiple factors you want to study.
  • Gather enough observations to adequately measure error and check the model assumptions.
  • Make sure that the sample of data is representative of the population. (Need a valid sampling strategy.)
  • Excessive measurement error of the inputs (Xs) creates uncertainty in the estimated coefficients, predictions, etc. (Need an acceptable MSA.)
  • Be sure to collect data on all potentially important variables.
  • When you're deciding which inputs to include in the model, consider the time and effort of gathering the data on those additional variables.
  • Statistical software packages such as Minitab will usually help you find the best combination of variables (best subsets analysis). Rather than relying on the p-values alone, the computer looks at all possible combinations of variables and prints the resulting model characteristics.
  • When you have found the best subset, recalculate the regression equation with only those factors.
  • Validate the equation by collecting additional data.

ANOVA (ANalysis Of VAriance)

Purpose

To compare three or more samples to each other to see if any of the sample means is statistically different from the others.

When to use ANOVA

Overview of ANOVA

In the statistical world, inputs are sometimes referred to as factors. The samples may be drawn from several different sources or under several different circumstances. These are referred to as levels.

To tell whether the three or more options are statistically different, ANOVA looks at three sources of variability…

In One-Way ANOVA (below), we look at how different levels of a single factor affect a response variable.

In Two-Way ANOVA (p. 180), we examine how different levels of two factors and the interaction between those two factors affect a response variable.

One way ANOVA

A one-way ANOVA (involving just one factor) tests whether the mean (average) result of any alternative is different from the others. It does not tell us which one(s) is different. You'll need to supplement ANOVA with multiple comparison procedures to determine which means differ. A common approach for accomplishing this is to use Tukey's Pairwise comparison tests. (See p. 178)

Form of the hypotheses:

H0: μ1 = μ2 = … = μk (all the means are equal)
Ha: at least one mean is different

The comparisons are done through "sum of squares" calculations (shown here and depicted in the graph on the next page):

One way ANOVA Steps

  1. Select a sample size and factor levels.
  2. Randomly conduct your trials and collect the data.
  3. Conduct the ANOVA analysis (typically done through statistical software; see below for interpretation of results).
  4. Follow up with pairwise comparisons, if needed. If the ANOVA shows that at least one of the means is different, pairwise comparisons are done to show which ones are different.
  5. Examine the residuals, variance and normality assumptions.
  6. Generate main effects plots, interval plots, etc.
  7. Draw conclusions.
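Step 3 is usually a single function call. A sketch in Python with scipy, using hypothetical cycle times for three facilities:

    from scipy import stats

    ca = [4.1, 3.8, 4.5, 4.3, 4.0]   # hypothetical cycle times per facility
    ny = [5.0, 5.4, 5.2, 5.6, 4.9]
    tx = [5.8, 6.2, 5.5, 6.4, 6.0]

    f_stat, p_value = stats.f_oneway(ca, ny, tx)
    print(f_stat, p_value)   # small p -> at least one facility mean differs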

One way ANOVA reports

By comparing the Sums of Squares, we can tell if the observed difference is due to a true difference or random chance.

Interpreting the F-ratio

Checking for outliers

  Tip 
  • Be sure to perform a residuals analysis as well (see p. 195)

Invoice processing cycle time by Facility (One-way ANOVA)

One-way ANOVA: Order Processing Cycle Time versus Location

Analysis of Variance for Order Pr
Source     DF      SS     MS     F      P
Location    2  13.404  6.702  6.89  0.004
Error      27  26.261  0.973
Total      29  39.665

                            Individual 95% CIs For Mean
                            Based on Pooled StDev
Level   N    Mean   StDev  ---+------+------+------+---
CA     10  4.2914  0.6703    (---*---)
NY     10  5.2304  0.8715           (---*---)
TX     10  5.9225  1.3074                  (---*---)
                           ---+------+------+------+---
Pooled StDev = 0.9862       4.00   4.80   5.60   6.40

Conclusion: Because the p-value is 0.004, we can conclude that at least one of the facilities is statistically significantly different from the others, a message visually confirmed by the boxplot.

To tell which of the facilities is different, perform Tukey pairwise comparisons, which provide confidence intervals for the difference between the tabulated pairs. Alpha is determined by the individual error rate, and will be smaller for each individual test than the alpha for the family. (See chart on next page.)

Tukey's pairwise comparisons

Family error rate = 0.0500

Individual error rate = 0.0196

Critical value = 3.51

Intervals for (column level mean) − (row level mean)

          CA         NY
NY   −2.0337
      0.1556

TX   −2.7258    −1.7867
     −0.5364     0.4026
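A sketch of the follow-up comparisons in Python (scipy 1.8 or later provides tukey_hsd), reusing the hypothetical facility samples from the ANOVA sketch above:

    from scipy import stats

    ca = [4.1, 3.8, 4.5, 4.3, 4.0]   # hypothetical samples, as above
    ny = [5.0, 5.4, 5.2, 5.6, 4.9]
    tx = [5.8, 6.2, 5.5, 6.4, 6.0]

    res = stats.tukey_hsd(ca, ny, tx)
    print(res)   # pairwise differences with family-adjusted p-values

    ci = res.confidence_interval(confidence_level=0.95)
    print(ci.low, ci.high)   # interval bounds for each pair of groups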

Degrees of Freedom

The number of independent pieces of information that go into an estimate of a parameter is called the degrees of freedom (df). It equals the number of data values used in the estimate minus the number of parameters estimated in intermediate steps along the way.

In ANOVA, the degrees of freedom are determined as follows: factor df = number of levels − 1; error df = total observations − number of levels; total df = total observations − 1. (In the one-way example above: Location df = 3 − 1 = 2, Error df = 30 − 3 = 27, Total df = 30 − 1 = 29.)

ANOVA assumptions

  1. Model errors are assumed to be normally distributed with a mean of zero, and to be randomly distributed
  2. The samples are assumed to come from normally distributed populations. Test this with residuals plots (see p. 195).
  3. Variance is assumed approximately constant for all factor levels

    • Minitab and other statistical software packages will perform either Bartlett's test (if the data are normal) or Levene's test (if normality cannot be assumed) under options labeled Test for Equal Variances

      In this example, the p-values are very high, so we cannot reject the hypothesis that variance is the same for all the factors

  Practical Note 

Balanced designs (consistent sample size for all the different factor levels) are, in the language of statisticians, said to be "very robust to the constant variance assumption." That means the results will be valid even if variance is not perfectly constant. Still, make a habit of checking for constant variances. It is an opportunity to learn if factor levels have different amounts of variability, which is useful information.

Two way ANOVA

Same principles as one-way ANOVA, and similar Minitab output (see below):

Two Way ANOVA Reports

  1. Session window output

    Analysis of Variance for Order Processing time

    Source        DF      SS     MS     F      P
    OrderTy        1   3.968  3.968  4.34  0.048
    Location       2  13.404  6.702  7.34  0.003
    Interaction    2   0.364  0.182  0.20  0.821
    Error         24  21.929  0.914
    Total         29  39.665

    As with other hypothesis tests, look at the p-values to make a judgment based on your chosen alpha level (typically .05 or .10) as to whether the levels of the factors make a significant difference.

  2. Main effects plots

    • These plots show the average or mean values for the individual factors being compared (you'll have one plot for every factor)
    • Differences between the factor levels will show up in "non-flat" lines: slopes going up or down or zig-zagging up and down

    • For example, the left side of the chart above shows that consumer orders process faster than commercial orders. The right side shows a difference in times between the three locations (California, New York, and Texas).
    • Look at p-values (in the Minitab session output, previous page) to determine if these differences are significant.
  3. Interaction plots

    • Show the mean for different combinations of factors
    • The example below, taken from a standard Minitab data set, shows a different pattern for each region (meaning the factors "act differently" at different locations):

      • In Region 1, color and plain packaging drive higher sales than point-of-sale displays
      • In Region 2, color combined with point-of-sale promotions drives higher sales than color alone
      • Region 3 has lower overall sales; unlike in Region 1 and Region 2, color alone does not improve sales
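A sketch of a two-way ANOVA in Python with statsmodels; the order-type and location data are simulated stand-ins, the C() terms in the formula mark categorical factors, and * requests the interaction:

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    from statsmodels.formula.api import ols

    rng = np.random.default_rng(5)
    df = pd.DataFrame({
        "order_type": np.repeat(["consumer", "commercial"], 15),
        "location": np.tile(["CA", "NY", "TX"], 10),
    })
    df["time"] = (5 + 0.7 * (df["order_type"] == "commercial")
                  + rng.normal(0, 1, size=30))

    model = ols("time ~ C(order_type) * C(location)", data=df).fit()
    print(sm.stats.anova_lm(model, typ=2))   # factor, interaction, error rows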

Chi square test

Highlights

Form of the hypothesis

With the chi-square test for independence, statisticians assume variables are independent until the data show otherwise, therefore:

H0: The variables are independent
Ha: The variables are not independent (they are related)

If the p-value is < .05, then reject H0

How to calculate chi square

  1. Identify different levels of both the X and Y variables

    • Ex: Supplier A vs. Supplier B, Pass or Fail
  2. Collect the data
  3. Summarize results in an observations table

    • Include totals for each column and row
    • The table here shows data on whether age (X) affected whether a candidate was hired (Y)

       

                Hired   Not Hired   Total
      Old          30         150     180
      Young        45         230     275
      Totals       75         380     455

  4. Develop an expected frequency table

    • For each cell in the table, multiply the column total by the row total, then divide by the total number of observations

      Ex: in the table above, the "Old, Hired" cell has an expected frequency of: (75 * 180)/455 = 29.6

    • For each cell, subtract the expected frequency from the actual number of observations

      Ex: in the table above, the "Old, Hired" cell would be: 30 − 29.6 = 0.4

  5. Compute the relative squared differences

    • Square each figure in the table (negative numbers will become positive)

      Ex: 0.4 * 0.4 = 0.16

    • Divide by the expected number of observations for that cell

      Ex: 0.16/29.6 = .005

  6. Add together all the relative squared differences to get chi-square

    Ex: in the table on the previous page:

    Chi-square = χ² = 0.004 + 0.001 + 0.002 + 0.000 = 0.007

  7. Determine and interpret the p-value

    For this example: df = 1, p-value = 0.932
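A sketch of the whole calculation in Python with scipy, using the hiring table above; correction=False turns off the Yates continuity correction so the result matches the hand calculation:

    from scipy.stats import chi2_contingency

    observed = [[30, 150],    # Old:   hired, not hired
                [45, 230]]    # Young: hired, not hired

    chi2, p, dof, expected = chi2_contingency(observed, correction=False)
    print(chi2, p, dof)   # approximately 0.007, 0.932, 1
    print(expected)       # e.g., expected "Old, Hired" count is about 29.7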

  Note 

Minitab or other statistical software will generate the table and compute the chi-square and p-values once you enter the data. All you need to do is interpret the p-value.

  Tip 
  • Your data should have been gathered to ensure randomness. Beware of other hidden factors (Xs).

Design of Experiments (DOE) notation and terms

Response Variable—An output which is measured or observed.

Factor—A controlled or uncontrolled input variable.

Fractional Factorial DOE—Looks at only a fraction of all the possible combinations contained in a full factorial. If many factors are being investigated, information can be obtained with smaller investment. See p. 190 for notation.

Full Factorial DOE—Full factorials examine every possible combination of factors at the levels tested. The full factorial design is an experimental strategy that allows us to answer most questions completely. The general notation for a full factorial design run at 2 levels is: 2^k = # of runs, where k is the number of factors.

Level—A specific value or setting of a factor.

Effect—The change in the response variable that occurs as experimental conditions change.

Interaction—Occurs when the effect of one factor on the response depends on the setting of another factor.

Repetition—Running several samples during one experimental setup run.

Replication—Replicating (duplicating) the entire experiment in a time sequence with different setups between each run.

Randomization—A technique used to spread the effect of nuisance variables across the entire experimental region. Use random numbers to determine the order of the experimental runs or the assignment of experimental units to the different factor-level combinations.

Resolution—how much sensitivity the results have to different levels of interactions.

Run—A single setup in a DOE from which data is gathered. A 3-factor full factorial DOE run at 2 levels has 2^3 = 8 runs.

Trial—See Run.

Treatment Combination—See Run.

Design terminology

In most software programs, each factor in the experiment will automatically be assigned a letter: A, B, C, etc.

Interaction effects are labeled with the letters of the corresponding factors: the interaction between factors A and B is labeled AB, between A and C is labeled AC, and so on.

  Tip 

It's common to find main effects and second-order effects (the interaction of one factor with another) and not unusual to find third-order effects in certain types of experiments (such as chemical processes). However, it's rare that interactions at a higher order are significant (this is referred to as "Sparsity of Effects"). Minitab and other programs can calculate the higher-order effects, but generally such effects are of little importance and are ignored in the analysis.

Planning a designed experiment

Design of Experiments is one of the most powerful tools for understanding and reducing variation in any process. DOE is useful whenever you want to:

Developing an experimental plan

  1. Define the problem in business terms, such as cost, response time, customer satisfaction, service level.
  2. Identify a measurable objective that you can quantify as a response variable. (see p. 187)

    • Ex: Improve the yield of a process by 20%
    • Ex: Achieve a quarterly target in quality or service level
  3. Identify input variables and their levels (see p. 187).
  4. Determine the experimental strategy to be used:

    • Determine if you will do a few medium to large experiments or several smaller experiments that will allow quick cycles of learning
    • Determine whether you will do a full factorial or fractional factorial design (see p. 189)
    • Use a software program such as Minitab or other references to help you identify the combinations of factors to be tested and the order in which they will be tested (the "run order")
  5. Plan the execution of all phases (including a confirmation experiment):

    • What is the plan for randomization? replication? repetition?
    • What if any restrictions are there on randomization (factors that are difficult/impossible to randomize)?
    • Have we talked to internal customers about this?
    • How long will it take? What resources will it take?
    • How are we going to analyze the data?
    • Have we planned a pilot run?
    • Make sure sufficient resources are allocated for data collection and analysis
  6. Perform an experiment and analyze the results. What was learned? What is the next course of action? Carry out more experimentation or apply knowledge gained and stabilize the process at the new level of performance.

Defining response variables

Identifying input variables

Review your process map or SIPOC diagram and/or use cause identification methods (see pp. 145 to 155) to identify factors that likely have an impact on the response variable. Classify each as one of the following:

  1. Controllable factor (X)—Factors that can be manipulated to see their effect on the outputs.

    • Ex: Quantitative (continuous): temperature, pressure, time, speed
    • Ex: Qualitative (categorical): supplier, color, type, method, line, machine, catalyst, material grade/type
  2. Constant (C) or Standard Operating Procedure (SOP)—Procedures that describe how the process is run and identify certain factors which will be held constant, monitored, and maintained during the experiment.
  3. Noise factor (N)—Factors that are uncontrollable, difficult or too costly to control, or preferably not controlled. Decide how to address these in your plans (see details below).

    • Ex: weather, shift, supplier, user, machine age, etc.

Selecting factors

Consider factors in the context of whether or not they are:

  1. Practical

    • Does it make sense to change the factor level? Will it require excessive effort or cost? Would it be something you would be willing to implement and live with?

      • Ex: Don't test a slower line speed than would be acceptable for actual production operations
      • Ex: Be cautious in testing changes in a service factor that you know customers are happy with
  2. Feasible

    • Is it physically possible to change the factor?

      • Ex: Don't test temperature levels in the lab that you know can't be achieved in the factory
  3. Measurable

    • Can you measure (and repeat) factor level settings?

      • Ex: Operator skill level in a manufacturing process
      • Ex: Friendliness of a customer service rep

  Tips for treating noise factors 

A noise (or nuisance) factor is a factor beyond our control that affects the response variable of interest.

  • If the noise factor definitely affects the response variable of interest and is crucial to the process, product, or service performance (such as raw materials)…

    • Incorporate it into the experimental design
    • Limit the scope of the experiment to one case (or level) of the noise factor
  • If the noise factor is completely random and uncontrollable (weather, operator differences, etc.), then randomize the runs to keep it from invalidating the experiment
  • When possible, hold the noise factors constant during the course of the experiment
  Tips for selecting factors 
  • Look for low-hanging fruit

    • High potential for significant impact on key measures
    • No or low cost
    • Easy to implement and change
  • Additional items to consider:

    • Cost-effectiveness
    • Manageability
    • Resources
    • Potential for interactions
    • Time
    • How many ideas you generate

DOE Full factorial vs Fractional factorials (and notations)

Full factorial experiments

Fractional factorial experiments

Loss of resolution with fractional factorials

This experiment will test 4 factors, each at 2 levels, in a half-fraction factorial (a full 2^4 factorial would be 16 runs; this experiment is the equivalent of 2^3 = 8 runs).
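Enumerating the run list is simple in code. A sketch in Python; the three factor names are hypothetical, and the coded levels −1/+1 are the conventional low/high settings:

    from itertools import product

    factors = {"Temp": (-1, 1), "Pressure": (-1, 1), "Speed": (-1, 1)}

    runs = list(product(*factors.values()))   # every combination of levels
    print(len(runs))                          # 2**3 = 8 runs (full factorial)
    for run in runs:
        print(dict(zip(factors, run)))
    # A half fraction keeps 2**(k-1) of these runs, chosen by a defining
    # relation, trading run count for resolution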

A resolution of IV means:

Interpreting DOE results

Most statistical software packages will give you results for main effects, interactions, and standard deviations.

  1. Main effects plots for mean

    • Interpretation of slopes is all relative. Lines with steeper slopes (up or down) have a bigger impact on the output means than lines with little or no slope (flat or almost flat lines).

    • In this example, the line for shelf placement slopes much more steeply than the others—meaning it has a bigger effect on sales than the other factors. The other lines seem flat or almost flat, so the main effects are less likely to be significant.
  2. Main effects plots for standard deviation

    • These plots tell you whether variation changes or is the same between factor levels.
    • Again, you want to compare the slopes to each other. Here, Design has a much steeper slope than the other factors (so you can expect it to have much more variation at one level than at the other level).
  3. Pareto chart of the means for main factor effects and higher-order interactions

    • You're looking for individual factors (labeled with a single letter) and interactions (labeled with multiple letters) that have bars that extend beyond the "significance line"
    • Here, main factor A and interaction AB have significant effects, meaning shelf placement, and the interaction of placement and color, have the biggest impact on sales (compare to the "main effects plot for mean," previous page).
  4. Pareto chart on the standard deviation of factors and interactions

    • Same principle as the Pareto chart on means

    • Here, only Factor C (Design) shows a significant change in variation between levels
  5. Minitab session window reports

    • Shelf Placement and the Shelf Placement*Color interaction are the only significant factors at a 90% confidence interval (if alpha were 0.05 instead of 0.10, only placement would be significant)

      Fractional Factorial Fit: Sales versus Shelf Placem, Color, Design, Text

      Term               Effect     Coef   SE Coef        T      P
      Constant                    128.50    0.2500   514.00  0.001
      Shelf Pl           −38.50   −19.25    0.2500   −77.00  0.008
      Color                2.00     1.00    0.2500     4.00  0.156
      Design               0.50     0.25    0.2500     1.00  0.500
      Text                −0.00    −0.00    0.2500    −0.00  1.000
      Shelf Pl*Color       3.50     1.75    0.2500     7.00  0.090
      Shelf Pl*Design     −3.00    −1.50    0.2500    −6.00  0.105

      Analysis of Variance for Sales (coded units)

      Source               DF   Seq SS   Adj SS   Adj MS      F      P
      Main Effects          4  2973.00  2973.00  743.250  1E+03  0.019
      2-Way Interactions    2    42.50    42.50   21.250  42.50  0.108
      Residual Error        1     0.50     0.50    0.500
      Total                 7  3016.00
    • Design is the only factor that has a significant effect on variation at the 90% confidence level

      Fractional Factorial Fit: Std Dev versus Shelf Placement, Color, …

      Term                Effect      Coef   SE Coef       T      P
      Constant                      9.0000    0.2500   36.00  0.018
      Shelf Pl           −1.5000   −0.7500    0.2500   −3.00  0.205
      Color              −0.0000   −0.0000    0.2500   −0.00  1.000
      Design              6.5000    3.2500    0.2500   13.00  0.049
      Text                1.0000    0.5000    0.2500    2.00  0.295
      Shelf Pl*Color      0.5000    0.2500    0.2500    1.00  0.500
      Shelf Pl*Design     0.0000    0.0000    0.2500    0.00  1.000

      Analysis of Variance for Std (coded units)

      Source               DF   Seq SS   Adj SS   Adj MS      F      P
      Main Effects          4  91.0000  91.0000  22.7500  45.50  0.111
      2-Way Interactions    2   0.5000   0.5000   0.2500   0.50  0.707
      Residual Error        1   0.5000   0.5000   0.5000
      Total                 7  92.0000

Residual analysis in hypothesis testing

Highlights

The four standard residual plots and what to look for in each:

  • Normal probability plot: if data points hug the diagonal line, the residuals are normally distributed
  • Residuals versus fitted values: you want to see a similar spread of points across all values (which indicates equal variance)
  • Histogram: provides a visual check of normality
  • Residuals versus observation order: a large number of data points can make this chart difficult to analyze, but the principles are the same as those for time series plots

Interpreting the results

The plots are usually generated in Minitab or other statistical package. The interpretation is based on the following assumptions:

Examine each plot as you would any chart of its type (regression plot, histogram, scatter plot, etc.).

  Practical Note 

Moderate departures from normality of the residuals are of little concern. We always want to check the residuals, though, because they are an opportunity to learn more about the data.
