Six Sigma and Beyond: Statistics and Probability, Volume III

In this appendix, we consider a particular random variable that is a generalization of the coin-tossing random variable. Toward this objective, consider an experiment in which the outcome can always be classified as a success or a failure. In the coin-tossing experiment, success would correspond to getting a head, and failure would correspond to getting a tail. Let p denote the probability of getting a success and let q = 1 - p denote the probability of getting a failure. Further, let the experiment be repeated n times, and let x denote the total number of successes obtained in the n repetitions of the experiment. In the coin-tossing experiment, for example, we would have p = q = 1/2 and n = 3. In terms of this notation, the basic problem is to find the probability distribution of the random variable x.

It is assumed in problems of this type that the n experiments are independent in a probability sense and, therefore, that the multiplication rule for independent events may be applied to them. The n independent repetitions of the experiment are usually called the n trials of the experiment. A sequence of independent trials such as this, in which the probability of success is the same for all trials, is called a sequence of Bernoulli trials. The name honors Jacob Bernoulli, a Swiss mathematician who pioneered the study of probability. The coin-tossing experiment is an illustration of a sequence of three Bernoulli trials for which the probability of success in any given trial is 1/2.

The technique for finding the probability distribution of x is a generalization of that used in Chapter 16 for the coin-tossing problem. We first calculate the probabilities for all possible sequences of outcomes and then add the probabilities of those sequences that yield the same value of x. Suppose, for example, that we wish to calculate P{k}, where this symbol denotes the probability that the random variable x will assume the value k, and where k is some integer between 0 and n. One possible sequence that will make x = k is the following one in which all the successes occur first, followed by all failures:
S S ... S F F ... F   (k S's followed by n - k F's)
Another such sequence is the following one in which a failure occurs first, followed by k consecutive successes, then followed by the remaining failures. Thus
F S S ... S F ... F   (one F, then k S's, then the remaining n - k - 1 F's)
Because of the independence of the trials, the probability of obtaining the first of these two sequences is given by
p · p ··· p · q · q ··· q = p^k q^(n-k)
The probability for the second sequence is given by
q · p · p ··· p · q ··· q = p^k q^(n-k)
The probability for the two sequences is the same and clearly will be the same for every sequence that satisfies the condition of having k successes and n - k failures.

The number of ways in which the desired event can occur is equal to the number of different sequences of the type just displayed that can be written down, those containing k letters S and n - k letters F. But this number is equal to the number of ways of choosing k positions out of n positions along a line in which to place the letter S; the remaining n - k positions are automatically assigned the letter F. Since we are interested only in which of the n positions are selected, and not in the order in which they are chosen, this is the combination problem of choosing k things from n things. From the combination formula

C(n, k) = n! / [k! (n - k)!]

we can now determine the number of such sequences. For example, with n = 3 and k = 2 there are C(3, 2) = 3 such sequences: SSF, SFS, and FSS.

Since each of these sequences represents one of the mutually exclusive ways in which the desired event can occur, and each such sequence has the same probability of occurring, namely p^k q^(n-k), it follows that the desired probability is obtained by adding this probability as many times as there are sequences. But the number of sequences was just found to be C(n, k); therefore, P{k} is obtained by multiplying p^k q^(n-k) by C(n, k). Hence:

P{k} = C(n, k) p^k q^(n-k),   k = 0, 1, ..., n
This formula gives the probability of x = k, that is, of obtaining k successes in n Bernoulli trials for which the probability of success in a single trial is p. The random variable x is commonly called the binomial variable, and the above formula is a formula for the binomial distribution.
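
As an illustrative sketch (not part of the original text), the formula can be evaluated directly in a few lines of Python; the helper name binomial_pmf below is simply a convenient label introduced here, and math.comb supplies the combination count C(n, k).

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P{k}: probability of exactly k successes in n Bernoulli trials
    with success probability p, i.e. C(n, k) * p**k * q**(n - k)."""
    q = 1.0 - p
    return comb(n, k) * p**k * q**(n - k)

# Coin-tossing check: n = 3, p = 1/2 gives 1/8, 3/8, 3/8, 1/8
print([binomial_pmf(k, 3, 0.5) for k in range(4)])
```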

Although the problems used to motivate this derivation have been related here to games of chance, there are many types of practical problems that can be solved by means of the binomial distribution formula. To illustrate its use, we consider only a few simple problems that require little computation.

Example 1

The probability that parents of a certain blue-brown eye genetic combination will have a child with blue eyes is 1/4. If there are six children in the family, what is the probability that at least half of them will have blue eyes? To solve this problem, the six children in the family will be treated as six independent trials of an experiment for which the probability of success in a single trial is 1/4. Thus n = 6 and p = 1/4 here. It is necessary to calculate P{3}, P{4}, P{5}, and P{6} and sum them because these probabilities correspond to the mutually exclusive ways in which the desired event can occur. By the use of the binomial distribution formula,
P{3} = C(6, 3)(1/4)^3(3/4)^3 = 540/4096 ≈ .132
P{4} = C(6, 4)(1/4)^4(3/4)^2 = 135/4096 ≈ .033
P{5} = C(6, 5)(1/4)^5(3/4)^1 = 18/4096 ≈ .004
P{6} = C(6, 6)(1/4)^6(3/4)^0 = 1/4096 ≈ .000
The probability of getting at least three successes is obtained by adding these probabilities; consequently, writing x ≥ 3 to represent at least three successes, we obtain

P{x ≥ 3} = (540 + 135 + 18 + 1)/4096 = 694/4096 ≈ .17
This result shows that there is a very small chance that a family such as this will have so many blue-eyed children. In only about 17 of 100 such families will at least half of the children be blue-eyed.
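
A quick numerical check of this example (a sketch added here, not from the original text):

```python
from math import comb

# Example 1: n = 6, p = 1/4; probability that at least half (>= 3)
# of the six children have blue eyes.
n, p = 6, 0.25
prob = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(3, n + 1))
print(round(prob, 3))  # about 0.169, i.e. roughly 17 of 100 such families
```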


Example 2

A manufacturer of certain parts for automobiles guarantees that a box of the company's parts will contain at most two defective items. If the box holds 20 parts and experience has shown that the manufacturing process produces 2% defective items, what is the probability that a box of the parts will satisfy the guarantee? This problem can be considered as a binomial distribution problem for which n = 20 and p = .02. A box will satisfy the guarantee if the number of defective parts is 0, 1, or 2. By means of the binomial distribution formula the probabilities of these three events are given by
P{0} = (.98)^20 ≈ .668
P{1} = 20(.02)(.98)^19 ≈ .272
P{2} = 190(.02)^2(.98)^18 ≈ .053
The calculations here were made with the aid of logarithms. Since these are mutually exclusive events, the probability that there will be at most two defective parts, written x ≤ 2, is the sum of these probabilities; hence, the desired answer is

P{x ≤ 2} ≈ .668 + .272 + .053 = .993

This result shows that the manufacturer's guarantee will almost always be satisfied.
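
As a rough check of this example (a sketch, not from the original text), the same sum can be computed directly rather than with logarithms:

```python
from math import comb

# Example 2: n = 20 parts per box, p = 0.02 defective rate;
# probability of at most two defectives, P{x <= 2}.
n, p = 20, 0.02
prob = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(0, 3))
print(round(prob, 3))  # about 0.993
```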


Example 3

As a final illustration, consider the following problem concerning whether it pays to guess on an examination. Suppose an examination consists of 10 questions of the multiple-choice type, with each question having five possible answers but only one of the five being the correct answer. If a student receives 3 points for each correct answer and -1 point for each incorrect answer, and if on each of the 10 questions his probability of guessing the correct answer is only 1/3, what is the student's probability of obtaining a positive total score on those 10 questions?

If x denotes the number of questions answered correctly, then a positive score will result if 3x > 10 - x because the left side of this inequality gives the total number of positive points scored and the right side gives the total number of penalty points. This inequality will be satisfied if x > 10/4, which implies that at least three correct answers must be obtained to realize a positive score. The desired probability is therefore given by
P{x ≥ 3} = 1 - P{0} - P{1} - P{2} = 1 - (2/3)^10 - 10(1/3)(2/3)^9 - 45(1/3)^2(2/3)^8 ≈ 1 - .017 - .087 - .195 = .70
Thus the student has an excellent chance of gaining points if his or her probability of guessing a correct answer is as high as 1/3. If the student knew nothing about the material and selected one of the five alternatives by chance, the probability would, of course, be only 1/5 for each question. It is assumed here, however, that the student knows enough about the subject to be able to discard two of the five possibilities as being obviously incorrect and to make a guess regarding the other three. If the student had no such knowledge, so that his or her probability would be 1/5, then similar calculations would show that it would not pay to guess.
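
A short sketch (added here, not in the original) comparing the two guessing scenarios described above; the function name prob_positive_score is introduced only for illustration:

```python
from math import comb

# Example 3: 10 questions; a positive score requires at least 3 correct
# answers (from 3x > 10 - x, i.e. x > 2.5).
def prob_positive_score(p: float) -> float:
    n = 10
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(3, n + 1))

print(round(prob_positive_score(1/3), 2))  # about 0.70: two choices eliminated
print(round(prob_positive_score(1/5), 2))  # about 0.32: pure blind guessing
```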

