Ch. 17 Resources

Chapter 17: Probability Models

In previous chapters we have seen many different probability models. This chapter looks at three special types of models: the geometric model, the binomial model and the Poisson model.

The geometric model is sometimes useful and has the benefit that the associated probability formula is relatively easy to derive and understand.

The binomial model will be the most important of the three, as we will use it a great deal throughout the next part of the course; the associated probability formulas are not quite as easy to derive (and we won't do so in this class) but there are quick shortcuts available on the TI-84. For this reason, you should ignore the binomial probability formula in the book and use the calculator instead.

The Poisson model does have interesting applications, but we will rarely encounter it again in this course; for that reason you can consider the material and exercises about the Poisson model optional and it will not be covered on exams.

Bernoulli trials

A Bernoulli trial consists of a situation in which we select one person or thing from a larger group, such as randomly selecting people from the general population for the purpose of a public opinion poll.

With each trial we must ask a yes-or-no question, e.g. "Do you plan to vote for Barack Obama for president in the 2012 presidential election?" Notice that this question can be answered "yes" or "no"; if we had asked "Who are you going to vote for?" the answer might have been "Obama" or "Romney" or "Gingrich" or "Santorum" or "Paul" or "I don't know" or "I don't care" or a whole host of other responses. By knowing the answer to the second question we also know the answer to the first, but for our purposes it is important to phrase the question in a yes-or-no fashion. We often call a "yes" answer a "success" and a "no" answer a "failure," even when a "success" might be an affirmative response to a question like "Did the patient come down with the flu?"

In a Bernoulli trial we must also have a fixed probability of success. This is not true when we draw without replacement from a small group: if we have 20 M&Ms and 5 of them are green, the probability of choosing an M&M and getting a green one is 5/20 on the first trial, but on the second trial the probability of getting a green candy has changed to 4/19 (if we selected a green M&M the first time) or 5/19 (if not); this situation would not constitute a Bernoulli trial.

In the presidential poll example, we are most likely polling people selected from among all voters in the U.S. If 48% of 1,000,000 voters favor Obama, the probability of choosing an Obama supporter on the first trial (480,000/1,000,000 = 0.48) is essentially the same as choosing an Obama supporter on the second trial (479,999/999,999 ≈ 0.48 or 480,000/999,999 ≈ 0.48), so the "fixed probability of success" condition is satisfied in this case, even though the size of the population is finite.
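To see the contrast between the two examples side by side, here is a small Python sketch. Python is not part of this course, and the function name `second_draw_probabilities` is made up for illustration; it simply redoes the arithmetic above with exact fractions.

```python
from fractions import Fraction

def second_draw_probabilities(successes, population):
    """Return P(success on 1st draw), P(success on 2nd | 1st was a success),
    and P(success on 2nd | 1st was a failure), drawing without replacement."""
    first = Fraction(successes, population)
    after_success = Fraction(successes - 1, population - 1)
    after_failure = Fraction(successes, population - 1)
    return first, after_success, after_failure

mm = second_draw_probabilities(5, 20)                    # 5 green M&Ms out of 20
voters = second_draw_probabilities(480_000, 1_000_000)   # 48% of 1,000,000 voters

for label, probs in [("M&Ms", mm), ("voters", voters)]:
    print(label, [round(float(prob), 6) for prob in probs])
```

For the M&Ms the three values are noticeably different (5/20, 4/19 and 5/19), while for the voters all three agree with 0.48 to several decimal places.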

Finally, the trials need to be independent. In the M&M example, the trials are not, since the probability of choosing a green M&M on the second trial depends on the outcome of the first trial. The trials in the Obama example are not truly independent, but as long as we use a simple random sample (calling only voters in Illinois, Obama's home state, or only people who had visited Obama's Web site would violate this condition) and select less than 10% of the population for our sample (which would be true if we contacted 1,000 voters out of a population of 1,000,000, say) then it is reasonable to assume that the trials are independent.

Once we have a Bernoulli trial we can compute probabilities associated with geometric and binomial models.

Geometric model

A geometric model counts the number of trials until the first success. According to the EdCC Web site, 55% of students enrolled at EdCC are female. Suppose we randomly select students from among all of those enrolled at EdCC until we find a female student. This constitutes a Bernoulli trial because:

There are two possible outcomes: female (success) and male (failure).

There is a constant probability of success (55%); this isn't precisely true, but since there are over 13,000 students enrolled at EdCC, removing a few of these students from contention won't significantly change the probability of finding a female student among those remaining.

The trials are independent: since we are randomly selecting the students, and we plan to select less than 10% of all students, the gender of one selected student should have no effect on the gender of the other students selected.

Let the random variable `X` represent the number of students we need to contact until we find a female student. We note that:

`P(X=1) = 0.55`

since the probability that the first student we select is female is 55%. We then compute:

`P(X=2) = P(text(first student is not female and second student is female)) = (0.45)(0.55) = 0.2475`

and then:

`P(X=3) = P(text(first and second students are not female and third student is female)) = (0.45)(0.45)(0.55) approx 0.1114`

and then:

`P(X=4) = P(text(first and second and third students are not female and fourth student is female)) = (0.45)(0.45)(0.45)(0.55) approx 0.0501`

and so on. By this point you should notice a pattern:

`P(X=k) = (0.45)^(k-1)(0.55)`

and more generally:

`P(X=k) = q^(k-1)*p`

where `p` is the probability of success for any geometric model and `q=1-p` is the corresponding probability of failure.

Returning to our EdCC student example, we can collect our probabilities in a table that looks like this:

`x` `P(X=x)`
1 0.5500
2 0.2475
3 0.1114
4 0.0501
`vdots` `vdots`
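The same table can be reproduced with a few lines of Python. This is only a sketch for readers who want to check the arithmetic away from the calculator; the function name `geometric_pmf` is our own choice, not a built-in.

```python
def geometric_pmf(p, k):
    """P(X = k): k - 1 failures followed by one success."""
    q = 1 - p
    return q ** (k - 1) * p

p = 0.55  # proportion of female students, from the example above
table = {k: geometric_pmf(p, k) for k in range(1, 5)}
for k, prob in table.items():
    print(k, f"{prob:.4f}")
```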

Notice that this probability model continues forever. Sure, there is only a very small chance that we will need to contact more than 10 EdCC students to find a female student, but theoretically we could contact 20 or 30 or 100 students and not find a female.

The geometric probability formula is also built into the TI-84 (under the DISTR menu); for example

`P(X=4)` = geometpdf(0.55,4) ≈ 0.0501

How do we find the mean and standard deviation of the geometric model? We could use the formulas from Chapter 16, but one problem is that the sums involved for the geometric model are infinite sums, which (unless you've taken three quarters of calculus) you have most likely never encountered before. However it is a fact that the standard formula for the mean reduces to a much simpler formula:

`mu = E(X) = 1/p`

Why is it true that `mu=1/p`? You can find the ugly details in a link below, but the formula itself should be intuitive: If, for example, 20% of all M&Ms (or 1 in 5) are green, then on average we would need to randomly select `mu = 1/p = 1/0.2 = 5` M&Ms before we found a green one. Sometimes, of course, we would find a green one sooner, and sometimes it might take more than 5 trials, but the average would be 5: if we assigned everyone in class to randomly draw M&Ms out of a large batch until they found a green one, there would be a variety of answers, but the average of the number drawn would be about 5.
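The classroom experiment described above can be imitated on a computer. The following Python sketch is a simulation, not a proof: it assumes a batch so large that the chance of green stays at 20% on every draw, and the seed and the number of repetitions are arbitrary choices.

```python
import random

def draws_until_success(p, rng):
    """Count Bernoulli trials up to and including the first success."""
    count = 1
    while rng.random() >= p:  # this draw was a failure (probability 1 - p)
        count += 1
    return count

rng = random.Random(17)  # fixed seed so the run is repeatable
p_green = 0.2            # 20% green, as in the example above
results = [draws_until_success(p_green, rng) for _ in range(100_000)]
average = sum(results) / len(results)
print(round(average, 2))  # should land near 1 / p_green = 5
```

Individual values in `results` vary a great deal, but their average should settle close to 5.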

What is the standard deviation of the geometric model?

`sigma = SD(X) = sqrt(q)/p`

Again, the ugly details can be found elsewhere. While we will sometimes need to compute the mean of a geometric model, we will rarely need to compute the standard deviation.

In our EdCC student example, the mean would be `mu = 1/p = 1/0.55 approx 1.8` (in other words, on average we would need to randomly contact about 1.8 students in order to reach a female).
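As a numerical check (a sketch in Python, not something you need for this class), we can evaluate both formulas for the EdCC example and compare the mean with a partial sum of the infinite series that defines it. The cutoff of 200 terms is an arbitrary choice.

```python
import math

p = 0.55
q = 1 - p

mean = 1 / p
sd = math.sqrt(q) / p

# Partial sum of k * P(X = k); the terms shrink so quickly that
# 200 terms already agree with 1/p to many decimal places.
partial_mean = sum(k * q ** (k - 1) * p for k in range(1, 201))

print(round(mean, 4), round(sd, 4), round(partial_mean, 4))
```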

The geometric probability formula is fairly straightforward, so in most cases the geometpdf feature on the TI-84 is not much easier than using the formula directly; however in certain cases a related feature on the TI-84 can be very useful. Suppose we want to know the probability of finding a female among the first three students we contact. This is given by:

`P(X leq 3) = P(X=1) + P(X=2) + P(X=3) approx 0.55 + 0.2475 + 0.1114 = 0.9089`

but you can get the answer even faster using the TI-84:

`P(X leq 3)` = geometcdf(0.55,3) ≈ 0.9089

Note that we are using geometcdf here (the c stands for "cumulative"); it gives us the probability of achieving success using at most the specified number of trials.
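A cumulative geometric probability can be sketched in Python in two ways: by adding the individual probabilities, or by noting that "success within the first k trials" is the complement of "k failures in a row." The function names here are our own, and the complement shortcut is an addition that does not appear in the text above.

```python
def geometric_pmf(p, k):
    """P(X = k): k - 1 failures followed by one success."""
    return (1 - p) ** (k - 1) * p

def geometric_cdf(p, k):
    """P(X <= k): add up P(X = 1) through P(X = k)."""
    return sum(geometric_pmf(p, j) for j in range(1, k + 1))

p = 0.55
by_sum = geometric_cdf(p, 3)
by_complement = 1 - (1 - p) ** 3  # 1 - P(first three students are all male)
print(f"{by_sum:.4f}", f"{by_complement:.4f}")
```

Both approaches give the same value, which matches the sum of the first three table entries.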

Binomial model

Now let's consider a different, but related, question involving Bernoulli trials. Suppose we randomly select five EdCC students. What is the probability that none of them are female?

A binomial model is appropriate here: not only is a Bernoulli trial involved, there is a fixed number of trials (in this case, 5). Let `Y` be a random variable representing the number of females in our five-person sample. We then have:

`P(Y=0) = (0.45)^5 approx 0.0185`

What is the probability that exactly one student is female? We could compute the probability that only the first student is female:

`(0.55)(0.45)(0.45)(0.45)(0.45) = (0.55)(0.45)^4 approx 0.0226`

but the one female might be the second student instead:

`(0.45)(0.55)(0.45)(0.45)(0.45) = (0.55)(0.45)^4 approx 0.0226`

or the third or the fourth or the fifth. These five probabilities are all the same, so the probability that exactly one of the five students is female is given by:

`P(Y=1) = 5 times (0.55)(0.45)^4 approx 0.1128`
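The "first or second or third or fourth or fifth" argument can be checked by brute force. This Python sketch lists every possible sequence of five students, keeps those with exactly one female, and adds up their probabilities. The labels "F" and "M" and the function name are illustrative choices.

```python
from itertools import product

p, q = 0.55, 0.45  # P(female), P(male)

def sequence_probability(seq):
    """Probability of one specific sequence, e.g. ('F', 'M', 'M', 'M', 'M')."""
    prob = 1.0
    for student in seq:
        prob *= p if student == "F" else q
    return prob

# All 2^5 = 32 sequences, filtered down to those with exactly one female.
one_female = [seq for seq in product("FM", repeat=5) if seq.count("F") == 1]
p_exactly_one = sum(sequence_probability(seq) for seq in one_female)
print(len(one_female), f"{p_exactly_one:.4f}")
```

There are five such sequences, each with the same probability, which is where the factor of 5 comes from.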

After this point it gets more complicated. For the probability that exactly two of the five students are female we need to determine the number of ways we can select two of the five students to be female; without too much difficulty we could derive a formula for this, and then derive a formula for the corresponding binomial probability. However, this formula isn't quite as intuitive as the geometric formula (and in practice we'll use the calculator anyway) so let's just jump directly to the calculator:

`P(Y=2)` = binompdf(5,0.55,2) ≈ 0.2757

Let's check our previous answers using the calculator:

`P(Y=0)` = binompdf(5,0.55,0) ≈ 0.0185

`P(Y=1)` = binompdf(5,0.55,1) ≈ 0.1128

and compute the remaining probabilities:

`P(Y=3)` = binompdf(5,0.55,3) ≈ 0.3369

`P(Y=4)` = binompdf(5,0.55,4) ≈ 0.2059

`P(Y=5)` = binompdf(5,0.55,5) ≈ 0.0503
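If you are curious what the calculator is doing, the following Python sketch computes the same six probabilities using the counting formula listed in the Errata section below. The function name `binomial_pmf` is our own; `math.comb` is the standard-library function for the number of ways to choose k items from n.

```python
from math import comb

def binomial_pmf(n, p, k):
    """P(Y = k): (number of ways to place k successes) * p^k * q^(n-k)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 5, 0.55
pmf = [binomial_pmf(n, p, k) for k in range(n + 1)]
for k, prob in enumerate(pmf):
    print(k, f"{prob:.4f}")
print("total:", round(sum(pmf), 10))
```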

You should check that the sum of these probabilities is 1. There are fairly simple formulas for the mean and standard deviation of a binomial distribution:

`mu = E(Y) = np` and `sigma = SD(Y) = sqrt(npq)`

The gory details of deriving these formulas can be found elsewhere. In our EdCC student example:

`mu = E(Y) = np = 5(0.55) = 2.75`

so, if we repeatedly selected 5 EdCC students at random and counted the number of female students in each 5-student group, we would expect to find an average of about 2.75 females per group, with a standard deviation of:

`sigma = sqrt(npq) = sqrt(5*0.55*0.45) approx 1.11`.
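To see that the shortcut formulas agree with the general definitions of mean and standard deviation for a discrete random variable, here is a Python sketch that computes both versions for the five-student example. The variable names are illustrative.

```python
from math import comb, sqrt

n, p = 5, 0.55
q = 1 - p

# Shortcut formulas.
mean_formula = n * p
sd_formula = sqrt(n * p * q)

# Directly from the probability model: sum of (value * probability), etc.
pmf = [comb(n, k) * p ** k * q ** (n - k) for k in range(n + 1)]
mean_direct = sum(k * prob for k, prob in enumerate(pmf))
var_direct = sum((k - mean_direct) ** 2 * prob for k, prob in enumerate(pmf))

print(round(mean_formula, 4), round(mean_direct, 4))
print(round(sd_formula, 4), round(sqrt(var_direct), 4))
```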

If we want to know the probability that we find no more than two females in our five-student sample, we can add the appropriate individual probabilities found above, or we can use the binomcdf feature on the TI-84:

`P(Y leq 2)` = binomcdf(5,0.55,2) ≈ 0.4069

Note that we are using the binomcdf feature here (where again c stands for "cumulative").
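A cumulative binomial probability is just a running total of individual binomial probabilities. The Python sketch below (with a made-up function name, `binomial_cdf`) adds up P(Y = 0), P(Y = 1) and P(Y = 2) for the five-student example.

```python
from math import comb

def binomial_cdf(n, p, k):
    """P(Y <= k): sum of P(Y = 0) through P(Y = k)."""
    return sum(comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(k + 1))

at_most_two = binomial_cdf(5, 0.55, 2)
print(f"{at_most_two:.4f}")
```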

Binomial model revisited

Various polls conducted in late January 2007 reported that about 35% of registered voters supported President Bush's plan to send additional troops to Iraq. Suppose we wanted to ask these voters more detailed questions about the reasons they support the troop surge. We randomly select 1000 registered voters for our follow-up survey.

Let `X` represent the number of voters (out of the 1000 we contact) who approve of the troop surge. Let's check that a binomial model applies to this random variable:

Two outcomes: Either the voter answers "yes, I support the troop surge" or they do not; if they do not answer "yes" they might say "no, I do not support the troop surge" or "I don't know" or "I have no opinion" but we will classify all such responses as failures and we will consider a voter who answers "yes" as a success.

Constant probability of success: We are assuming that 35% of all registered voters support the troop surge. Since we are only sampling 1000 voters out of more than 100,000,000, the probability of success should not substantially change as we remove voters from the general population into our sample.

Independent trials: Since we are selecting the voters at random, their opinions about the troop surge should be independent of one another.

Suppose we want to get at least 400 voters who support the troop surge for our follow-up survey. What is the probability that our sample of 1000 randomly selected voters will contain at least 400 people who support the troop surge?

Since a binomial model applies, we can use the methods outlined above to find the answer. Note that:

`P(X geq 400)` = 1−binomcdf(1000,0.35,399) ≈ 0.00057 = 0.057%

so it is very unlikely that we will get enough voters in our sample who support the troop surge. We should plan to contact more than 1000 voters for our follow-up survey.
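For readers who want to reproduce this tail probability without a TI-84, here is one possible Python sketch. With n = 1000 the counting numbers become enormous and the powers of p and q become tiny, so this version works with logarithms (via `math.lgamma`) to keep the arithmetic stable; that is our own design choice, not necessarily how the calculator does it. It also adds the probabilities for 400 through 1000 directly, which is equivalent to subtracting the cumulative probability up to 399 from 1.

```python
from math import exp, lgamma, log

def binomial_pmf(n, p, k):
    """P(X = k), computed with logarithms so that very large and very
    small intermediate values do not overflow or underflow."""
    log_ways = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_ways + k * log(p) + (n - k) * log(1 - p))

n, p = 1000, 0.35
at_least_400 = sum(binomial_pmf(n, p, k) for k in range(400, n + 1))
print(f"{at_least_400:.5f}")
```

The result should agree with the calculator value above to within rounding.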

We've answered the question at hand, but there's another way to solve this problem that will be useful in future applications, so we employ it now to show that it gives (approximately) the same answer as our first method.

When `np geq 10` and `nq geq 10` we can approximate the binomial model with a Normal model. (Now is a good time to revisit the Normal model computations from Chapter 6 if necessary; to see why this condition is necessary, see the discussion on page 440.)

In our present example we have `p = 0.35` so:

`np = 1000*0.35 = 350 geq 10`

and:

`nq = 1000*0.65 = 650 geq 10`

so this condition is certainly satisfied.

Of course, in order to use a Normal model we need to specify the mean and standard deviation. For our present example, the mean is:

`mu = np = 1000*0.35 =350`

and the standard deviation is:

`sigma = sqrt(npq) = sqrt(1000*0.35*0.65) approx 15.08`.

Thus we want to use the Normal model N(350,15.08). To compute `P(X geq 400)` we compute:

normalcdf(400,1E99,350,15.08) ≈ 0.00046

Notice that this isn't equal to the exact answer we got with 1−binomcdf(1000,0.35,399), but it's fairly close (in fact, close enough to reach the same conclusion that we will need to survey more than 1000 voters).
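The Normal approximation can also be sketched in Python using `statistics.NormalDist` from the standard library (Python 3.8 or later). This is an illustration under the same assumptions as above (n = 1000, p = 0.35); it uses the unrounded standard deviation, so the last digit may differ slightly from a calculation based on 15.08.

```python
from math import sqrt
from statistics import NormalDist

n, p = 1000, 0.35
mu = n * p
sigma = sqrt(n * p * (1 - p))

model = NormalDist(mu, sigma)
upper_tail = 1 - model.cdf(400)  # area under the Normal curve to the right of 400
print(round(sigma, 2), f"{upper_tail:.5f}")
```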

We can sometimes get a more accurate answer if we realize that our random variable `X` is discrete: it might equal 398, or 399, or 400, or 401, but it can't equal 399.8; thus, any number between 399.5 and 400.5 would be interpreted as being equal to 400. So we adjust our computation slightly and use:

`P(X geq 400)` ≈ normalcdf(399.5,1E99,350,15.08) ≈ 0.00051,

which is closer (although still not equal to the exact answer). This adjustment is called a continuity correction; while you're welcome to use it, our Normal model approximations will usually be good enough without it. Just keep in mind that the Normal model only gives us an approximation to the binomial model.
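To compare the two Normal approximations directly, the following Python sketch computes the upper-tail area starting at 400 and at 399.5. It is only an illustration using `statistics.NormalDist`; the variable names are our own.

```python
from math import sqrt
from statistics import NormalDist

n, p = 1000, 0.35
model = NormalDist(n * p, sqrt(n * p * (1 - p)))

without_correction = 1 - model.cdf(400)
with_correction = 1 - model.cdf(399.5)  # start half a unit below 400
print(f"{without_correction:.5f}", f"{with_correction:.5f}")
```

Starting at 399.5 captures a little more area, which moves the approximation toward the exact binomial answer in this example.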

Homework

Work the following exercises in Chapter 17: 1, 3, 5, 15, 19, 21, 23, 27–33 odd and 41.

Errata

The binomial probability formula on page 437 (which you should ignore; use the TI-84 instead) is incorrect. It should read:

`P(X=x) = text( )_nC_x p^x q^(n-x)`

where

`text( )_nC_x = (n!)/(x! (n-x)!)`

ActivStats

Work through the lessons on page 17-1 in the ActivStats lesson book, as time permits.

Additional Resources

Binomial Probability Histogram
This applet from the University of California at Berkeley Statistics Department offers useful insight into how the Normal model approximates the binomial model.
Binomial Distributions
Episode 17 from Against All Odds features a discussion of binomial probabilities, although some of the terminology may be different.
Sofia: Binomial Probability
Lesson 4.3 of the Sofia Open Content Initiative's Elementary Statistics course includes a discussion of the binomial model and Lesson 4.4 discusses the Poisson model.