Geometric Probability Models

So far we have looked at several different probability models. We will now begin looking at some very special families of probability models with important applications. The first of these (the geometric model) is sometimes useful and has the benefit that the associated probability formula is relatively easy to derive and understand.

Bernoulli trials
A Bernoulli trial consists of a situation in which we select one person or thing from a larger group, such as randomly selecting people from the general population for the purpose of a public opinion poll, where the following three conditions are satisfied:

With each trial we must ask a yes-or-no question, e.g. "Do you plan to vote for Barack Obama for president in the 2012 presidential election?" Notice that this question can be answered "yes" or "no"; if we had asked "Who are you going to vote for?" the answer might have been "Obama" or "Romney" or "Johnson" or "Stein" or "I don't know" or "I don't care" or a whole host of other responses. By knowing the answer to the second question we also know the answer to the first, but for our purposes it is important to phrase the question in a yes-or-no fashion. We often call a "yes" answer a "success" and a "no" answer a "failure." (Even when a "success" might be an affirmative response to a question like "Did the patient contract a fatal disease?")

In a Bernoulli trial we must also have a fixed probability of success. This is not true when we sample without replacement from a small population: if we have 20 M&Ms and 5 of them are green, the probability of choosing an M&M and getting a green one is 5/20 on the first trial, but on the second trial the probability of getting a green candy has changed to 4/19 (if we selected a green M&M the first time) or 5/19 (if not); this situation would not constitute a Bernoulli trial. In the presidential poll example, we are most likely polling people selected from among all voters in the U.S. If 48% of 100,000,000 voters favor Obama, the probability of choosing an Obama supporter on the first trial (48,000,000/100,000,000 = 0.48) is essentially the same as choosing an Obama supporter on the second trial (47,999,999/99,999,999 ≈ 0.48 or 48,000,000/99,999,999 ≈ 0.48), so the "fixed probability of success" condition is satisfied in this case, even though the size of the population is finite.

Finally, the trials need to be independent. In the M&M example, the trials are not, since the probability of choosing a green M&M on the second trial depends on the outcome of the first trial. The trials in the Obama example are not truly independent, but as long as we use a simple random sample (calling only voters in Illinois, Obama's home state, or people who had visited Obama's Web site would violate this condition) and select a relatively small sample from the population (which would be true if we contacted 1,000 voters out of a population of 100,000,000, say) then it is reasonable to assume that the trials are independent.
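
The contrast between the M&M example and the poll example can be checked with a few lines of Python. This is only an illustrative sketch: the helper name second_draw_probabilities is made up for this example, and it assumes we draw without replacement from a population with a known number of "successes."

```python
from fractions import Fraction

def second_draw_probabilities(successes, population):
    """Return the probability of a success on the second draw,
    first assuming the first draw was a success, then assuming
    it was a failure (sampling without replacement)."""
    after_success = Fraction(successes - 1, population - 1)
    after_failure = Fraction(successes, population - 1)
    return after_success, after_failure

# 20 M&Ms, 5 of them green: the two probabilities differ noticeably.
mm_after_green, mm_after_other = second_draw_probabilities(5, 20)
print(float(mm_after_green), float(mm_after_other))

# 100,000,000 voters, 48,000,000 Obama supporters: both are essentially 0.48.
poll_after_yes, poll_after_no = second_draw_probabilities(48_000_000, 100_000_000)
print(float(poll_after_yes), float(poll_after_no))
```

For the M&Ms, the second-draw probability depends heavily on the first draw. For the poll, the two values agree to many decimal places, which is why treating the trials as independent with a fixed probability of success is reasonable there.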

Many important probability models involve Bernoulli trials.

Geometric model
A geometric model counts the number of Bernoulli trials needed to get the first success (counting the successful trial itself). According to the EdCC Web site, 55% of students enrolled at EdCC are female. Suppose we randomly select students from among all of those enrolled at EdCC until we find a female student. Each selection constitutes a Bernoulli trial because:

There are two possible outcomes: female (success) and male (failure).

There is a constant probability of success (55%); this isn't precisely true, but since there are over 10,000 students enrolled at EdCC, removing a few of these students from contention won't significantly change the probability of finding a female student among those remaining.

The trials are independent: since we are randomly selecting the students, and we plan to select a relatively small sample (less than 1%) from the population of all students, the gender of one selected student should have no effect on the gender of the other students selected.

Let the random variable X represent the number of students we need to contact until we find a female student. We note that:

P(X = 1) = 0.55

since the probability that the first student we select is female is 55%. We then compute:

P(X = 2) = 0.45×0.55 = 0.2475

because we need the first student to be male and the second to be female. We then compute:

P(X = 3) = 0.45×0.45×0.55 = (0.45)^2×0.55 ≈ 0.1114

and then:

P(X = 4) = 0.45×0.45×0.45×0.55 = (0.45)^3×0.55 ≈ 0.0501

and so on. By this point you should notice a pattern:

P(X = k) = (0.45)^(k−1)×0.55

and more generally:

P(X = k) = q^(k−1)×p

where p is the probability of success for a particular geometric model and q = 1−p is the corresponding probability of failure.
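
As a quick check of the general formula, here is a minimal Python sketch. The function name geometric_pmf is our own choice for illustration (it is not a built-in); it simply evaluates the formula above for a given p and k.

```python
def geometric_pmf(p, k):
    """Probability that the first success occurs on trial k,
    where p is the probability of success on each trial."""
    if not 0 < p <= 1:
        raise ValueError("p must satisfy 0 < p <= 1")
    if k < 1:
        raise ValueError("k must be a positive whole number")
    q = 1 - p  # probability of failure on each trial
    return q ** (k - 1) * p

# EdCC example: p = 0.55
for k in range(1, 5):
    print(k, round(geometric_pmf(0.55, k), 4))
```

The printed values should match the probabilities computed by hand above for k = 1 through 4.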

Returning to our EdCC student example, we can collect our probabilities in a table that looks like this:

k          P(X = k)
1          0.5500
2          0.2475
3          0.1114
4          0.0501
`vdots`    `vdots`

Notice that this probability model continues forever. Sure, there is only a very small chance that we will need to contact more than, say, 10 EdCC students to find a female student, but theoretically we could contact 20 or 30 or 100 students and not find a female.
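
To get a feel for how small "a very small chance" is, the following sketch adds up the first ten probabilities in the table and computes what is left over. It assumes p = 0.55 as in the EdCC example, and it uses the observation that needing more than n students means the first n students were all male, which has probability q^n. The name prob_more_than is just a label chosen for this example.

```python
p = 0.55
q = 1 - p

def prob_more_than(n, q):
    """P(X > n): the first n trials are all failures."""
    return q ** n

# Sum of P(X = k) for k = 1, 2, ..., 10
partial_sum = sum(q ** (k - 1) * p for k in range(1, 11))

# Probability of needing more than 10 students
tail = prob_more_than(10, q)

print(partial_sum)          # very close to 1
print(tail)                 # a small positive number
print(partial_sum + tail)   # the two pieces account for all the probability
```

The leftover probability is tiny but never exactly zero, no matter how many rows of the table we include.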

The geometric probability formula is also built into the TI-84 (under the DISTR menu):

Find geometpdf( then press ENTER and type 0.55 followed by a , and then a 4 and then a ) and press ENTER to compute:

P(X = 4) = geometpdf(0.55,4) ≈ 0.0501

where in general we use geometpdf(p,k) to compute P(X = k).

We don't really need the calculator for this (because the formula is so simple) but some related calculator features will be useful quite soon.

Expected value
How do we find the mean (expected value) of a geometric probability model? We could use the `sum k cdot P(X=k)`  formula we used previously, but one problem is that the sum involved for the geometric model is an infinite sum, which (unless you've taken three quarters of calculus) you may have never encountered before. However, it is a fact that the standard formula for expected value reduces to a much simpler formula:

`mu = E(X) = 1/p`

In the case of the EdCC student survey, this formula tells us we should expect to contact an average of `1/0.55 approx 1.8` students to find a female student (counting the female student herself). While we can't actually contact 1.8 students in any individual survey, if we asked every student in a statistics class to conduct a survey of EdCC students, we might expect that some of them would find a female student on their first try, others would require two tries, some would need three, and so on. But the average number of attempts for the entire group of statistics students should be around 1.8.
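
The "many students each conduct a survey" idea can be imitated with a simulation. This is a rough sketch, not a proof: it assumes each contact is an independent trial with success probability 0.55, and the function name trials_until_success, the seed, and the number of repetitions are all arbitrary choices made for illustration.

```python
import random

def trials_until_success(p, rng):
    """Simulate one survey: count contacts up to and including
    the first success."""
    count = 1
    while rng.random() >= p:  # this contact was a failure
        count += 1
    return count

rng = random.Random(2012)  # fixed seed so the run is repeatable
results = [trials_until_success(0.55, rng) for _ in range(100_000)]
average = sum(results) / len(results)

print(average)     # should land near 1/0.55
print(1 / 0.55)
```

Individual surveys give whole numbers (1, 2, 3, and so on), but the average over many surveys should settle close to 1.8.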

Why is this formula true? Intuitively, it should make sense: if 20% of the population (or 1 in 5) corresponds to a success, it's reasonable to expect that, on average, we would need to contact 5 people to find a success, and 1/0.2 = 5. Proving this mathematically is more complicated; the details are below the exercises if you are interested.

What is the standard deviation of the geometric model? The formula is `SD(X) = (sqrt(q))/(p)`  but we don't often use the standard deviation of a geometric model.
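
Although the sums involved are infinite, the terms shrink so quickly that adding up the first couple of hundred gives a very good approximation. The sketch below does this for the EdCC example (p = 0.55) and compares the results with the two formulas above. Stopping at 200 terms is an arbitrary cutoff chosen for this illustration.

```python
import math

p = 0.55
q = 1 - p
terms = range(1, 201)  # truncate the infinite sums after 200 terms

# Mean: sum of k * P(X = k)
mean = sum(k * q ** (k - 1) * p for k in terms)

# Variance: sum of (k - mean)^2 * P(X = k); the SD is its square root
variance = sum((k - mean) ** 2 * q ** (k - 1) * p for k in terms)
sd = math.sqrt(variance)

print(mean, 1 / p)             # truncated sum vs. the formula 1/p
print(sd, math.sqrt(q) / p)    # truncated sum vs. the formula sqrt(q)/p
```

In each printed pair, the two numbers should agree to many decimal places.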

Cumulative probability
Sometimes we might want to know the probability of finding a female among, say, the first three students we contact. This is given by:

P(X ≤ 3) = P(X = 1 or X = 2 or X = 3) = P(X = 1) + P(X = 2) + P(X = 3) = 0.5500 + 0.2475 + 0.1114 = 0.9089

but there's a faster way to compute this. The alternative to finding a female among the first three students is not finding any females among the first three students (in other words, they're all male), so:

P(X ≤ 3) = 1 − 0.45×0.45×0.45 = 1 − (0.45)^3 = 1 − 0.091125 = 0.908875

In general:

P(X ≤ k) = 1 − q^k

but you can get the answer using the TI-84:

P(X ≤ 3) = geometcdf(0.55,3) ≈ 0.908875

Note that we are using geometcdf here (the c stands for "cumulative"); it gives us the probability of achieving a success using at most the specified number of trials.
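
The two ways of computing a cumulative probability can be compared directly in Python. In this sketch, geometric_cdf is a name chosen for illustration; it evaluates 1 − q^k, while the other calculation adds up the individual probabilities one at a time.

```python
def geometric_cdf(p, k):
    """P(X <= k): probability that the first success occurs
    within the first k trials."""
    q = 1 - p
    return 1 - q ** k

p = 0.55

# Slow way: add P(X = 1) + P(X = 2) + P(X = 3)
by_summing = sum((1 - p) ** (j - 1) * p for j in range(1, 4))

# Fast way: 1 minus the probability that all three are failures
by_complement = geometric_cdf(p, 3)

print(round(by_summing, 6), round(by_complement, 6))
```

Both methods give the same value; the hand calculation that adds rounded table entries differs slightly only because of rounding.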

Exercises

1. Explain why each of the following situations does (or does not) represent a Bernoulli trial.

a) Randomly select 1,000 registered voters from throughout the United States and ask whether they plan to vote in person, vote by mail or not vote at all in the upcoming election.

b) Randomly select 2,000 people from a list of approximately 16,000 Edmonds residents who voted in the 2011 general election to ask whether or not they plan to participate in the 2012 general election.

c) Randomly select 100 people from a list of approximately 16,000 Edmonds residents who voted in the 2011 general election to ask whether or not they plan to participate in the 2012 general election.

d) Ask your Facebook friends if they plan to vote for Obama or Romney.

e) Shuffle a deck of cards, draw one card and record whether or not it is an Ace; replace the card, reshuffle, and repeat this process.

2. According to the Web site of the Snohomish County Auditor, approximately 63.5% of Edmonds voters in the 2011 general election cast their ballot for the winning mayoral candidate. (The others cast their vote for his opponent, wrote in a name, or declined to vote for mayor.) A supporter of the mayor randomly selects voters from a list of people who participated in the 2011 general election and calls them until she reaches a person who did not vote for the mayor (to ask what that person thinks about how the mayor is carrying out his duties).

a) Compute the probability that:

i) she only needs to call one person.

ii) she needs to call exactly four people.

iii) she needs to call at most four people.

iv) she needs to call at least four people.

v) she needs to call three or four people.

b) On average, how many voters should she expect to call before finding someone who fits this description?

 

Expected value proof

`E(X) = sum_{k=1}^{oo} k cdot q^{k-1} p = p + 2qp + 3q^2 p + 4 q^3 p + cdots \ = \ p[1+2q+3q^2+4q^3+ cdots] `

`\ \ \ \ \ \ \ \ = p[(1+q + q^2 + q^3 + cdots) + (q+q^2+q^3+cdots) + (q^2+q^3+cdots) + cdots] = p[(1+q+q^2+q^3 +cdots) + q(1+q+q^2+q^3+cdots) + q^2(1+q+q^2+q^3 + cdots) + cdots]`

` \ \ \ \ \ \ \ \ \ = p(1+q+q^2+q^3+cdots)[1+q+q^2+q^3+cdots] = p cdot (1)/(1-q) cdot (1)/(1-q) = p cdot 1/p cdot 1/p = 1/p`

Here we have used a fact about the sum of a geometric series, which holds whenever x is between −1 and 1 (as q is here):

`1 + x+x^2+x^3 + cdots = 1/(1-x)`
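
If the geometric series fact is unfamiliar, a numerical check may make it more believable. The sketch below adds up more and more terms for x = 0.45 (the value of q in the EdCC example) and compares each partial sum with 1/(1 − x). The function name geometric_partial_sum is chosen just for this illustration.

```python
def geometric_partial_sum(x, n_terms):
    """Return 1 + x + x^2 + ... + x^(n_terms - 1)."""
    return sum(x ** j for j in range(n_terms))

x = 0.45
limit = 1 / (1 - x)

# The partial sums get closer and closer to 1/(1 - x).
for n_terms in (5, 10, 20, 40):
    print(n_terms, geometric_partial_sum(x, n_terms), limit)
```

This is evidence rather than a proof, but it shows how quickly the partial sums settle down when x is well below 1.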