
Ch. 6 Resources
Chapter 6: The Standard Deviation as a Ruler and the Normal Model
Chapter 6 contains quite a bit of material; in the examples that follow I'll concentrate on instructions for using the TI-84 to compute percentages associated with a Normal model and creating a Normal probability plot. Note, however, that many of the problems in Chapter 6 can be solved by using the 68-95-99.7 Rule or other methods entirely.
We will NOT use the table methods shown in the textbook to find the percentages associated with the Normal model; the examples in the text explain how to use Table Z in the appendix, for students who do not have a TI-84. But you do have a TI-84 (or TI-83), so you can use it, as shown below. You will not have a copy of Table Z for the exam, so make sure you know how to use your calculator to do these problems.
Standard Normal model
Suppose we want to find the percentage of the Normal model with z-scores between 1 and 2. (That is, the percentage of persons/things we would expect to see with data values between 1 and 2 standard deviations above the mean data value.) We enter the following into the TI-84: normalcdf(1,2)
To do this, press 2ND then VARS to get to the DISTR (or distribution) menu. Move the cursor down to normalcdf(:
and press ENTER then type 1 followed by , (a comma) and 2 and then ) to match the opening parenthesis:
Finally press ENTER again and you should see:
which means that about 13.6% of the Normal model should be between 1 and 2 standard deviations above the mean.
You can approximate this answer with the 68-95-99.7 Rule; you should do this to check your understanding of that rule, and be sure to draw a picture:
In fact you should draw a picture for all of the problems of this type regardless of whether or not you use the TI-84 to solve the problem.
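If you want to double-check a calculator answer on a computer, here is a minimal sketch in Python. It assumes Python 3.8 or later, whose standard library includes statistics.NormalDist; the helper name normalcdf is only borrowed from the TI-84 command for readability and is not a built-in Python function.

```python
from statistics import NormalDist

# Standard Normal model: mean 0, standard deviation 1.
Z = NormalDist(mu=0, sigma=1)

def normalcdf(lower, upper, dist=Z):
    """Fraction of the model between lower and upper.

    Hypothetical helper named after the TI-84 command.
    """
    return dist.cdf(upper) - dist.cdf(lower)

exact = normalcdf(1, 2)

# 68-95-99.7 Rule: about 95% lies within 2 SDs of the mean and about 68%
# within 1 SD, so the band from 1 to 2 SDs above the mean holds about
# half of the difference.
rule_estimate = (0.95 - 0.68) / 2

print(round(exact, 4))          # about 0.1359, matching the TI-84
print(round(rule_estimate, 3))  # the rule's rough estimate
```

The two numbers should agree to about two decimal places, which is all the 68-95-99.7 Rule promises.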
If instead we were interested in the percentage with a z-score between -1 and 2:
we would enter normalcdf(-1,2) ≈ 0.8186, or about 81.9%:
Be careful to use the negative sign (-) on the TI-84:
rather than the blue subtraction key when entering negative numbers.
Suppose now we wanted to compute the percentage with a z-score above 1, in other words between 1 and ∞ (infinity):
There is no ∞ button on the TI-84, so we need to use an extremely large number in place of ∞. For all practical purposes, the largest number the TI-84 can handle is 1×10^99. (This is written in scientific notation, otherwise it would be a 1 with 99 zeros after it!) The calculator notation for this number is 1E99 but to enter it into the calculator you actually have to type 1EE99. Notice that EE is found above the , (comma) key:
so you would press 1 2ND , 99 to enter 1E99. Getting back to the problem at hand, our answer is now given by normalcdf(1,1E99) ≈ 0.1587:
Check that you get approximately the same answer with the 68-95-99.7 Rule.
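As a cross-check away from the calculator, the same upper-tail percentage can be sketched in Python (assuming Python 3.8+ and its standard statistics.NormalDist class). Unlike the TI-84, Python has a true infinity, but the 1E99 stand-in gives the same result.

```python
import math
from statistics import NormalDist

Z = NormalDist()  # standard Normal model: mean 0, SD 1

# Percentage with a z-score above 1, i.e. between 1 and infinity.
upper_tail = Z.cdf(math.inf) - Z.cdf(1)

# The TI-84 workaround, using 1E99 in place of infinity.
with_1e99 = Z.cdf(1e99) - Z.cdf(1)

# 68-95-99.7 Rule: about 68% lies within 1 SD of the mean, so the
# remaining 32% splits evenly between the two tails.
rule_estimate = (1 - 0.68) / 2

print(round(upper_tail, 4))   # about 0.1587
print(round(with_1e99, 4))    # same value
print(round(rule_estimate, 2))
```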
Similarly, to find the percentage with a z-score below 1.25:
we would get normalcdf(-1E99,1.25) ≈ 0.8944, or about 89%:
If you find yourself using normalcdf over and over again in succession, you can save some time by pressing 2ND ENTER after you get your first answer; the previous expression you typed into the calculator will reappear and you can use the arrow keys and the DEL and INS buttons to modify the expression slightly. Then just press ENTER to evaluate the modified expression.
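The lower-tail and two-sided examples from this section can be checked the same way. This is only a sketch, assuming Python 3.8+ with the standard statistics.NormalDist class; the variable names are made up for illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # standard Normal model: mean 0, SD 1

# Percentage with a z-score below 1.25: the cdf already measures the
# area to the left, so no "negative infinity" stand-in is needed.
below_1_25 = Z.cdf(1.25)

# Percentage with a z-score between -1 and 2.
between_neg1_and_2 = Z.cdf(2) - Z.cdf(-1)

print(round(below_1_25, 4))          # about 0.8944
print(round(between_neg1_and_2, 4))  # about 0.8186
```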
You should now be in good shape if all of the questions are phrased in terms of z-scores, but they rarely are. Let's take a look at a more realistic example.
IQ scores
IQ tests are often designed to have a mean of μ = 100 and a standard deviation of σ = 15, and to follow the Normal model. Let's use the model N(100,15) to model IQ scores of U.S. adults. What percentage of all U.S. adults have an IQ score between 115 and 125?
First, we convert the IQ scores to z-scores: if IQ = 115, then
`z = frac{115-100}{15} = frac{15}{15} = 1`
If IQ = 125, then
`z = frac{125-100}{15} = frac{25}{15} approx 1.67`
Note: We usually round z-scores to two decimal places, although if you can use a more precise number from a previous calculator computation, by all means do so.
So we've reduced the problem from asking, "What percentage of IQ scores are between 115 and 125?" to asking, "What percentage of z-scores are between 1.00 and 1.67?" The answer: normalcdf(1,1.67) ≈ 0.1112, or about 11%. We could also enter normalcdf(1,5/3) or normalcdf((115-100)/15,(125-100)/15) into the TI-84, both of which yield about 0.1109, essentially the same answer:
Notice what we did in this last example: we converted the given data values to z-scores, then used the calculator to find the desired percentage. You should practice this, as the technique will be necessary in other guises throughout the quarter. However, there is a shortcut available to us: if we can compute the z-scores, the calculator should be able to do so as well, given μ and σ. In fact, it can: if we enter normalcdf(115,125,100,15) we get about 0.1109, or the same answer as before:
Notice that we entered the original left and right data values, then the mean and standard deviation associated with our model.
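To see that the z-score route and the shortcut really are the same computation, here is a small Python sketch (assuming Python 3.8+ and its standard statistics.NormalDist class; the variable names are illustrative only). It also shows how much rounding the z-score to 1.67 changes the answer.

```python
from statistics import NormalDist

Z = NormalDist()                    # standard Normal model
iq = NormalDist(mu=100, sigma=15)   # the N(100, 15) model for IQ scores

# Route 1: convert the data values to z-scores, then use the standard model.
z_low = (115 - 100) / 15
z_high = (125 - 100) / 15
via_z = Z.cdf(z_high) - Z.cdf(z_low)

# Route 2: give the mean and SD to the model and use the raw data values,
# like normalcdf(115,125,100,15) on the TI-84.
direct = iq.cdf(125) - iq.cdf(115)

# Route 1 again, but with the z-score rounded to two decimal places.
rounded = Z.cdf(1.67) - Z.cdf(1.00)

print(round(via_z, 4))    # about 0.1109
print(round(direct, 4))   # about 0.1109
print(round(rounded, 4))  # about 0.1112
```

The first two answers agree exactly (up to floating-point error); the third differs only in the fourth decimal place, which is why rounding z-scores to two places is usually harmless.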
Working backward
Sometimes we wish to solve this sort of problem in reverse. Suppose instead of asking something like, "What percentage of people/things have a z-score less than 1.36?" we ask a question like, "What z-score cuts off the lowest 10% of data values from the highest 90%?" Here we start out knowing the percentage (or area under the curve) and what we want to find is the corresponding z-score that separates these two regions under the curve:
We do this on the calculator by entering invNorm(0.10); the invNorm command is found right below normalcdf in the DISTR menu. You should get an answer of about -1.28:
which tells us that the data values that are more than 1.28 standard deviations below the mean account for the lowest 10% of all scores in the population for which a Normal model applies. The number we use with the invNorm command is always the area to the LEFT of the desired z-score.
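For comparison, a rough Python equivalent of invNorm is the inv_cdf method of statistics.NormalDist (assuming Python 3.8+). Like invNorm, it expects the area to the left of the cutoff, so a "top 10%" question has to be rephrased as "bottom 90%" first.

```python
from statistics import NormalDist

Z = NormalDist()  # standard Normal model: mean 0, SD 1

# z-score that cuts off the lowest 10% from the highest 90%.
z_bottom_10 = Z.inv_cdf(0.10)

# z-score that cuts off the highest 10%: the area to its LEFT is 0.90.
z_top_10 = Z.inv_cdf(1 - 0.10)

# Round trip: the area to the left of the cutoff should be 0.10 again.
area_check = Z.cdf(z_bottom_10)

print(round(z_bottom_10, 2))  # about -1.28
print(round(z_top_10, 2))     # about 1.28
print(round(area_check, 4))
```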
Now let's return to our IQ example from above. Suppose we want to determine which people have IQ scores in the top 15% of the population; in other words we need to know the IQ score that cuts off those in the top 15% from those in the bottom 85%. We compute invNorm(0.85) ≈ 1.04:
so those people with a z-score above z = 1.04 will be in the top 15%. However, we still need to convert this z-score back to an IQ score. Using our z-score formula, we have:
`1.04 = frac{y-100}{15} Rightarrow (1.04)(15) = y - 100 Rightarrow y = 100 + (1.04)(15) = 115.6`
Thus anyone with an IQ of 116 or above should be in the top 15% of the population:
If we wanted, we could accomplish this more directly by entering invNorm(0.85,100,15) into the TI-84:
Check that you get roughly the same answer as above. Notice that we type the percentage to the left of the desired data value, then the mean and then the standard deviation.
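Both routes to the top-15% cutoff can be sketched in Python as a cross-check (assuming Python 3.8+ and its standard statistics.NormalDist class; the variable names are illustrative).

```python
from statistics import NormalDist

Z = NormalDist()                    # standard Normal model
iq = NormalDist(mu=100, sigma=15)   # the N(100, 15) model for IQ scores

# Top 15% means the area to the LEFT of the cutoff is 0.85.
z_cut = Z.inv_cdf(0.85)

# Route 1: convert the z-score back to an IQ score with y = mean + z * SD.
iq_from_z = 100 + z_cut * 15

# Route 2: let the model do the conversion, like invNorm(0.85,100,15).
iq_direct = iq.inv_cdf(0.85)

print(round(z_cut, 2))      # about 1.04
print(round(iq_from_z, 1))  # about 115.5
print(round(iq_direct, 1))  # about 115.5
```

Either way, an IQ of 116 or above lands in the top 15%.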
Normal Percentile Plots
Finally, to construct a Normal percentile plot for a data set, follow the instructions for creating a histogram on the TI-84 (in the Chapter 4 Resources), but for the Type icon, select the last one (the one farthest to the right on the second row of icons):
To access it, move the cursor to the histogram icon, then keep pressing the right arrow key. Specify the list with the data you wish to plot and select any type of Mark you like.
Now use ZoomStat to get the Normal percentile plot. Using the 2007 assessed home values (from the houses.txt data set we used in the Chapter 4 and 5 Resources) we get a plot like this:
The plot isn't perfectly straight, and it bends a bit on the far ends, but we might consider it nearly straight, so the Nearly Normal Condition appears to be satisfied. (This shouldn't be too surprising, since in our previous analysis this data appeared to be unimodal and only slightly skewed.)
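To see what such a plot is made of, here is a sketch in Python that computes the points rather than drawing them. Everything in it is an assumption for illustration: the data values are hypothetical (they are not the houses.txt values), and the (i - 0.5)/n formula for the expected z-scores is one common convention that may differ slightly from what the TI-84 or Data Desk uses internally.

```python
from statistics import NormalDist, mean

# Hypothetical data values (NOT the houses.txt data).
values = [212, 185, 240, 198, 305, 176, 221, 260, 190, 233]

def normal_scores(n):
    """z-scores we would expect for n ordered values from a Normal model,
    using the (i - 0.5)/n convention (an assumption)."""
    Z = NormalDist()
    return [Z.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

def correlation(xs, ys):
    """Ordinary correlation coefficient, written out by hand."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

ordered = sorted(values)
scores = normal_scores(len(ordered))

# Each (z-score, data value) pair is one point of the plot.
for z, y in zip(scores, ordered):
    print(f"{z:6.2f}  {y}")

# A correlation close to 1 means the points lie close to a straight line.
r = correlation(scores, ordered)
print(round(r, 3))
```

A high correlation is only a rough numerical stand-in for "nearly straight"; you should still look at the plot itself, since a single bend or outlier is easier to spot by eye.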
Data Desk
Data Desk does not offer easy tools to accomplish the same tasks that normalcdf and invNorm do on the TI-84, but these calculations are usually most easily done on the calculator anyway. To construct a Normal percentile plot, click on the variable in question to designate it as Y, then click on Plot and Normal Prob Plot:
Here is the resulting plot for the assessed value variable in Data Desk:
and here is a Normal percentile plot for the taxes variable from the same data set:
Notice how the three outliers (which we saw in the boxplot we created in the Chapter 5 Resources) show up dramatically in this plot.
If you've already constructed a histogram of the data, you can then click on the variable name in the histogram and select the appropriate option from the pop-up menu:
Homework
Work Exercises 1-11 odd, 15-23 odd, 31, 39-47 odd, 51 and 53. All of these problems are important, but pay particular attention to problems 51–54, as this sort of problem will definitely show up on an exam.
Errata
Although it is not mentioned in the text, the long jump and shot put data set introduced on page 121 is included on the DVD (look for the file Ch06_Heptathlon 2004.txt).
On page 122, the seventh line of the shot put stem-and-leaf display should read 13|012234 (not 13|0122v4).
Although it is not mentioned in the text, the skiing data set introduced on page 123 is included on the DVD (look for the file Ch06_Mens Combined 2006.txt).
On page 123, the second equation in the For Example should read `z_{text(Downhill)} = (101.42-101.807)/(1.8356) approx -0.21` (rather than 87.93).
Although it is not mentioned in the text, the NHANES data set introduced on page 125 is included on the DVD (look for the file Ch06_NHANES.txt).
On page 135, the first line of the second paragraph should read "on the DVD" (not "on the CD-ROM"); likewise, the footnote should read "open the DVD" (not "CD").
The answer in the back of the book for part b of Exercise 19 should read "z = 1.81" (not 21.81).
Part a of Exercise 39 should read "z > 1.5" (not "z > 15") in order to get the answer in the back of the book.
ActivStats
Work the activities on pages 6-1 through 6-4 in the ActivStats lesson book, as time permits, with the exception of "The Normal Table" on page 6-4. Don't worry about the terms "density" and "probability"; we'll discuss those terms when (and if) they become necessary.
Additional Resources
- Normal Distributions
- Episode 4 from Against All Odds features a discussion of the Normal model.
- Normal Calculations
- Episode 5 from Against All Odds shows how to do Normal model computations using a table (you can ignore this) and discusses the Normal percentile plot.
- Sofia: Elementary Statistics
- Lesson 6 of the Sofia Open Content Initiative's Elementary Statistics course includes a discussion of the Normal model (ignore the term "probability" and insert "percentage" in its place and you should be fine).
- Normal model applets
- A pair of Java applets that allow you to do the same sorts of calculations as normalcdf and invNorm on the TI-84, but with the advantage of providing appropriately shaded pictures of the Normal model so that you don't have to draw them yourself.
- Seeing Statistics
- Another applet that performs the same computations as normalcdf on the TI-84. You may need to maximize the window that pops up when you first access the page.
- Normal Calculations on the TI-84
- Instructions for performing Normal model computations on the TI-83 and TI-84.