Review Problems: Week 2

1. Political polling The polling firm Research 2000 interviewed 2,400 adults during the period September 14–17, 2009, on behalf of the Web site Daily Kos. The firm made a cross-section of calls into each state in the country in order to reflect the adult population nationally. Poll respondents were asked many different questions relating to politics in the United States. Answers to two of these questions ("Do you have a favorable or unfavorable opinion of President Barack Obama?" and "What region of the country do you live in?") are summarized in the following table:

          Favorable  Unfavorable  No Opinion
Northeast       413           51          40
South           194          482          43
Midwest         403          201          46
West            311          179          37

a) How many cases are included in this data set?

b) How many variables are included in this data set?

c) Which of these might be considered an explanatory variable?

d) Which of these might be considered a response variable?

e) What percentage of respondents have a favorable opinion of Pres. Obama?

f) What percentage of respondents are southerners?

g) What percentage of southerners have a favorable opinion of Pres. Obama?

h) What percentage of respondents with a favorable opinion of Pres. Obama are southerners?

i) What percentage of respondents are southerners with a favorable opinion of Pres. Obama?

j) Create an appropriate graphical display of this data.

k) Does opinion about the president appear to be independent of region? Use a complete
sentence or two to explain your answer.
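
The percentages in parts e) through i) differ only in which total goes in the denominator. The following is a minimal sketch for checking hand calculations, not an official solution. The counts are copied from the table above; the names `counts`, `pct`, and the result variables are illustrative choices.

```python
# Counts copied from the Problem 1 table (rows: region, columns: opinion).
counts = {
    "Northeast": {"Favorable": 413, "Unfavorable": 51, "No Opinion": 40},
    "South": {"Favorable": 194, "Unfavorable": 482, "No Opinion": 43},
    "Midwest": {"Favorable": 403, "Unfavorable": 201, "No Opinion": 46},
    "West": {"Favorable": 311, "Unfavorable": 179, "No Opinion": 37},
}


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Row totals (by region), column totals (by opinion), and the grand total.
row_totals = {region: sum(row.values()) for region, row in counts.items()}
col_totals = {}
for row in counts.values():
    for opinion, n in row.items():
        col_totals[opinion] = col_totals.get(opinion, 0) + n
total = sum(row_totals.values())

south_fav = counts["South"]["Favorable"]

marginal_fav = pct(col_totals["Favorable"], total)          # part e: out of everyone
marginal_south = pct(row_totals["South"], total)            # part f: out of everyone
fav_given_south = pct(south_fav, row_totals["South"])       # part g: out of southerners
south_given_fav = pct(south_fav, col_totals["Favorable"])   # part h: out of favorables
joint_south_fav = pct(south_fav, total)                     # part i: out of everyone

for name, value in [
    ("favorable (marginal)", marginal_fav),
    ("southerners (marginal)", marginal_south),
    ("favorable given South", fav_given_south),
    ("South given favorable", south_given_fav),
    ("South and favorable (joint)", joint_south_fav),
]:
    print(f"{name}: {value}%")
```

Parts g), h) and i) all use the same cell count of 194; only the denominator changes, which is why the three answers differ.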

2. Real Estate Craigslist is a Web site that allows users to post online classified advertisements at no charge (with the exception of job postings in certain metropolitan areas). The table below shows the street address, size (in square feet), asking price (in dollars) and number of bedrooms (abbreviated BR) for nine houses located in the city of Lynnwood listed for sale on Craigslist on October 9, 2011.

address            size  price   BR
18712 57th Ave W   1805  349950  3
19014 24th Ave W   2404  329900  5
3203A 204th St SW  1912  250000  4
17112 6th Ave W    3200  509797  4
19631 9th Pl W     2369  339950  4
17402 62nd Ave W   1200  185000  3
21011 54th Ave W   1660  136950  3
14018 20th Pl W    2244  254950  4
14517 40 Ave W     2450  335000  3

a) What relationship would you expect to see between the size of a house and its price?

b) Which of these might be considered an explanatory variable?

c) Which of these might be considered a response variable?

d) Create an appropriate graphical display to investigate.

e) Describe the relationship you see in the display.

f) Are there any outliers? If so, identify them.

g) Is your answer to part e consistent with what you answered in part a?

3. Exercise A statistics student wondered whether males spend more time exercising on a treadmill at the gym than females. She went to several different gyms on a single day in order to observe 20 males (denoted "m" below) and 20 females ("f") using a treadmill and recorded the time (in minutes) on the display when they finished running.

52 f   35 f   56 m   25 f
22 m   15 f   41 m   10 f
29 m   38 m    5 m   25 f
55 f   21 m   13 m   48 m
20 f   33 m   48 f   31 m
18 f   10 m   52 m   10 f
 5 f   63 m   60 m   60 f
33 m    5 f   40 m   25 f
33 f   26 m   30 m    5 f
30 m   40 f   14 f   35 f

a) How many variables are included in this data set?

b) Which of these might be considered an explanatory variable?

c) Which of these might be considered a response variable?

d) Create a back-to-back stem-and-leaf display to compare male and female running times.

e) Describe any interesting similarities or differences between the distributions of running times for males and females.

f) Create another appropriate graphical display to compare male and female running times.

g) Is there anything apparent in your display from part f that was not apparent in the back-to-back stem-and-leaf display?

h) Was there anything apparent in the back-to-back stem-and-leaf display that was not apparent in your display from part f?
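
A back-to-back stem-and-leaf display (part d) can be checked with a short script. This is a sketch under a few assumptions: stems are tens digits, leaves are ones digits, and leaves on the left side are ordered so the smallest sits next to the stem. The function name `back_to_back` and the list names are illustrative; the times are transcribed from the table above.

```python
# Treadmill times (minutes) transcribed from the Problem 3 table.
males = [56, 22, 41, 29, 38, 5, 21, 13, 48, 33,
         31, 10, 52, 63, 60, 33, 40, 26, 30, 30]
females = [52, 35, 25, 15, 10, 25, 55, 20, 48, 18,
           10, 5, 60, 5, 25, 33, 5, 40, 14, 35]


def back_to_back(left, right, left_label, right_label, width=12):
    """Build the lines of a back-to-back stem-and-leaf display.

    Stems are tens digits and leaves are ones digits.
    """
    everything = left + right
    stems = range(min(everything) // 10, max(everything) // 10 + 1)
    lines = [f"{left_label:>{width}} |   | {right_label}"]
    for stem in stems:
        # Left-side leaves are written in descending order so that,
        # read outward from the stem, they increase.
        left_leaves = "".join(
            str(x % 10) for x in sorted(left, reverse=True) if x // 10 == stem
        )
        right_leaves = "".join(
            str(x % 10) for x in sorted(right) if x // 10 == stem
        )
        lines.append(f"{left_leaves:>{width}} | {stem} | {right_leaves}")
    return lines


display = back_to_back(males, females, "males", "females")
print("\n".join(display))
```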

4. Toyota Prius The Prius is a popular gas-electric hybrid car manufactured by Toyota. The table below shows the VIN (vehicle identification number), color, age (in years), mileage (in miles) and asking price (in dollars) for 14 used Toyota Prius automobiles advertised for sale on the Web site of The Seattle Times on January 23, 2011.

VIN               age  color  mileage  price
JTDKN3DU9A0056349   1  black     9277  28995
JTDKN3DU8A0165157   1  black     4180  28995
JTDKN3DU1A0057303   1  blue     32105  25995
JTDKN3DU9A0147198   1  gray      8129  24995
JTDKB20U197821193   2  pewter   28434  23995
JTDKB20U683348798   3  green    40762  22995
JTDKB20U187716331   3  green    24531  22995
JTDKB20U387727363   3  blue     16262  22995
JTDKN3DU8A0059050   1  silver   32830  21995
JTDKB20U383347267   3  gray     32604  21995
JTDKB20U583417996   3  gray     43827  18995
JTDKB20U697840628   2  white    24632  20995
JTDKB20U297880205   2  white    33651  18680

a) Construct an appropriate graphical display to investigate a possible association between mileage and price for these cars.

b) Describe the association (or lack of association) between mileage and price.

c) Which of these would be most reasonably considered an explanatory variable?

d) Which of these would be most reasonably considered a response variable?

e) Construct an appropriate graphical display to investigate a possible association between price and age for these cars.

f) Describe the association (or lack of association) between price and age.

g) Which of these would be most reasonably considered an explanatory variable?

h) Which of these would be most reasonably considered a response variable?

i) Construct an appropriate graphical display to investigate a possible association between mileage and age for these cars.

j) Describe the association (or lack of association) between mileage and age.

k) Which of these would be most reasonably considered an explanatory variable?

l) Which of these would be most reasonably considered a response variable?

m) In addition to the 14 automobiles in the data set provided above, The Seattle Times also listed two significantly older Toyota Prius automobiles:

VIN               age  color  mileage  price
JT2BK12U630070267   8  green    83996  10995
JT2BK18U720060613   9  blue    110919   8995

If we included these two cars in the graphical display for mileage vs. price, what term(s) would best describe these cars?

5. Caucuses Between November 30 and December 5, 2011, a New York Times/CBS News poll contacted 1,869 registered voters across the state of Iowa; 642 respondents said they would definitely or probably attend the Republican caucuses in January 2012. The survey asked each of the 642 likely Iowa caucus-goers who they planned to support for the Republican presidential nomination. The following table summarizes the responses of these 642 Iowans, along with results from a similar poll conducted in South Carolina by Winthrop University, which contacted 332 Republicans who planned to participate in that state's presidential primary election the following month.

                  IA   SC
Newt Gingrich    199  127
Mitt Romney      111   71
Ron Paul         103   14
other/undecided  229  120

a) What's different about the way this data was collected compared to the data from the survey in Exercise #1 above?

b) Construct an appropriate graphical display for this data.

c) What percentage of Iowa voters surveyed supported Newt Gingrich?

d) What percentage of voters surveyed from both states supported Newt Gingrich?

e) Which state had the higher percentage of undecided voters? Explain.

f) Was support for these Republican presidential candidates similar in the two states or significantly different? Explain.

6. Fastballs Major League Baseball now records information about every pitch thrown in every game of every season. Statistician Jim Albert compiled data about every pitch thrown by 20 starting pitchers during the 2009 MLB season. The data set included the type of pitch thrown (curveball, changeup, slider, etc.) as well as the speed of the ball as it left the pitcher's hand. A histogram of speeds for all 30,740 four-seam fastballs thrown by these pitchers during the 2009 season is shown below, along with a Normal probability plot. The mean speed was 92.12 mph, with a standard deviation of 2.43 mph.

a) Looking at the histogram, do you think a Normal model applies to this data?

b) What does the Normal probability plot tell you about the appropriateness of a Normal model for this data?

c) Assuming that a Normal model does apply, compute:

i) the z-score of a four-seam fastball travelling 98 mph.

ii) the z-score of a four-seam fastball travelling 88 mph.

iii) the percentage of four-seam fastballs travelling between 90 and 95 mph.

iv) the percentage of four-seam fastballs travelling in excess of 93 mph.

v) the percentage of four-seam fastballs travelling slower than 89 mph.

vi) the speed that separates the fastest 10% of these fastballs from the slowest 90%.
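
The calculations in part c) can be checked with Python's standard-library `statistics.NormalDist`. This is a sketch that takes the stated mean (92.12 mph) and standard deviation (2.43 mph) at face value and assumes the Normal model applies, as part c) instructs; the variable names are illustrative.

```python
from statistics import NormalDist

# Normal model built from the summary statistics quoted in Problem 6.
fastball = NormalDist(mu=92.12, sigma=2.43)


def z_score(speed, model=fastball):
    """Number of standard deviations that speed lies from the mean."""
    return (speed - model.mean) / model.stdev


z_98 = z_score(98)                                    # part i
z_88 = z_score(88)                                    # part ii
share_90_to_95 = fastball.cdf(95) - fastball.cdf(90)  # part iii
share_above_93 = 1 - fastball.cdf(93)                 # part iv
share_below_89 = fastball.cdf(89)                     # part v
cutoff_top_10 = fastball.inv_cdf(0.90)                # part vi: 90th percentile

print(f"z(98 mph) = {z_98:.2f}")
print(f"z(88 mph) = {z_88:.2f}")
print(f"between 90 and 95 mph: {share_90_to_95:.1%}")
print(f"above 93 mph: {share_above_93:.1%}")
print(f"below 89 mph: {share_below_89:.1%}")
print(f"fastest 10% begin at about {cutoff_top_10:.2f} mph")
```

Answers found with a printed z-table may differ slightly in the last digit because the table rounds z to two decimal places.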

7. Exit polls To gather information about voting patterns in the 2008 U.S. presidential election, a group of newspapers and television networks hired Edison Media Research to conduct exit polls by surveying voters at polling locations nationwide on Election Day (November 4, 2008) in addition to conducting telephone interviews with voters who cast their ballot via absentee ballots or early voting. Among those surveyed were 1,241 voters from Washington State. The survey results included the candidate for whom each voter cast his or her ballot (Barack Obama, John McCain or someone else) and the voter's gender (male or female). The results of these two questions for the Washington voters appear in the table below.

       Obama  McCain other
male     340     238    18
female   374     250    21

a) How many cases are included in this data set?

b) How many variables are included in this data set?

c) Which of these might be considered an explanatory variable?

d) Which of these might be considered a response variable?

e) What percentage of respondents voted for Obama?

f) What percentage of respondents are female?

g) What percentage of females voted for Obama?

h) What percentage of Obama voters were female?

i) What percentage of people surveyed were female Obama voters?

j) Create an appropriate graphical display of this data.

k) Does there appear to have been a gender gap in Washington state during the 2008 presidential election? In other words, does gender appear to be independent of candidate? Use a complete sentence or two to explain your answer.

8. More Priuses On October 17, 2007, the classified ads on the Web site of The Seattle Times listed the following 13 used Toyota Prius automobiles for sale; the data set below shows the year, color, mileage (in miles) and asking price (in U.S. dollars) for each car.

year  color  mileage  price
2006  green    17043  25995
2007  gray     12628  24980
2005  maroon   24039  24885
2005  silver   48226  23995
2006  black    10522  22995
2004  silver   66345  21995
2007  white     5611  21995
2005  gold     24479  21595
2004  white    14618  20995
2005  silver   53699  20980
2001  unknown 171700   8300
2004  silver   47649  17995
2003  white    39600  17500
2005  black   103126  16995

a) Construct an appropriate graphical display to investigate a possible relationship between mileage and price.

b) In your graph, what was the explanatory variable?

c) In your graph, what was the response variable?

d) Describe any association you see and note any unusual features.

9. Washington Basic Health Plan A study (cited in "The Importance of the Normality Assumption in Large Public Health Data Sets" by Thomas Lumley, Paula Diehr, Scott Emerson and Lu Chen of the University of Washington Biostatistics Department, published in Annu. Rev. Public Health 2002, 23:151–69) that gathered data about 6,918 subjects enrolled in Washington's Basic Health Plan found a mean annualized medical cost of $389 with a standard deviation of $895.

a) Based only on the information provided above, does it appear that a Normal model applies to this data set? Explain.

b) Assuming that a Normal model does apply to the data, estimate the percentage of plan participants with annualized medical costs above $400.

A graphical display of the actual data for these 6,918 subjects appears below:

c) What type of graphical display is this?

d) Use the graph to estimate the percentage of plan participants with annualized medical costs above $400.

e) Do your answers to b) and d) agree? Explain any discrepancy.

f) Based on the graph, does it appear that a Normal model applies to the data?
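
For parts a) and b), it can help to ask what a Normal model with the stated mean and standard deviation would imply. The sketch below uses the standard-library `statistics.NormalDist` with the two summary statistics from the problem; the variable names are illustrative, and the output describes only the model, not the actual data.

```python
from statistics import NormalDist

# Summary statistics quoted in Problem 9 (annualized medical cost, dollars).
cost_model = NormalDist(mu=389, sigma=895)

# Share of participants the model would place below $0, an impossible cost.
share_below_zero = cost_model.cdf(0)

# Share the model would place above $400 (part b).
share_above_400 = 1 - cost_model.cdf(400)

print(f"Normal model share below $0:   {share_below_zero:.1%}")
print(f"Normal model share above $400: {share_above_400:.1%}")
```

A large share of impossible negative costs under the model is one signal to weigh when answering part a).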

10. Side effects Spiriva is a drug designed to help people with diseases such as chronic bronchitis and emphysema. Clinical trials of Spiriva were conducted to determine possible side effects. Of 550 people randomly assigned to use Spiriva for one year, 226 developed an upper respiratory tract infection (URI). Of 371 people randomly assigned to use a placebo for one year, 137 developed a URI.

a) Based on the information provided above, how many variables are included in this data set?

b) Which of the variables might best be considered an explanatory variable?

c) Which of the variables might best be considered a response variable?

d) Create an appropriate graphical display of this data.

e) Based on the graph you created, does it appear that people who take Spiriva are more likely to develop a URI than people who take a placebo?

11. Inkjet printers For their May 2005 issue, the editors of Consumer Reports compared the cost and effectiveness of a variety of inkjet printers. The following table lists the model, retail price (in dollars) and the text speed (in pages per minute, or ppm) for the 13 top-ranked models.

model                             price  speed
HP Deskjet 6540                     130   11.0
Canon Pixma iP4000                  140   10.0
HP PhotoSmart 7760                  150    6.0
HP Deskjet 5850                     235    6.0
HP PhotoSmart 7960                  230    6.0
HP PhotoSmart 8450                  245    7.0
Canon Pixma iP5000                  190    9.0
Canon Pixma iP2000                   80   10.0
Canon Pixma iP8500                  345    4.5
HP Deskjet 6127                     250    7.0
Lexmark P915 Photo                  135    9.0
Epson Stylus Photo R800             375    2.5
Lexmark Color Jetprinter Z816        90    9.5

a) How many cases are included in this data set?

b) How many variables are included in this data set?

c) We wish to investigate a possible association between the price and speed of these printers. Which would you consider to be an explanatory variable and which would you consider to be a response variable? Explain.

d) If a classmate gave a different answer to part c, could they also be correct? Explain.

e) Create an appropriate graphical display of this data.

f) Describe any association you notice in the graphical display.

g) Is there anything surprising about this association?

12. Broadband use A survey by the Pew Internet and American Life Project asked a random sample of adults about use of the Internet. One question asked whether the subject had a broadband Internet connection at home. Another question asked about the type of community in which the subject lived (urban, suburban or rural). The results of these two questions are summarized in the following table:

     Urban  Suburban  Rural
Yes    300       521    174
No     276       542    387

a) How many cases are included in this data set?

b) How many variables are included in this data set?

c) Which of these variables would best be considered the explanatory variable?

d) Which of these variables would best be considered the response variable?

e) Create an appropriate graphical display of this data.

f) Describe any association you notice in the graphical display.

g) What can you conclude about a possible association between these variables?

13. Paper airplanes For a class project, a student wanted to determine whether a paper airplane flew farther, on average, with a paperclip attached to it. She used 30 similar pieces of paper, and folded each of them in the same manner, then attached a paperclip to the nose of 15 of them. On March 14, 2009, she flew each plane once, for a total of 30 flights. The following table shows the distance for each flight (in inches):

clip   no clip
 134       109
 192       112
 122       128
 194        81
 122        83
 228       139
 180       133
 155       150
 194       154
 210       135
 194       125
 196        81
 185       135
 200       145
 183       144

a) How many cases are included in this data set?

b) How many variables are included in this data set?

c) Which of these variables would best be considered the explanatory variable?

d) Which of these variables would best be considered the response variable?

e) Create an appropriate graphical display of this data.

f) Describe any association you notice in the graphical display.

g) What can you conclude about a possible association between these variables?

14. Flu study In a study conducted during the 2007–2008 flu season ("Comparative Efficacy of Inactivated and Live Attenuated Influenza Vaccines," Arnold S. Monto, et al., New England Journal of Medicine, September 24, 2009, 361:13, 1260–1267), 1,952 healthy adults were randomly assigned to receive a flu shot, FluMist (a nasal vaccine) or a placebo. Of the 813 volunteers receiving the flu shot, 28 later developed a confirmed case of influenza; in the FluMist group, 56 of the 814 volunteers developed the flu; and among the 325 patients who received a placebo, 35 developed the flu.

a) How many cases are included in this data set?

b) How many variables are included in this data set?

c) Which of these variables would best be considered the explanatory variable?

d) Which of these variables would best be considered the response variable?

e) Create an appropriate graphical display of this data.

f) Describe any association you notice in the graphical display.

g) What can you conclude about a possible association between these variables?
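
Because the three treatment groups differ in size (813, 814 and 325), raw counts of flu cases are not directly comparable; rates are. The sketch below computes the flu rate in each group from the counts given in the problem. The dictionary and variable names are illustrative.

```python
# (flu cases, group size) for each treatment, as stated in Problem 14.
groups = {
    "flu shot": (28, 813),
    "FluMist": (56, 814),
    "placebo": (35, 325),
}

# Every volunteer is one case, so the group sizes should add up to the total.
total_volunteers = sum(size for _, size in groups.values())

# Percentage of each group that developed a confirmed case of the flu.
flu_rates = {
    name: round(100 * cases / size, 1)
    for name, (cases, size) in groups.items()
}

print(f"total volunteers: {total_volunteers}")
for name, rate in flu_rates.items():
    print(f"{name}: {rate}% developed the flu")
```

These rates are the natural heights for the bars or segments in a display for part e).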

15. Another case of the flu Determining the actual proportion of all adults who contract the flu during a given flu season is quite difficult, but a Harris Interactive poll suggested that about 20% of all adults in the U.S. contracted the flu during the 2007–2008 flu season. We wish to investigate whether a healthy adult receiving a flu vaccine is less likely to contract the flu than a member of the general population. To do this, use the results of the study mentioned in the previous problem and assume that exactly 20% of all adults contracted the flu during the 2007–2008 flu season.

a) Create a graphical display comparing the results of the study to the assumption about the general population.

b) Is there evidence that people who get a flu vaccine are less likely to contract the flu than a member of the general population?

c) What about people who receive FluMist?

d) What about people who receive a placebo?

16. Left-hand turns On December 2, 2009, a statistics student observed traffic near the corner of 148th and Manor Way in Lynnwood, to investigate whether there is a difference in the duration of time taken to make a left-hand turn between male and female drivers. For 30 drivers (15 male and 15 female), she recorded the gender of each driver and used a stopwatch to measure the length of time taken to make a left-hand turn. The data she recorded appears below:

male     4.0  3.4  4.3  3.6  2.7  3.8  4.3  3.5  5.3  4.6  3.9  5.1  5.0  4.1  4.8
female   4.7  5.3  5.4  4.1  4.7  3.8  3.1  4.2  4.1  2.9  4.1  4.2  3.7  4.0  4.0

a) How many cases are included in this data set?

b) How many variables are included in this data set?

c) Which of these variables would best be considered the explanatory variable?

d) Which of these variables would best be considered the response variable?

e) Create an appropriate graphical display of this data.

f) Describe any association you notice in the graphical display.

g) What can you conclude about a possible difference in the duration of time taken to make a left-hand turn between male and female drivers?

17. Nerf guns For a class project, a statistics student tested his theory about the regulators found on Nerf guns: that they slow the muzzle velocity of the darts. He collected three Nerf guns that had 22 barrels among them, each barrel individually regulated. He fired a Nerf dart once using each barrel and measured how many inches it traveled, using the same dart on all tests. He then removed the regulators and fired one shot with each of the barrels again. The data he recorded appears below:

barrel  regulator  no regulator
     1        231           252
     2        208           245
     3        202           251
     4        212           265
     5        193           210
     6        201           234
     7        125           155
     8        168           141
     9         38            74
    10        154           231
    11        122           103
    12         77           123
    13        243           262
    14        215           252
    15        239           221
    16        234           268
    17        232           245
    18        237           252
    19        230           254
    20        245           259
    21        218           249
    22        246           262

a) How many variables are included in this data set? (Specify the variables and classify each.)

b) How many cases are included in this data set?

c) Which type of display would be more useful to help investigate the student's theory: a back-to-back stem-and-leaf display comparing the regulator distances with the no regulator distances, or a single stem-and-leaf display of the differences between the two distances for each barrel? Explain.

d) Create that graphical display.

e) Describe what you see in the display and explain what it tells you about the student's theory.
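
Each barrel was fired once with and once without its regulator, so the two columns are paired by barrel. The sketch below computes the per-barrel differences that a single stem-and-leaf display in part c) would show. Subtracting in the order "no regulator minus regulator" is an assumption made here for illustration, as are the variable names; the distances are transcribed from the table above.

```python
from statistics import mean

# Distances (inches) for barrels 1 through 22, transcribed from Problem 17.
regulator = [231, 208, 202, 212, 193, 201, 125, 168, 38, 154, 122,
             77, 243, 215, 239, 234, 232, 237, 230, 245, 218, 246]
no_regulator = [252, 245, 251, 265, 210, 234, 155, 141, 74, 231, 103,
                123, 262, 252, 221, 268, 245, 252, 254, 259, 249, 262]

# Paired difference for each barrel: positive means the dart traveled
# farther after the regulator was removed.
differences = [after - before for before, after in zip(regulator, no_regulator)]

farther_without = sum(1 for d in differences if d > 0)
mean_difference = mean(differences)

print("differences by barrel:", differences)
print(f"barrels that shot farther without a regulator: "
      f"{farther_without} of {len(differences)}")
print(f"mean difference: {mean_difference:.1f} inches")
```

Working with differences removes the barrel-to-barrel variation, which a back-to-back display of the two columns would leave mixed in.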

18. Nerf II Refer to the data set from the previous problem.

a) Create a scatterplot to investigate a possible association between the regulator distances and the no regulator distances.

b) Describe any association you see in the scatterplot.

c) What does this tell you about the Nerf gun regulators?

d) Does this plot help confirm or refute the student's theory? Explain.

19. Almond vs. dark chocolate A statistics student wondered whether the color distribution for Almond M&M's candies is the same as for Dark Chocolate Peanut M&M's. She collected several packages of each type and recorded the color of each candy, as summarized in the following table:

           red orange yellow blue green brown
Almond      95     95     91  100    87    86
Dark Peanut 83     94     92  109   102    78

a) How many variables are included in this data set?

b) How many cases are included in this data set?

c) Create an appropriate graphical display for this data.

d) Describe any association you notice in the display.

e) Does this help answer the question the student set out to investigate? Explain.

20. Liver disease Between 1974 and 1984, the Mayo Clinic collected information about patients with the liver disease primary biliary cirrhosis (PBC). A study of 216 of these patients with this disease found a mean serum albumin level of 34.46 g/l with a standard deviation of 5.84 g/l. A graphical display of these levels appears below:

a) Do you think it is appropriate to use a Normal model for the serum albumin levels of PBC patients? Explain your answer using a complete sentence or two.

Now assume that a Normal model is appropriate, regardless of your answer to part a).

b) What percentage of PBC patients have a serum albumin level between 35 g/l and 42 g/l?

c) What percentage of PBC patients have a serum albumin level below 35 g/l?

d) What percentage of PBC patients have a serum albumin level above 42 g/l?

e) We wish to identify the PBC patients whose serum albumin levels are among the highest 10% of all such patients. What level separates these patients from the other 90%?

21. Diet Coke and Mentos When Mentos candy is dropped into a bottle of Diet Coke, a reaction occurs, resulting in a geyser of Diet Coke shooting several feet into the air. A statistics student wondered if the temperature of the Diet Coke would affect the height of the Diet Coke geyser. She used twelve 2-liter bottles of Diet Coke, 6 at room temperature and the other 6 refrigerated at 40°F. The height (in inches) of the resulting geyser for each of these 12 bottles appears in the table below.

room temp  refrigerated
       68            55
       73            52
       70            55
       73            53
       69            51
       70            53

a) How many cases are included in this data set?

b) How many variables are included in this data set? Specify them and classify the type of each.

c) Create an appropriate graphical display for this data.

d) Does the display help answer the student's question? Explain.

22. Woodway real estate The Northwest Multiple Listing Service (NWMLS) operates a database of homes and property for sale throughout Washington State and provides this information to realtors and real estate Web sites. The table below includes the street address, year built, number of bedrooms (BR), number of bathrooms (BA), size (in square feet) and asking price (in thousands of dollars) for the nine houses located in the city of Woodway listed for sale on NWMLS on October 16, 2012.

address                year BR BA  size price
24323 Timber Lane      1923  4  3  3000   729
22109 Woodway Park Rd  1950  4  2  2430   769
23407 Woodway Park Rd  1921  3  2  4625   995
11402 239th Pl SW      1962  5  3  2868   800
23503 Timber Lane      1973  4  3  4231  1100
23920 115th Pl W       2000  4  5  4577  1025
11312 S Dogwood Lane   1940  5  6  6527  1650
22714 106th Ave W      2003  6  7  7746  1750
24120 114th Ave W      1964  5  3  3468   800

a) How many cases are included in this data set?

b) How many variables are included in this data set?

c) We wish to investigate a possible association between the size and price of these houses. Which would you consider to be an explanatory variable and which would you consider to be a response variable? Explain.

d) Create an appropriate graphical display to investigate a possible association between size and price.

e) Describe any association you notice in the graphical display. Be sure to mention any unusual features.

23. Laundry detergent An article in the November 2011 issue of Consumer Reports compared the price (in cents per load) and performance (on a scale from 0 to 100) of 34 brands of high-efficiency laundry detergent (used in front-loading machines) and 24 brands of conventional laundry detergent (used in top-loading machines). A graphical display of the price data for all 58 detergents comparing conventional (cnl) to high-efficiency (HE) appears below.

a) What type of display appears above?

b) How many cases are included in this data set?

c) How many variables are included in this data set?

d) Which type of detergent has a higher median price?

e) Which type of detergent has prices with the smaller IQR?

f) Which type of detergent has prices with the largest range?

24. Subaru Outback The Web site cars.com contains listings for used cars from throughout the United States. The table below contains information for the 16 Subaru Outback automobiles listed for sale under $15,000 within 10 miles of Lynnwood, Washington, as of October 29, 2012. This information contains the Vehicle Identification Number (VIN), color, model year, mileage (in miles) and price (in dollars).

VIN                color         year mileage  price
4S4BP67C264323292  red           2006   96756  14995
4S4BP61C976312854  silver        2007   91920  13991
4S4BP61C657387115  white         2005  129118  11991
4S4BP61CX67355012  silver        2006   92355  10995
4S3BH686747631189  white         2004  106000  10988
4S4BP62C257301801  silver        2005  150596   9995
4S3BH675537622627  silver        2003  107779   9991
4S3BH686227664422  black cherry  2002  102603   9400
4S3BH806627663958  white         2002  140937   8995
4S3BH6865Y6672979  white         2000  144088   8787
4S3BE686147200243  green         2004  173993   8398
4S3BH6651Y6659124  green         2000  126872   7999
4S3BH665826621154  red           2002  142082   7998
4S3BH806827621324  white         2002  107627   7991
4S3BH675327658539  burgundy      2002  115234   7990
4S3BH686417636362  black         2001  142432   6900

a) How many cases are included in this data set?

b) How many variables are included in this data set?

c) What type of variable is the VIN?

d) What type of variable is color?

e) Create an appropriate graphical display to investigate a possible association between mileage and price.

f) Which variable did you select as the explanatory variable when creating your graphical display?

g) Which variable did you select as the response variable?

h) Describe any association you see in the display. Be sure to note any unusual features.

i) The Web site actually listed one more Subaru Outback for sale in Lynnwood:

VIN                color        year mileage  price
4S3BH6656Y7635735  red          2000   42712   9995

If you included this car in the graphical display, what term would apply to this car?

25. Unions The Pew Research Center for the People & the Press sponsored a national survey to gather information about public opinion regarding labor unions and businesses. Interviewers at Princeton Data Source, under the direction of Princeton Survey Research Associates International, conducted telephone interviews during the period February 2–7, 2011, among a national sample of 1,385 adults 18 years of age or older living in the continental United States (952 respondents were interviewed on a landline telephone, while 433 were interviewed on a cell phone, including 197 who had no landline telephone). The interviewers called phone numbers taken from a combination of landline and cell phone random-digit-dial samples provided by Survey Sampling International. The survey asked the 1,385 participants about their opinion of labor unions (favorable, unfavorable or no opinion) and their political affiliation (Republican, independent, Democrat or other). The results from those two questions are summarized in the table below.

            Republican  independent  Democrat  other
favorable          102          215       273     29
unfavorable        190          235       123     21
no opinion          48           61        55     33

a) How many cases are included in this data set?

b) How many variables are included in this data set?

c) What percentage of people surveyed were Democrats?

d) What percentage of people surveyed had a favorable opinion of labor unions?

e) What percentage of Democrats surveyed had a favorable opinion of labor unions?

f) What percentage of people with a favorable opinion of labor unions were Democrats?

g) What percentage of people surveyed were Democrats with a favorable opinion of labor unions?

h) Create an appropriate graphical display for this data.

i) Does opinion about labor unions appear to be independent of political affiliation? Explain.

26. Religion and politics To gather information about voting patterns in the 2012 U.S. presidential election, a group of newspapers and television networks hired Edison Media Research to conduct exit polls by surveying voters at polling locations nationwide on Election Day (November 6, 2012) in addition to conducting telephone interviews with voters who cast their ballot via absentee ballots or early voting. The respondents included 1,493 voters from Washington state interviewed by landline or cell phone from October 29 through November 4. The survey results included the candidate for whom each voter cast his or her ballot (Barack Obama, Mitt Romney or someone else) and how often the voter attends religious services (weekly, occasionally or never). The results of these two questions for the Washington voters appear in the table below.

            Obama Romney other
weekly        163    320    10
occasionally  329    232     6
never         333     82    18

a) How many cases are included in this data set?

b) How many variables are included in this data set?

c) What percentage of people surveyed voted for Romney?

d) What percentage of people surveyed attend religious services weekly?

e) What percentage of Romney voters surveyed attend religious services weekly?

f) What percentage of people who attend religious services weekly voted for Romney?

g) What percentage of people surveyed voted for Romney and attend religious services weekly?

h) Create an appropriate graphical display for this data.

i) Does candidate preference appear to be independent of religiosity? Explain.

27. Internal clock On December 3, 2010, a statistics student invited 20 acquaintances to participate in a study for her class project. First, she instructed each individual, "Count to 30 seconds as accurately as you can as I keep track on a stopwatch. Once you feel 30 seconds is up, tell me to stop the timer. You may count aloud or silently to yourself." She then repeated this process with the same 20 individuals, instructing each of them the second time, "Now, do it again and if you stop me at exactly 30 seconds, I will give you $5." The times (to the nearest second) for both trials of these 20 individuals, along with the gender for each person, appear below:

gender        M  F  F  F  F  M  M  F  F  M  M  M  F  M  M  F  M  F  M  F
no incentive  36 10 30 40 34 23 36 30 28 28 33 32 25 21 31 27 18 30 15 38
incentive     29 24 33 32 21 28 21 33 39 28 36 32 31 28 21 20 23 31 24 33

a) Create an appropriate graphical display to examine a possible relationship between the time guessed by these individuals without an incentive and with an incentive.

b) Describe any association you see in your display.

c) Modify your display to include the gender variable as well.

d) Does there appear to be any difference between males and females?
