Chapter 5 Test Bank Comparing Two Proportions - Test Bank + Answers | Statistical Investigations 2e by Nathan Tintle

Chapter 5 Test Bank Comparing Two Proportions

Chapter 5

Introduction to Statistical Investigations Test Bank

Note: TE = Text entry; TE-N = Text entry - Numeric

MC = Multiple choice; TF = True-False

MS = Multiple select

CHAPTER 5 LEARNING OBJECTIVES

5.1: Compare two sample proportions numerically and graphically.

5.2: Carry out a simulation-based analysis to investigate the difference between two population proportions.

5.3: Carry out a theory-based analysis to investigate the difference between two population proportions.

Section 5.1: Comparing Two Groups: Categorical Response

5.1-1: Organize counts into a two-way table, when data are available on two categorical variables for the same set of observational units.

5.1-2: Calculate conditional proportion of successes, for different categories of the explanatory variable, and use these conditional proportions to decide whether there is preliminary evidence of an association between the explanatory and response variables.

5.1-3: Create a segmented bar chart or mosaic plot to display data available on two categorical variables for the same set of observational units.

5.1-4: Calculate and interpret relative risk.

Questions 1 through 7: Whirling disease is a deadly disease that affects trout in Montana rivers. In a follow-up to a 2006 study conducted by the Montana Department of Fish, Wildlife and Parks (FWP), researchers sought to determine whether the proportion of trout afflicted by whirling disease in the Gallatin River differs between rainbow trout and brown trout. To test this theory, researchers collected a representative sample of 527 rainbow trout and 459 brown trout. Of the 527 rainbow trout collected, 120 had developed whirling disease; of the 459 brown trout collected, 74 had developed whirling disease.

  1. Identify the explanatory and response variables, and their types by filling in the blanks below:

The type of trout (rainbow or brown) is the ___(1)___ (explanatory/response) variable and it is ___(2)___ (categorical/quantitative).

Whether or not the trout developed whirling disease is the ___(3)___ (explanatory/response) variable and it is ___(4)___ (categorical/quantitative).

  2. Organize these data into a two-way table:

| | Rainbow trout | Brown trout | Total |
|---|---|---|---|
| Developed whirling disease | | | |
| Did not develop whirling disease | | | |
| Total | | | |

| | Rainbow trout | Brown trout | Total |
|---|---|---|---|
| Developed whirling disease | 120 | 74 | 120+74 = 194 |
| Did not develop whirling disease | 527-120 = 407 | 459-74 = 385 | 407+385 = 792 |
| Total | 527 | 459 | 527+459 = 986 |

Tol for all answers: +/- 0; LO: 5.1-1; Difficulty: Easy; Type: TE-N

  3. Which of the following plots would be most appropriate to examine the association between the explanatory and response variables?
    1. Segmented bar chart
    2. Scatterplot
    3. Mosaic plot
    4. Segmented bar chart or mosaic plot
  4. Calculate the conditional proportion of trout that developed whirling disease for each species (rainbow or brown):

Rainbow: ___ (1) ___

Brown: ___(2) ___

LO: 5.1-2; Difficulty: Easy; Type: TE-N

  5. Based on the conditional proportions found in question 4, does it appear that whether a trout develops whirling disease is associated with the species of trout (rainbow or brown)?
    1. No, since there were approximately the same number of trout in each species.
    2. Yes, since the number of trout who developed whirling disease differed from the number of trout who did not develop whirling disease.
    3. Yes, since the conditional proportion of trout that developed the disease differed between rainbow and brown trout.
    4. Yes, since the proportion of trout that developed the disease was not equal to 0.5.
  6. Calculate the relative risk of whirling disease for rainbow trout compared to brown trout in this sample.

LO: 5.1-4; Difficulty: Medium; Type: TE-N

  7. Based on the relative risk found in question 6, does it appear that whether a trout develops whirling disease is associated with the species of trout (rainbow or brown)?
    1. Yes, since the relative risk differs from zero.
    2. Yes, since the relative risk differs from one.
    3. No, since the relative risk differs from zero.
    4. No, since the relative risk differs from one.
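
The calculations behind the trout questions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `conditional_proportion` and `relative_risk` are made up for this sketch and are not part of the test bank or of any applet.

```python
def conditional_proportion(successes, group_total):
    """Proportion of successes within one category of the explanatory variable."""
    return successes / group_total


def relative_risk(p_group, p_reference):
    """Ratio of two conditional proportions (group relative to reference)."""
    return p_group / p_reference


# Counts from the whirling disease study.
diseased = {"rainbow": 120, "brown": 74}
totals = {"rainbow": 527, "brown": 459}

# The remaining cells of the two-way table follow by subtraction.
not_diseased = {species: totals[species] - diseased[species] for species in totals}

p_rainbow = conditional_proportion(diseased["rainbow"], totals["rainbow"])
p_brown = conditional_proportion(diseased["brown"], totals["brown"])
rr = relative_risk(p_rainbow, p_brown)  # rainbow compared to brown

print(f"rainbow: {p_rainbow:.3f}, brown: {p_brown:.3f}, relative risk: {rr:.3f}")
```

A relative risk of 1 corresponds to equal conditional proportions in the two groups, so it is the natural reference value when judging association.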

Questions 8 through 10: The Women’s Health Study included 39,876 female health professionals aged 45 years and older who were followed for an average of 10 years (Ridker et al., 2005). At the beginning of the study, participants were randomly assigned to take either a daily low-dose aspirin or a daily placebo pill for the duration of the study. At the end of the study, researchers measured the number of participants that suffered from a heart attack during the study. Data are summarized in the following table.

| | Low-dose Aspirin | Placebo | Total |
|---|---|---|---|
| Heart Attack | 198 | 193 | 391 |
| No Heart Attack | 19,736 | 19,749 | 39,485 |
| Total | 19,934 | 19,942 | 39,876 |

  8. Calculate the relative risk of heart attack for the low-dose aspirin group compared to the placebo group.
    1. 1.026
    2. 0.974
    3. 0.00993
    4. 0.000255
  9. Does there appear to be an association between whether a woman takes low-dose Aspirin or placebo and if she suffers from a heart attack in this sample?
    1. No, since approximately the same number of women suffered from a heart attack in each group.
    2. Yes, women who took low-dose Aspirin were less likely to suffer from a heart attack.
    3. Yes, women who took low-dose Aspirin were more likely to suffer from a heart attack.
    4. We cannot determine whether an association exists without a plot.
  10. What type of plot could be used to plot these data?
    1. Scatterplot
    2. Segmented bar chart
    3. Dotplot
    4. Histogram

Questions 11 through 13: A Gallup poll headline from April 25, 2013, reads “In U.S., Women Veterans Rate Lives Better Than Men”. In a random sample of 900 female veterans interviewed, 459 rated their lives as “thriving.” Only 693 male veterans rated their lives as “thriving” in a random sample of size 1,650.

  11. Organize these data into a two-way table:

| Rated their life as “thriving”? | Yes | No | Total |
|---|---|---|---|
| Female veterans | | | |
| Male veterans | | | |
| Total | | | |

| Rated their life as “thriving”? | Yes | No | Total |
|---|---|---|---|
| Female veterans | 459 | 900-459 = 441 | 900 |
| Male veterans | 693 | 1650-693 = 957 | 1,650 |
| Total | 459+693 = 1,152 | 441+957 = 1,398 | 900+1650 = 2,550 |

Tol for all answers: +/- 0; LO: 5.1-1; Difficulty: Easy; Type: TE-N

  12. For each sample (females and males), calculate the sample proportion who rated their lives as “thriving.”

Females: ________

Males:________

Females: 459/900 = 0.51; Tol: +/- 0

Males: 693/1650 = 0.42; Tol: +/- 0

LO: 5.1-2; Difficulty: Easy; Type: TE-N

  13. The relative risk of a “thriving” life rating for female veterans compared to male veterans is 1.214. How would you interpret this value?
    1. The chance a female veteran rated her life as “thriving” is about 21% lower than the chance a male veteran rated his life as “thriving.”
    2. The chance a female veteran rated her life as “thriving” is about 21% higher than the chance a male veteran rated his life as “thriving.”
    3. The chance a female veteran rated her life as “thriving” is about 121% lower than the chance a male veteran rated his life as “thriving.”
    4. The chance a female veteran rated her life as “thriving” is about 121% higher than the chance a male veteran rated his life as “thriving.”

Questions 14 and 15: An advertisement for Claritin, a drug for seasonal nasal allergies, made this claim: “Clear relief without drowsiness. In studies, the incidence of drowsiness was similar to placebo” (Time, February 6, 1995, p. 43). The advertisement also reported that 8% of the 1,926 Claritin takers and 6% of the 2,545 placebo takers reported drowsiness as a side effect.

  14. Does there appear to be an association between whether one takes Claritin or a placebo and the development of drowsiness in this sample?
    1. Yes, since 1,926 differs from 2,545.
    2. Yes, since 8% differs from 6%.
    3. No, since the sample size was not large enough.
    4. No, since the claim stated the incidence of drowsiness when taking Claritin was similar to that when taking a placebo.
  15. Calculate the relative risk of drowsiness for Claritin users compared to placebo.
    1. 0.02
    2. 1.02
    3. 0.75
    4. 1.33
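
Reading a relative risk as a percent difference between groups is a common stumbling block, so a short Python sketch may help. It assumes only the figures quoted above (the stated relative risk of 1.214 and the reported 8% and 6% drowsiness rates); the function name `percent_change_from_relative_risk` is illustrative.

```python
def percent_change_from_relative_risk(rr):
    """Percent by which the first group's chance exceeds (+) or falls below (-) the second's."""
    return (rr - 1.0) * 100.0


# Relative risk stated for female versus male veterans.
veterans_change = percent_change_from_relative_risk(1.214)

# Claritin versus placebo: relative risk from the reported percentages.
claritin_rr = 0.08 / 0.06
claritin_change = percent_change_from_relative_risk(claritin_rr)

print(f"veterans: {veterans_change:+.1f}%")
print(f"Claritin: relative risk = {claritin_rr:.2f}, {claritin_change:+.1f}%")
```

The sign of the result distinguishes "higher" from "lower", and subtracting 1 before converting to a percent is what separates "about 21% higher" from "about 121% higher".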

Section 5.2: Comparing Two Proportions: Simulation-Based Approach

5.2-1: State the null and the alternative hypotheses in terms of “no association” versus “there is an association” as well as in terms of comparing probabilities of success for two categories of the explanatory variable (i.e., π1 and π2) when exploring the relationship between two categorical variables.

5.2-2: Implement the 3S strategy: find a statistic, simulate, and compute the strength of evidence against observed study results happening by chance alone.

5.2-3: Describe how to use cards to simulate what outcomes (in terms of difference in conditional proportions and/or relative risk) are to be expected in repeated random assignments, if there is no association between the two variables.

5.2-4: Use the Two Proportions applet to conduct a simulation of the null hypothesis and be able to read output from the Two Proportions applet.

5.2-5: Find and interpret the standardized statistic and the p-value for a test of two proportions.

5.2-6: Use the 2SD method to find a 95% confidence interval for the difference in long-run proportion of success for two “treatment” groups, and interpret the interval in the context of the study.

5.2-7: Interpret what it means for the 95% confidence interval for difference in proportions to contain zero.

5.2-8: State a complete conclusion about the alternative hypothesis (and null hypothesis) based on the p-value and/or standardized statistic and the study design, including statistical significance, estimation (confidence interval), generalizability, and causation.

Questions 16 through 22: The article “Freedom of What?” (Associated Press, February 1, 2005) described a study in which high school students and high school teachers were asked whether they agreed with the following statement: “Students should be allowed to report controversial issues in their student newspapers without the approval of school authorities.” Researchers hypothesized that the long-run proportion of high school teachers who would agree with the statement would differ from the long-run proportion of high school students who would agree. Two random samples – 8,000 high school teachers and 10,000 high school students – were selected from high schools in the U.S. It was reported that 39% of the teachers surveyed and 58% of the students surveyed agreed with the statement.

  16. State the null and alternative hypotheses.
    1. versus
    2. versus
    3. versus
    4. versus
  17. What is the value of the statistic and its appropriate notation?
  18. Fill in the table below with the inputs to the Two Proportions applet to conduct a simulation of the null hypothesis.

| | Teachers | Students |
|---|---|---|
| Agree | | |
| Do not agree | | |

| | Teachers | Students |
|---|---|---|
| Agree | 8000*.39 = 3120 | 10000*.58 = 5800 |
| Do not agree | 8000 - 3120 = 4880 | 10000 - 5800 = 4200 |

LO: 5.2-4; Difficulty: Medium; Type: TE-N

  19. Describe how you could use cards to simulate a single sample statistic in the null distribution.
    1. Take 8,000 red cards and 10,000 blue cards. Shuffle the cards together, then deal the cards into two piles – one of size 8,000 and one of size 10,000. Compute the difference in the proportion of red cards between the two piles.
    2. Take 8,000 red cards and 10,000 blue cards. Shuffle the cards together, then deal the cards into two piles – one of size 3,120 and one of size 5,800. Compute the difference in the proportion of red cards between the two piles.
    3. Take 8,920 red cards and 9,080 blue cards. Shuffle the cards together, then deal the cards into two piles – one of size 8,000 and one of size 10,000. Compute the difference in the proportion of red cards between the two piles.
    4. Take 8,920 red cards and 9,080 blue cards. Shuffle the cards together, then deal the cards into two piles – one of size 4,640 and one of size 3,900. Compute the difference in the proportion of red cards between the two piles.
  20. A simulated null distribution of 1,000 differences in proportions created by using the Two Proportions applet is shown below.

A histogram depicts the results of simulated null distribution. The horizontal axis is labeled Shuffled Difference in Proportions and has markings from negative 0.026 to 0.024 in increments of 0.01. The vertical axis is labeled Number of Shuffles and has markings from 5 to 25 in increments of 5. The distribution of the densely plotted bars is approximately bell-shaped. The series of bars begins at negative 0.021 on the horizontal axis. The heights of the bars increase gradually from the left with a few ups and downs and reach the peak of 20 at negative 0.001 on the horizontal axis. Then the heights of the bars decrease gradually to the right with a few ups and downs and end at 0.024 on the horizontal axis. The mean is negative 0.000, the standard deviation is 0.007, and the total shuffles is 1000. All values are approximate.

What is the strength of evidence against observed study results happening by chance alone?

    1. Very strong
    2. Moderate
    3. Weak
    4. We cannot determine the strength of evidence from this plot.
  21. A simulated null distribution of 1,000 differences in proportions created by using the Two Proportions applet is shown below.

A histogram depicts the results of simulated null distribution. The horizontal axis is labeled Shuffled Difference in Proportions and has markings from negative 0.026 to 0.024 in increments of 0.01. The vertical axis is labeled Number of Shuffles and has markings from 5 to 25 in increments of 5. The distribution of the densely plotted bars is approximately bell-shaped. The series of bars begins at negative 0.021 on the horizontal axis. The heights of the bars increase gradually from the left with a few ups and downs and reach the peak of 20 at negative 0.001 on the horizontal axis. Then the heights of the bars decrease gradually to the right with a few ups and downs and end at 0.024 on the horizontal axis. The mean is negative 0.000, the standard deviation is 0.007, and the total shuffles is 1000. All values are approximate.

Use the 2SD method to find a 95% confidence interval for the difference in long-run proportion who would agree with the statement between Students and Teachers (Students – Teachers).

(____(1)____, ____(2)____)

LO: 5.2-6; Difficulty: Medium; Type: TE-N

  22. The study concluded that “in the U.S., a higher percentage of high school students believe controversial issues should be reported without approval of school authorities than high school teachers.” Is this conclusion justified?
    1. No, since we did not survey all high school students and high school teachers in the U.S.
    2. Yes, because the sample sizes were large.
    3. No, because the 2SD 95% confidence interval for the difference in long-run proportion who would agree with the statement between Students and Teachers (Students – Teachers) is entirely greater than zero.
    4. Yes, because the 2SD 95% confidence interval for the difference in long-run proportion who would agree with the statement between Students and Teachers (Students – Teachers) is entirely greater than zero.
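
For the teacher and student survey above, the 2SD method and the standardized statistic reduce to simple arithmetic. The sketch below assumes the approximate null standard deviation of 0.007 reported with the simulated distribution; the function name `two_sd_interval` is illustrative and not taken from the applet.

```python
def two_sd_interval(observed_statistic, null_sd):
    """Approximate 95% interval: observed statistic plus or minus 2 standard deviations."""
    margin = 2 * null_sd
    return observed_statistic - margin, observed_statistic + margin


# Sample proportions agreeing with the statement.
p_students = 0.58
p_teachers = 0.39
observed_diff = p_students - p_teachers  # Students - Teachers

# Standard deviation of the simulated null distribution (approximate, from the applet output).
null_sd = 0.007

lower, upper = two_sd_interval(observed_diff, null_sd)
standardized = observed_diff / null_sd  # distance of the observed difference from zero, in null SDs

print(f"2SD interval: ({lower:.3f}, {upper:.3f})")
print(f"standardized statistic: {standardized:.1f}")
```

Because every simulated difference lies within about 0.026 of zero, an observed difference this many standard deviations out falls far beyond the plotted range.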

Questions 23 through 28: To investigate biases against women in personnel decisions, psychologists performed a randomized experiment on 50 male bank supervisors attending a management institute who volunteered for the study. The supervisors were asked to make a decision on whether to promote a hypothetical applicant based on a personnel file. For 26 of them, the application file described a female candidate; for the others it described a male. The files were identical in all other respects. Results on the promotion decisions for the two groups are shown below.

| Promoted? (rows) / Applicant’s Sex (columns) | Male | Female | Total |
|---|---|---|---|
| Yes | 21 | 14 | 35 |
| No | 3 | 12 | 15 |
| Total | 24 | 26 | 50 |

  23. What is the alternative hypothesis, in words?
    1. There is no bias against women in personnel decisions.
    2. There is a bias against women in personnel decisions.
  24. What is the value of the statistic and its appropriate notation?
  25. The standardized statistic for this study is 2.59. How would you interpret this value?
    1. The sample proportion of men promoted is 2.59 standard deviations above the sample proportion of women promoted.
    2. The sample difference in proportion promoted between men and women (men – women) is 2.59 standard deviations above zero.
    3. On average, sample differences in proportion promoted between men and women (men – women) are 2.59 standard deviations from zero.
    4. There is a 2.59% chance of seeing a sample difference in proportion promoted between men and women (men – women) of the one observed or greater.
  26. The p-value for this study is 0.009. How would you interpret this value?
    1. There is a 0.9% probability that there is no bias against women in personnel decisions.
    2. There is a 0.9% probability that there is a bias against women in personnel decisions.
    3. If there were no bias against women in personnel decisions, there is a 0.9% probability of seeing a sample difference in proportion promoted between men and women (men – women) of 0.337 or greater.
    4. If there were a bias against women in personnel decisions, there is a 0.9% probability of seeing a sample difference in proportion promoted between men and women (men – women) of 0.337 or greater.
  27. The standardized statistic for this study is 2.59, and the p-value is 0.009. State a conclusion of the test in context of the problem.
    1. We do not have significant evidence that the probability an applicant is promoted is higher for women than for men.
    2. We do not have significant evidence that the probability an applicant is promoted is higher for men than for women.
    3. We have significant evidence that the probability an applicant is promoted is higher for women than for men.
    4. We have significant evidence that the probability an applicant is promoted is higher for men than for women.
  28. Based on the study design, what is the scope of inference for these results?
    1. The sex of an applicant causes a difference in the probability of being promoted by individuals similar to those in the study.
    2. The sex of an applicant causes a difference in the probability of being promoted by all male bank supervisors attending a management institute.
    3. The sex of an applicant is associated with whether or not the individual is promoted by individuals similar to those in the study.
    4. The sex of an applicant is associated with whether or not the individual is promoted by all male bank supervisors attending a management institute.
  29. True or false: The p-value for a test of two proportions is the probability that the two long-run proportions are equal.
  30. True or false: A simulated null distribution of a difference in sample proportions will be centered at the value of the difference in proportions in the observed data.
  31. A Gallup poll headline from April 25, 2013, reads “In U.S., Women Veterans Rate Lives Better Than Men”. In a random sample of 900 female veterans interviewed, 459 rated their lives as “thriving.” Only 693 male veterans rated their lives as “thriving” in a random sample of size 1,650. What is the null hypothesis if researchers are interested in determining if the long-run proportion who would rate their lives as thriving differs between women and men?
    1. There is an association between sex and whether a person would rate their life as thriving.
    2. There is no association between sex and whether a person would rate their life as thriving.
    3. There is an association between whether a person is a veteran and whether a person would rate their life as thriving.
    4. There is no association between whether a person is a veteran and whether a person would rate their life as thriving.
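
The 3S strategy for the promotion study can be imitated with a short card-shuffling simulation in Python. This is a rough sketch, not the Two Proportions applet: the seed, the number of shuffles, and the helper names are arbitrary choices made here, so the simulated p-value and standardized statistic will only approximate the values quoted in the questions.

```python
import random
import statistics


def diff_in_proportions(yes_male, n_male, yes_female, n_female):
    """Difference in conditional proportions promoted (men - women)."""
    return yes_male / n_male - yes_female / n_female


def shuffle_once(cards, n_male, rng):
    """Shuffle the outcome cards and deal the first n_male to the 'male file' pile."""
    rng.shuffle(cards)
    yes_male = sum(cards[:n_male])
    yes_female = sum(cards[n_male:])
    return diff_in_proportions(yes_male, n_male, yes_female, len(cards) - n_male)


# Statistic: 21 of 24 male files and 14 of 26 female files were promoted.
observed = diff_in_proportions(21, 24, 14, 26)

# One card per supervisor: 1 = promoted (35 cards), 0 = not promoted (15 cards).
cards = [1] * 35 + [0] * 15

# Simulate: re-deal the cards many times, as if sex had no effect on the decision.
rng = random.Random(2024)  # arbitrary seed so the run is repeatable
num_shuffles = 5000
null_diffs = [shuffle_once(cards, 24, rng) for _ in range(num_shuffles)]

# Strength of evidence: share of shuffles at least as large as the observed difference.
p_value = sum(d >= observed - 1e-12 for d in null_diffs) / num_shuffles
standardized = observed / statistics.stdev(null_diffs)

print(f"observed difference: {observed:.3f}")
print(f"simulated p-value: {p_value:.4f}")
print(f"standardized statistic: {standardized:.2f}")
```

The shuffled differences center near zero rather than near the observed difference, because the shuffling breaks any link between the applicant's sex and the promotion decision.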

Section 5.3: Comparing Two Proportions: Theory-Based Approach

5.3-1: Identify when a theory-based approach would be valid to find the p-value or a confidence interval when evaluating the relationship between two categorical variables.

5.3-2: Use the Theory-Based Inference applet to find theory-based p-values and confidence intervals.

5.3-3: Understand the impacts of confidence level and sample size on confidence interval width for a confidence interval on the difference in two proportions.

Questions 32 through 36: The Women’s Health Study included 39,876 female health professionals aged 45 years and older who were followed for an average of 10 years (Ridker et al., 2005). At the beginning of the study, participants were randomly assigned to take either a daily low-dose aspirin or a daily placebo pill for the duration of the study. At the end of the study, researchers measured the number of participants who suffered from a heart attack during the study. Data are summarized in the following table.

| | Low-dose Aspirin | Placebo | Total |
|---|---|---|---|
| Heart Attack | 198 | 193 | 391 |
| No Heart Attack | 19,736 | 19,749 | 39,485 |
| Total | 19,934 | 19,942 | 39,876 |

  32. Is a theory-based approach appropriate to evaluate the relationship between whether an individual took aspirin or a placebo and whether an individual suffers a heart attack?
    1. No, since a simulation-based approach is always appropriate.
    2. Yes, since 39,876 is larger than 20.
    3. Yes, since 391 and 39,485 are both greater than 10.
    4. Yes, since 198, 193, 19,736, 19,749 are all greater than 10.
  33. Use the Theory-Based Inference applet to find a 90% confidence interval for the difference in probability of suffering a heart attack between the two groups (aspirin – placebo).

(___(1)___, ___(2)___)

LO: 5.3-2; Difficulty: Medium; Type: TE-N

  34. Based on your confidence interval in question 33, what could you say for sure about the p-value for testing H0: π(aspirin) − π(placebo) = 0 versus Ha: π(aspirin) − π(placebo) ≠ 0?
    1. p-value < 0.01
    2. p-value < 0.05
    3. p-value < 0.10
    4. p-value > 0.10
  35. If we had calculated a 95% confidence interval in question 33 rather than a 90% confidence interval, all else being equal, would the resulting interval be wider, narrower, or the same width?
    1. Wider
    2. Narrower
    3. Same width
  36. If our sample size had been 10,000 rather than 39,876, all else being equal, would the 90% confidence interval be wider, narrower, or the same width as the one found in question 33?
    1. Wider
    2. Narrower
    3. Same width
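
How confidence level and sample size affect interval width can be checked numerically. The Python sketch below assumes the usual unpooled standard error and a normal multiplier; the Theory-Based Inference applet may round differently. The quarter-size study is a hypothetical stand-in for "a smaller sample, all else being equal", and the function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def diff_confidence_interval(x1, n1, x2, n2, confidence):
    """Interval for p1 - p2 using the unpooled standard error and a normal multiplier."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    multiplier = NormalDist().inv_cdf(0.5 + confidence / 2)
    diff = p1 - p2
    return diff - multiplier * se, diff + multiplier * se


def width(interval):
    """Upper endpoint minus lower endpoint."""
    return interval[1] - interval[0]


# Heart attacks: 198 of 19,934 on aspirin, 193 of 19,942 on placebo (aspirin - placebo).
ci_90 = diff_confidence_interval(198, 19934, 193, 19942, 0.90)
ci_95 = diff_confidence_interval(198, 19934, 193, 19942, 0.95)

# Hypothetical smaller study: same proportions, one quarter of each group size.
ci_90_small = diff_confidence_interval(198 / 4, 19934 / 4, 193 / 4, 19942 / 4, 0.90)

print(f"90% interval: ({ci_90[0]:.5f}, {ci_90[1]:.5f})")
print(f"widths: 90% = {width(ci_90):.5f}, 95% = {width(ci_95):.5f}, "
      f"90% with smaller n = {width(ci_90_small):.5f}")
```

A higher confidence level uses a larger multiplier, and a smaller sample gives a larger standard error, so each change widens the interval while leaving its center at the observed difference.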

Questions 37 through 43: A 2003 study reported in the Journal of Consumer Affairs examined how well consumers protect themselves from identity theft. The study surveyed a random sample of 61 college students and 59 non-students, and asked each participant, “Have you used personal information (such as birth date, pet name, etc.) when creating a password?” For the students, 22 agreed with this statement, while 30 of the non-students agreed.

  37. Calculate the difference between the proportion of college students that agreed with the statement and the proportion of non-students that agreed with the statement in the study (student – non-student).

LO: 5.3-2; Difficulty: Easy; Type: TE-N

  38. Is a theory-based approach appropriate to test versus ?
    1. No, since a simulation-based approach is always appropriate.
    2. Yes, since 120 is larger than 20.
    3. Yes, since 61 and 59 are both greater than 20.
    4. Yes, since at least 10 people agreed and at least 10 people disagreed in each sample.
  39. Use the Theory-Based Inference applet to find the p-value for a test of versus .
    1. p-value = 0.0512
    2. p-value = 0.1023
    3. p-value = 0.1480
    4. p-value = 0.9488
  40. Use the Theory-Based Inference applet to find a 95% confidence interval for the difference in probability of agreeing with the statement in the study between the two groups (student – non-student).

(___(1)___, ___(2)___)

LO: 5.3-2; Difficulty: Medium; Type: TE-N

  41. What does it mean to have 95% confidence in the interval you created in question 40?
    1. There is a 95% chance that the true difference in proportions is contained in the interval calculated in question 40.
    2. In 95% of all possible random samples, the interval computed from the sample will contain the sample difference in proportions.
    3. There is a 95% chance that the sample difference in proportions is contained in the interval calculated in question 40.
    4. In 95% of all possible random samples, the interval computed from the sample will contain the true difference in proportions.
  42. If we increased the confidence level from 95% to 99%, all else remaining the same, the width of the confidence interval would
    1. increase.
    2. remain the same.
    3. decrease.
  43. If we increased the confidence level from 95% to 99%, all else remaining the same, the center of the confidence interval would
    1. increase.
    2. remain the same.
    3. decrease.
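
A theory-based test of two proportions for the password survey can be approximated as follows. The sketch assumes the standard pooled-proportion z-test, which is a common textbook form; the applet's exact computation is not shown in this test bank, and `two_proportion_z_test` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled z-test for p1 - p2; returns z plus lower-tail, upper-tail, and two-sided p-values."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    lower_tail = NormalDist().cdf(z)
    upper_tail = 1 - lower_tail
    two_sided = 2 * min(lower_tail, upper_tail)
    return z, lower_tail, upper_tail, two_sided


# 22 of 61 students and 30 of 59 non-students agreed (student - non-student).
z, lower_tail, upper_tail, two_sided = two_proportion_z_test(22, 61, 30, 59)

print(f"z = {z:.3f}")
print(f"p-value if Ha is 'less than':    {lower_tail:.4f}")
print(f"p-value if Ha is 'greater than': {upper_tail:.4f}")
print(f"p-value if Ha is 'not equal':    {two_sided:.4f}")
```

Which of the three p-values applies depends on the direction of the alternative hypothesis, so the sketch reports all of them rather than choosing one.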

Questions 44 through 46: Hepatitis C is a blood-borne viral infection that causes liver inflammation and infection that, over time, can lead to liver disease. There is no vaccine against this strain of hepatitis, so preventive measures are the only management techniques. One of the ways hepatitis can be transmitted is by use of improperly sterilized tattoo equipment or contaminated dyes, which in turn has led to more stringent sterilization requirements for commercial tattoo parlors. Researchers at the University of Texas Southwestern Medical Center examined the medical records of 113 patients who had a tattoo to see whether these sterilization requirements at commercial parlors are reducing the proportion of hepatitis C among those with tattoos, compared to those who get tattoos elsewhere. Data are summarized in the following table.

| | Commercial parlor | Elsewhere | Total |
|---|---|---|---|
| Has Hep-C | 10 | 15 | 25 |
| Does not have Hep-C | 60 | 28 | 88 |
| Total | 70 | 43 | 113 |

  44. Is a theory-based approach appropriate to evaluate the relationship between the incidence of hepatitis C and the type of tattoo parlor?
    1. No, since the number with hepatitis C in each group is less than 20.
    2. Yes, since 113 is larger than 20.
    3. Yes, since 70 and 43 are both greater than 20.
    4. Yes, since 10, 15, 60, 28 are all at least 10.
  45. Use the Theory-Based Inference applet to find the p-value for a test of versus .
    1. p-value = 0.0104
    2. p-value = 0.0052
    3. p-value = 0.2060
    4. p-value = 0.9948
  46. Use the Theory-Based Inference applet to find a 95% confidence interval for the difference in probability of developing hepatitis C between the two types of parlors (commercial – elsewhere).

(___(1)___, ___(2)___)

LO: 5.3-2; Difficulty: Medium; Type: TE-N
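
The validity condition that recurs in this section (at least 10 observations in each of the four cells of the two-way table) is easy to check mechanically. The Python sketch below is a hypothetical helper, not an applet feature; the threshold of 10 follows the answer choices in this section.

```python
def theory_based_valid(successes, failures, minimum=10):
    """True when every group has at least `minimum` successes and `minimum` failures."""
    return all(count >= minimum for count in list(successes) + list(failures))


# Hepatitis C study: (commercial parlor, elsewhere).
hep_c_ok = theory_based_valid(successes=[10, 15], failures=[60, 28])

# Promotion study from Section 5.2: (male file, female file); one cell holds only 3.
promotion_ok = theory_based_valid(successes=[21, 14], failures=[3, 12])

print(f"hepatitis C study meets the condition: {hep_c_ok}")
print(f"promotion study meets the condition: {promotion_ok}")
```

When the check fails, as it does for the promotion study, a simulation-based approach remains available.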

Document Information

Document Type: DOCX
Chapter Number: 5
Created Date: Aug 21, 2025
Chapter Name: Chapter 5 Comparing Two Proportions
Author: Nathan Tintle