Ch3 Test Bank, Intermediate Statistical Investigations, 1st Ed., Exam Bank by Nathan Tintle

Chapter 3

Intermediate Statistical Investigations Test Bank

Question types: FIB = Fill in the blank; Calc = Calculation; Ma = Matching; MS = Multiple select; MC = Multiple choice; TF = True-false

CHAPTER 3 TERMINAL LEARNING OUTCOMES

TLO3-1: Design and analyze a multi-factor experiment, recognizing the difference between blocking variables and experimental factors.

TLO3-2: Understand the concept of a statistical interaction, and calculate and interpret interaction effects.

TLO3-3: Explain the benefits and challenges of replication, particularly as they relate to generalized block designs and within-block factorial designs.

TLO3-4: Run a two-variable ANOVA including an interaction term for an observational study with 2 two-level variables.

Section 3.1: Multifactor Experiments

LO3.1-1: Design an experiment with more than one variable of interest.

LO3.1-2: Explore the benefits of a two-variable study where the levels of both variables are assigned by the researcher.

  1. A researcher is interested in the relative difficulty of two sets of mathematical tasks and how student success in completing the tasks may be affected by distractions. Forty-eight college students have been recruited to participate in a study. Which of the following is the most appropriate study design?
    1. Randomly assign the students into two groups of size 24. One group completes Task Set 1 with distractions, and the other completes Task Set 2 with no distractions.
    2. Randomly assign the students into four groups of size 12. One group completes Task Set 1 with distractions, one completes Task Set 1 without distractions, one completes Task Set 2 with distractions, and the last completes Task Set 2 without distractions.
    3. Divide the students into two groups of 24. Within Group 1, assign 12 students to complete Task Set 1 and 12 students to complete Task Set 2. Within Group 2, assign 12 students to work with distractions and 12 to work without distractions.
    4. All three study designs above are equally appropriate ways to investigate the researcher’s statistical questions.

Questions 2 through 5: In a balanced, full-factorial experiment, 40 volunteers were randomly assigned to receive one of two pain medications (A, B) at one of two dosages (high, low). After an hour, their response to pain was recorded on a scale of 0 to 20 (with higher numbers indicating more severe pain). The ANOVA table below corresponds to a two-variable additive model using drug and dose as predictors of pain response.

| Source | DF | SS | MS | F | p-value |
|---|---|---|---|---|---|
| Drug | 1 | 16.9 | 16.9 | 3.97 | 0.0537 |
| Dose | 1 | 313.6 | 313.6 | 73.67 | <0.0001 |
| Error | 37 | 157.5 | 4.26 | | |
| Total | 39 | 488.0 | | | |

  1. Calculate SSModel.
    1. SSModel = 16.9 + 313.6 = 330.5
    2. SSModel = (16.9 + 313.6) / 2 = 165.25
    3. SSModel = 16.9 + 313.6 – 157.5 = 173.0
    4. There is not enough information to calculate SSModel, because an association between Drug and Dosage may lead to covariation.
  2. True or False: The 95% confidence interval for estimating the difference in mean pain response for low and high doses includes 0.
  3. Suppose the researchers had ignored dosage in the analysis. What values would you expect in the ANOVA table for a one-variable model using only drug as a predictor of pain response?

| Source | DF | SS | MS | F | p-value |
|---|---|---|---|---|---|
| Drug | | | | | |
| Error | | | | | |
| Total | | | | | |

SSDrug in the one-variable ANOVA table would be _________ (>, <, =) 16.9

The p-value for Drug in the one-variable ANOVA table would be _________ (>, <, =) 0.0537, indicating _________ (stronger/weaker/the same) evidence of a difference between Drug A and Drug B.

  1. Suppose the researchers had ignored dosage in the analysis. How would this affect the width of the 95% confidence interval for estimating the difference in mean pain response between Drug A and Drug B?

The 95% confidence interval for estimating this difference would be ________________ (wider/narrower) than the confidence interval based on the two-variable model, because the ________________ (treatment means/residual standard error) would be different.
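The arithmetic behind the pain-medication ANOVA table can be sketched in a few lines of Python. This is only an illustration: the variable names are made up here, and the one-variable comparison relies on the design being balanced, so that dropping Dose leaves SSDrug unchanged and moves the Dose variation into the error term.

```python
# Values copied from the two-variable ANOVA table (Drug, Dose, Error).
ss = {"Drug": 16.9, "Dose": 313.6, "Error": 157.5}
df = {"Drug": 1, "Dose": 1, "Error": 37}

# In a balanced additive model, SSModel is the sum of the factor sums of squares.
ss_model = ss["Drug"] + ss["Dose"]
ss_total = ss_model + ss["Error"]

# Mean squares and F-statistics: MS = SS / DF and F = MS / MSError.
ms = {source: ss[source] / df[source] for source in ss}
f_stat = {source: ms[source] / ms["Error"] for source in ("Drug", "Dose")}

# One-variable model (Drug only), assuming a balanced design:
# SSDrug stays the same, while SSDose and its DF are absorbed into the error.
ss_error_one_var = ss["Error"] + ss["Dose"]
df_error_one_var = df["Error"] + df["Dose"]
ms_error_one_var = ss_error_one_var / df_error_one_var
f_drug_one_var = ms["Drug"] / ms_error_one_var

print("SSModel:", ss_model, "SSTotal:", ss_total)
print("F (two-variable):", f_stat)
print("F for Drug (one-variable):", f_drug_one_var)
```

A smaller F-statistic for Drug in the one-variable model goes with a larger residual standard error, which is the quantity that drives both the p-value and the width of the confidence interval.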

Questions 6 through 9: A large field was divided into 24 equal-sized plots to be planted with the same number of potato plants. Each plot was randomly assigned a type of fertilizer (A, B, C) and a manure level (high, low) based on a balanced, full-factorial design. The table below shows the mean yield (by weight) for each combination of fertilizer and manure.

| | Fertilizer A | Fertilizer B | Fertilizer C |
|---|---|---|---|
| Low Manure | 6 | 8 | 10 |
| High Manure | 7 | 11 | 12 |

  1. True or False: By using a balanced design, the researchers have ensured that there is no association between the type of fertilizer and the level of manure.
  2. Fill in the blanks using numerical values.

In this experiment, there are ________ explanatory variables (or factors) and there are ________ treatments with ________ replications.

  1. Calculate the main effects of fertilizer and manure to fill in the additive statistical model.
  2. For the additive model that uses fertilizer type and manure level to predict yield, the standard error of the residuals is 4.5. What does the standard error of the residuals measure?
    1. The variability of individual plot yields around the observed mean yields given in the table
    2. The variability of individual plot yields around the predicted mean yields calculated based on the additive model
    3. The variability between the observed mean yields given in the table and the predicted mean yields calculated based on the model
    4. The variability between the observed mean yields given in the table, the predicted mean yields calculated based on the model, and the overall mean yield
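As a rough sketch of how main effects follow from the table of mean yields, the snippet below averages the cell means for each level and subtracts the overall mean. It assumes the balanced design stated in the problem, so simple averages of cell means are appropriate; the helper name `level_mean` is hypothetical.

```python
# Mean yields from the table, keyed by (manure level, fertilizer type).
means = {
    ("Low", "A"): 6, ("Low", "B"): 8, ("Low", "C"): 10,
    ("High", "A"): 7, ("High", "B"): 11, ("High", "C"): 12,
}

# Overall mean of the six treatment means (valid because the design is balanced).
overall = sum(means.values()) / len(means)

def level_mean(position, level):
    """Average the cell means for one level of a factor (0 = manure, 1 = fertilizer)."""
    values = [m for key, m in means.items() if key[position] == level]
    return sum(values) / len(values)

# Main effect = level mean minus overall mean.
manure_effects = {lvl: level_mean(0, lvl) - overall for lvl in ("Low", "High")}
fertilizer_effects = {lvl: level_mean(1, lvl) - overall for lvl in ("A", "B", "C")}

# Additive-model prediction for each treatment: overall + manure effect + fertilizer effect.
predicted = {
    key: overall + manure_effects[key[0]] + fertilizer_effects[key[1]]
    for key in means
}

print("Overall mean:", overall)
print("Manure effects:", manure_effects)
print("Fertilizer effects:", fertilizer_effects)
print("Additive predictions:", predicted)
```

The additive predictions generally differ from the observed cell means; the residual standard error describes how far individual plot yields fall from these predictions.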

Questions 10 through 13: Researchers want to reduce potato rot while potatoes are being stored for future use. In an experiment, potatoes were injected with a bacteria known to cause rot and then stored under a variety of conditions. Two of the experimental factors were temperature during storage and amount of oxygen during storage. The response variable was the diameter of the rotted area (in millimeters). A partially filled in two-variable ANOVA table is given below.

| Source | DF | SS | MS | F | p-value |
|---|---|---|---|---|---|
| Temperature | 1 | 600.89 | 600.89 | ? | 0.0003 |
| Oxygen | 2 | 44.44 | 22.22 | | 0.4483 |
| Error | 14 | 365.77 | 26.14 | | |
| Total | 17 | 1011.11 | | | |

  1. Fill in the blanks using numerical values.

Based on the ANOVA table, there were ________ levels of temperature and ________ levels of oxygen tested in this experiment. There were ________ replications for each combination of temperature and oxygen.

  1. Calculate the F-statistic for testing whether temperature has an effect on rot.

Solution:

  1. What conclusions would you draw based on the p-values in this ANOVA table?

After adjusting for oxygen, this study provides _______ (strong/weak) evidence that temperature has an effect on rot.

After adjusting for temperature, this study provides _______ (strong/weak) evidence that oxygen level has an effect on rot.
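The degrees of freedom and mean squares in the potato-rot table are enough to recover the number of levels, the number of replications, and the missing F-statistics. The sketch below shows that bookkeeping; it assumes a balanced design (the problem does not say so explicitly), and the variable names are illustrative.

```python
# Degrees of freedom and mean squares copied from the ANOVA table.
df_temperature, df_oxygen, df_total = 1, 2, 17
ms_temperature, ms_oxygen, ms_error = 600.89, 22.22, 26.14

# A factor with k levels has k - 1 degrees of freedom.
levels_temperature = df_temperature + 1
levels_oxygen = df_oxygen + 1

# Total DF is one less than the number of observations.
n_observations = df_total + 1
n_treatments = levels_temperature * levels_oxygen

# ASSUMPTION: balanced design, so observations split evenly across treatments.
replications = n_observations // n_treatments

# F = MS for the factor divided by MSError.
f_temperature = ms_temperature / ms_error
f_oxygen = ms_oxygen / ms_error

print(levels_temperature, levels_oxygen, replications)
print("F for temperature:", round(f_temperature, 2))
print("F for oxygen:", round(f_oxygen, 2))
```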

  1. Initially, the researchers had planned to use a one-variable model with separate means for each combination of temperature and oxygen. Which of the following are advantages of a two-variable additive model over a one-variable separate means model? Select all that apply.
    1. The two-variable model allows us to estimate the main effects of temperature and oxygen separately.
    2. The two-variable model explains more of the variability (higher SSModel) compared to the separate means model.
    3. The two-variable model has fewer degrees of freedom for treatment and more degrees of freedom for error.
    4. The two-variable model has more degrees of freedom for treatment and fewer degrees of freedom for error.
  2. An experiment was conducted to compare two different applications for text messaging on cell phones, each with its own interface and software. The experiment included participants in two age groups: 15-44 years old and 45+ years old. Within each age group, half of the participants were randomly assigned to each application. The researchers then asked each participant to type the words from a given passage of text and recorded the number of words typed correctly in a fixed period of time. Describe the explanatory variables in this study.
    1. This study has two explanatory variables: one blocking variable and one experimental factor.
    2. This study has two explanatory variables, both of which are experimental factors.
    3. This study has four explanatory variables: two blocking variables and two experimental factors.
    4. This study has four explanatory variables, all of which are experimental factors.
  3. The plots below display the residuals for a two-factor model with six treatments. Match each validity condition to the description of how that condition should be checked. Two of the descriptions will not be used.

"A dotplot and a histogram describe residual plots for two factor models. The dotplot has the horizontal axis labeled Fitted values and has markings 5 to 12.5 in increments of 2.5. The vertical axis is labeled Residual and has markings from negative 4 to 6 in increments of 2. For fitted value 5.5, the dots are plotted as follows: 1 dot above negative 2.1, negative 1.2, negative 0.1, 0.9, 3.9, and 4.9 on the vertical axis. For fitted value 6, the dots are plotted as follows: 1 dot above negative 2, negative 1, 0, 1, and 2. For fitted value 7.25, the dots are plotted as follows: 1 dot above negative 4, 1 dot above negative 4, negative 1, 0, 1, 2, and 6. For fitted value 8.75, the dots are plotted as follows: 1 dot above negative 4, negative 3.1, negative 0.9, 0.1, 1.2, and 2.1. For fitted value 10.5, the dots are plotted as follows: 1 dot above negative 3.5, negative 1.5, negative 0.5, 0.5, 1.5, 2.5, and 4.5. For fitted value 12.4, the dots are plotted as follows: 1 dot above negative 2, negative 1.1, negative 0.9, 0.9, 1.9, and 3.9. A highlighted horizontal line starts from 0 on the vertical axis and extends toward the right passing through the dot at the fitted value 0 on the vertical axis. All values are approximate. 
To the right of the dotplot, is a histogram. The histogram has the horizontal axis labeled Residual and has markings from negative 4 to 6 in increments of 2. The distribution of the bars is approximately normal and it starts from negative 4, and ends at 6. The longest bars are at negative 2, negative 1, and 1 on the horizontal axis and the bars decrease in height to the left of negative 2 and to the right of 1. All values are approximate."

Independence: ________

Equal variance: ________

Normality: ________

A. Check that fitted values are spaced fairly evenly along the x-axis.

B. Check that the vertical spread of the residuals at each of the fitted values is reasonably similar.

C. Check that the histogram of the residuals is reasonably symmetric and bell-shaped.

D. Check that the mean of the residuals is 0.

E. Check that experimental units were randomly assigned to treatments with no repeated measures.

Section 3.2: Statistical Interactions

LO3.2-1: Understand the concept of a statistical interaction.

LO3.2-2: Interpret an interaction plot.

LO3.2-3: Calculate interaction effects.

LO3.2-4: Use simulation- and theory-based p-values to assess the significance of an interaction.

Questions 1 through 6: A balanced, full-factorial experiment was used to investigate the effect of protein source (beef, cereal) and protein level (high, low) on weight gain (in grams) for male rats. Ten rats were assigned to each combination of source and level. A table of mean weight gains and an interaction plot are shown below.

An interaction plot describes the interaction between protein source and level on the mean weight gain. The horizontal axis is labeled level and has the following markings: low and high, in order from left to right. The vertical axis is labeled mean weight gain and has markings from 70 to 110 in increments of 10. A blue line denotes cereal and a red line denotes beef. The dots plotted are as follows: For low: a red dot is at 78 and a blue dot is at 84. For high: a blue dot is at 85 and a red dot is at 100. A blue line increasing toward the right connects the blue dots and a red line increasing upward to the right connects the red dots. The blue and the red lines intersect each other at the point 84 on the vertical axis. All values are approximate.

| Source | Level: Low | Level: High |
|---|---|---|
| Beef | 79.2 | 100 |
| Cereal | 83.9 | 85.9 |

  1. Suppose the researchers used an additive model to predict weight gain, as shown below.

According to the interaction plot, which treatment is expected to cause the smallest amount of weight gain? ________

According to the additive model, which treatment is expected to cause the smallest amount of weight gain? ________

A. Beef/Low

B. Beef/High

C. Cereal/Low

D. Cereal/High

  1. Calculate the interaction effect for the Beef/Low treatment.

Sol: ;

  1. Calculate SSInteraction.

Sol: ;
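One way to organize the interaction calculations for the rat weight-gain data is sketched below. It uses the table of means and the stated ten rats per treatment, and it takes the interaction effect to be the observed cell mean minus the additive-model prediction; the variable names are illustrative.

```python
n_per_cell = 10  # ten rats per source/level combination, as stated in the problem

# Mean weight gains from the table, keyed by (source, level).
means = {
    ("Beef", "Low"): 79.2, ("Beef", "High"): 100.0,
    ("Cereal", "Low"): 83.9, ("Cereal", "High"): 85.9,
}
sources = ("Beef", "Cereal")
levels = ("Low", "High")

overall = sum(means.values()) / len(means)

# Main effects: level mean minus overall mean (balanced design).
source_effect = {
    s: sum(means[(s, l)] for l in levels) / len(levels) - overall for s in sources
}
level_effect = {
    l: sum(means[(s, l)] for s in sources) / len(sources) - overall for l in levels
}

# Additive prediction and interaction effect (observed minus additive) per treatment.
additive = {
    (s, l): overall + source_effect[s] + level_effect[l]
    for s in sources for l in levels
}
interaction = {key: means[key] - additive[key] for key in means}

# SSInteraction = (replications per cell) * (sum of squared interaction effects).
ss_interaction = n_per_cell * sum(e ** 2 for e in interaction.values())

# Difference in the differences: level effect for beef minus level effect for cereal.
diff_in_diffs = (
    (means[("Beef", "High")] - means[("Beef", "Low")])
    - (means[("Cereal", "High")] - means[("Cereal", "Low")])
)

print("Additive predictions:", additive)
print("Interaction effects:", interaction)
print("SSInteraction:", ss_interaction)
print("Difference in differences:", diff_in_diffs)
```

In a 2x2 design all four interaction effects have the same magnitude, equal to one quarter of the difference in the differences, so the two summaries carry the same information.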

  1. Which of the following statements are appropriate descriptions of the statistical interaction in this sample? Select all that apply.
    1. Weight gain increases when rats are fed a high protein diet instead of a low protein diet.
    2. When beef is the source, higher protein levels lead to higher weight gains, whereas when cereal is the source, higher protein levels do not have much impact.
    3. Beef has higher protein levels than cereal, and thus, beef has a positive effect on weight gain.
    4. Beef and cereal sources are very different in terms of weight gain when the protein level is high, but they are similar when the protein level is low.

A histogram depicts the results of simulating 10,000 trials. The horizontal axis is labeled Shuffled difference in means and has markings from negative 40 to 40 in increments of 20. The distribution of the bars is approximately normal. The bars are distributed between the points, negative 30 and 38 on the horizontal axis. The bars at negative 30 and 38 are very short and they almost collide with the horizontal axis. There are 9 bars from negative 26 to 0 which are arranged in an increasing trend and there are 10 bars from 2.5 to 31 which are arranged in a decreasing trend. All values are approximate.

  1. Use the 3S Strategy to test the significance of the interaction based on the difference in the differences statistic. The histogram to the right shows the results of simulating 10,000 trials under the assumption of no interaction between source and level.

Based on the null distribution and difference in the differences for the sample data, what do you conclude?

If there were really ______ (an interaction/no interaction), we’d get a difference in the differences as extreme as or more extreme than our observed statistic about ________ (0.5, 7, or 50)% of the time. This provides ___________ (moderately/very) strong evidence of an interaction between source and level.
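The simulation step of the 3S Strategy can be sketched as a shuffle test on the difference in the differences. Two caveats apply. First, the individual rat weights are not given, so the data below are HYPOTHETICAL values constructed only so that the group means match the table. Second, shuffling every response across all four treatments is a simplification assumed here: it simulates "no effects of any kind", which is stronger than "no interaction", and the textbook's applet may simulate the null differently.

```python
import random

def diff_in_diffs(groups):
    """(Beef/High - Beef/Low) - (Cereal/High - Cereal/Low), using group means."""
    def mean(xs):
        return sum(xs) / len(xs)
    return ((mean(groups["Beef/High"]) - mean(groups["Beef/Low"]))
            - (mean(groups["Cereal/High"]) - mean(groups["Cereal/Low"])))

def shuffle_p_value(groups, n_trials=10_000, seed=1):
    """Two-sided shuffle p-value for the difference in the differences."""
    rng = random.Random(seed)
    labels = list(groups)
    sizes = [len(groups[label]) for label in labels]
    pooled = [y for label in labels for y in groups[label]]
    observed = diff_in_diffs(groups)
    extreme = 0
    for _ in range(n_trials):
        rng.shuffle(pooled)  # re-deal the responses to the four treatments
        shuffled, start = {}, 0
        for label, size in zip(labels, sizes):
            shuffled[label] = pooled[start:start + size]
            start += size
        # Count shuffles at least as extreme as the observed statistic.
        if abs(diff_in_diffs(shuffled)) >= abs(observed) - 1e-9:
            extreme += 1
    return observed, extreme / n_trials

# HYPOTHETICAL data: each group is its table mean plus the same made-up spread.
spread = [-24, -18, -12, -6, -2, 2, 6, 12, 18, 24]
cell_means = {"Beef/Low": 79.2, "Beef/High": 100.0,
              "Cereal/Low": 83.9, "Cereal/High": 85.9}
groups = {label: [m + d for d in spread] for label, m in cell_means.items()}

observed, p_value = shuffle_p_value(groups)
print("Observed difference in differences:", round(observed, 1))
print("Simulated p-value:", p_value)
```

Because the within-group spread is invented, the p-value printed here illustrates the mechanics only and should not be read as the answer for the real data.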

  1. The table below shows all possible pairwise confidence intervals for the four treatments.

| Groups Compared | 95% Confidence Interval |
|---|---|
| Beef/High – Beef/Low | (7.23, 34.36) |
| Beef/High – Cereal/Low | (2.53, 29.66) |
| Beef/High – Cereal/High | (0.54, 27.66) |
| Cereal/High – Beef/Low | (-6.86, 20.26) |
| Cereal/Low – Beef/Low | (-8.86, 18.26) |
| Cereal/High – Cereal/Low | (-11.56, 15.56) |

What conclusion can you draw from the table? Note that the term “significant” in these statements refers to statistical significance, not practical importance.

    1. Beef/High is significantly different from every other treatment in this study.
    2. Cereal/Low is significantly different from every other treatment in this study.
    3. Beef/High and Beef/Low are the only two treatments that are significantly different from each other.
    4. Cereal/High and Cereal/Low are the only two treatments that are significantly different from each other.

Questions 7 through 11: In a balanced, full-factorial experiment, 60 volunteers were randomly assigned to receive one of three pain medications (A, B, C) at one of two dosages (high, low). After an hour, their response to pain was recorded on a scale of 0 to 20 (higher numbers indicate more severe pain).

  1. Suppose there were no interaction between pain medication and dosage. How would you complete the table of means below?

| | Drug A | Drug B | Drug C |
|---|---|---|---|
| Low Dose | 10.4 | 12.2 | 8.8 |
| High Dose | 5.3 | ? | ? |
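With no interaction, the dose effect is the same for every drug, so the Drug A column fixes the shift that applies to the other two columns. A minimal sketch of that reasoning, with illustrative variable names:

```python
# Low-dose means for each drug and the one known high-dose mean, from the table.
low_dose = {"A": 10.4, "B": 12.2, "C": 8.8}
high_dose_a = 5.3

# Under no interaction, moving from low to high dose shifts every drug by the same amount.
dose_shift = high_dose_a - low_dose["A"]

# Apply the common shift to each drug's low-dose mean.
high_dose = {drug: mean + dose_shift for drug, mean in low_dose.items()}

print("Dose shift:", round(dose_shift, 1))
print("High-dose means with no interaction:",
      {drug: round(mean, 1) for drug, mean in high_dose.items()})
```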

  1. Which of the following statements are appropriate descriptions of the statistical interaction in this sample?

An interaction plot describes the interaction between drug and dosage level on the mean pain response. The horizontal axis is labeled Drug and has the following markings: A, B, and C, in order from left to right. The vertical axis is labeled Mean pain response and has markings from 2 to 14 in increments of 2. A blue line denotes high and a red line denotes low. The dots plotted are as follows: For A: a blue dot is at 5.3 and a red dot is at 10.2. For B: a blue dot is at 5.95 and a red dot is at 12.1. For C: a blue dot is at 6.8 and a red dot is at 8.6. A blue line increasing upward to the right connects the blue dots and a red line increasing upward to the right and then decreasing to the right connects the red dots. All values are approximate.

    1. The mean pain response for low dosages is always higher than the mean pain response for high dosages.
    2. Drug A and Drug B are usually prescribed at low dosages. Drug C is prescribed at a high dosage nearly half the time.
    3. The dosage effect is not the same for all three drugs. The dosage effect for Drug C is much smaller than for Drug A or Drug B.
    4. All of the statements above are appropriate descriptions of the statistical interaction.
  1. The researchers decided to use a two-variable model with an interaction.

True or False: The predicted pain responses based on this model will be equal to the observed treatment means for each of the six drug-dose combinations.

  1. The partially filled in ANOVA table below corresponds to the two factor model with interaction.

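| Source | DF | SS | MS | F | p-value |
|---|---|---|---|---|---|
| Drug | | 21.70 | 10.85 | 2.27 | 0.1135 |
| Dose | | 281.67 | 281.67 | 58.82 | <0.0001 |
| Interaction | | 50.63 | | | 0.008 |
| Error | | 258.60 | | | |
| Total | | 612.60 | | | |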

Calculate the F-statistic for the interaction.

Sol:
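The missing degrees of freedom follow from the design (60 volunteers, three drugs, two dosages), and the F-statistic for the interaction is then MSInteraction divided by MSError. The sketch below walks through that calculation with illustrative variable names.

```python
# Design facts from the problem statement.
n_total, n_drugs, n_doses = 60, 3, 2

# Degrees of freedom for a two-factor model with interaction.
df_drug = n_drugs - 1
df_dose = n_doses - 1
df_interaction = df_drug * df_dose
df_total = n_total - 1
df_error = df_total - df_drug - df_dose - df_interaction

# Sums of squares copied from the ANOVA table.
ss_interaction, ss_error = 50.63, 258.60

# Mean squares and the F-statistic for the interaction.
ms_interaction = ss_interaction / df_interaction
ms_error = ss_error / df_error
f_interaction = ms_interaction / ms_error

print("DF (drug, dose, interaction, error, total):",
      df_drug, df_dose, df_interaction, df_error, df_total)
print("F for interaction:", round(f_interaction, 2))
```

As a consistency check, the same MSError reproduces the F-statistics already printed in the table for Drug (10.85 / MSError) and Dose (281.67 / MSError).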

  1. The p-value for testing the interaction between drug and dosage is 0.008. What do you conclude? You may assume .

This study provides ___________ (strong/weak) evidence that there ________ (is/is not) an interaction between drug and dosage.

  1. You plan to use a theory-based p-value to test the interaction in a 2x2 full-factorial design. Does the residual plot below indicate a violation of the validity conditions?

A dotplot describes the residual plots for two factor models. The dotplot has the horizontal axis labeled Fitted values and has markings 5 to 20 in increments of 5. The vertical axis is labeled Residual and has markings from negative 4 to 4 in increments of 2. For fitted value 9, the dots are plotted as follows: 1 dot above negative 3.1, negative 2.1, 0.9, 2 and 3 on the vertical axis. For fitted value 13, the dots are plotted as follows: 1 dot above negative 3, negative 1, 0, 1, and 3. For fitted value 13.5, the dots are plotted as follows: 1 dot above negative 2.6, negative 1.6, 0.4, 1.4, and 2.5. For fitted value 16.25, the dots are plotted as follows: 1 dot above negative 3.5, negative 2.6, negative 0.5, 1.5 and 3.9. A highlighted horizontal line starts from 0 on the vertical axis and extends toward the right passing through the dot at the fitted value 0 on the vertical axis. All values are approximate. 

    1. Yes, because the residuals are centered at 0, so the independence condition is violated.
    2. Yes, because data points appear in four stacks, so the normality condition is violated.
    3. Yes, because the points are not evenly distributed along the x-axis, so the equal variance condition is violated.
    4. No, this graph does not indicate any major violation of the validity conditions.
  1. Below are two interaction plots for two factors (A and B) and a quantitative response variable.

"Two interaction plots for two factors A and B. The first plot is titled, Graph 1 with the horizontal axis labeled factor B has the following markings, 0 and 1, in the order from left to right. The vertical axis is labeled Mean and has markings from 20 to 50 in increments of 5. A blue line denotes 0 and a dashed red line denotes 1. The dots plotted are as follows: For 0: a blue dot is at 30 and a red dot is at 40. For 1: a blue dot is at 20 and a red dot is at 50. A blue line decreasing to the right connect the blue dots and a dashed red line increasing upward to the right connect the red dots. All values are approximate.
The second plot is titled, Graph 2 with the horizontal axis labeled factor B has the following markings, 0 and 1, in the order from left to right. The vertical axis is labeled Mean and has markings from 20 to 50 in increments of 5. A blue line denotes 0 and a dashed red line denotes 1. The dots plotted are as follows: For 0: a red dot is at 40 and a blue dot is at 45. For 1: a blue dot is at 44.7 and a red dot is at 50. A blue line decreasing to the right connect the blue dots and a dashed red line increasing upward to the right connect the red dots. The blue and the dashed red line intersect each other at the point, 44.9 on the vertical axis. All values are approximate."

Assuming the sample sizes are the same, which graph corresponds to a smaller p-value for the interaction?

    1. Graph 1 would correspond to a smaller p-value for the interaction.
    2. Graph 2 would correspond to a smaller p-value for the interaction.
    3. Both samples would result in the same p-value for the interaction.
    4. These graphs do not provide enough information to decide which sample would correspond to a smaller p-value for the interaction.
  1. A large field was divided into 24 equal-sized plots to be planted with the same number of potato plants. Each plot was randomly assigned a type of fertilizer (A, B, C) and a manure level (high, low) based on a balanced, full-factorial design.

True or False: By using a balanced design, the researchers have ensured that there is no interaction between the type of fertilizer and the level of manure.

  1. True or False: When there is a substantial interaction between Factor A and Factor B in a two-factor design, the researcher should interpret the main effects of Factor A and Factor B separately.

Section 3.3: Replication

LO3.3-1: Explain the benefits and challenges of replication.

LO3.3-2: Define and describe advantages of a generalized block design.

LO3.3-3: Define and describe advantages of within-blocks factorial designs.

Questions 1 through 3: Concentration is a one-person memory game in which cards are laid face down on a surface and two cards are flipped face up at a time. The object of the game is to turn over matching pairs of cards.

An online version of this game includes three different sets of cards: one has images of animals on the cards, one has images of babies, and one has images of holiday scenes. Are these three variations equally difficult? To investigate, eight students tried all three versions of the game in random order. They recorded the amount of time (in seconds) it took to complete the game.

  1. How would you describe this design?
    1. Randomized complete block design with repeated measures
    2. Generalized block design with replication
    3. Full factorial design with independent groups
    4. Within-block factorial design with two experimental factors
  2. Suppose you want to test for a person-version interaction as part of the analysis. Fill in the degrees of freedom in the ANOVA table below.

| Source | DF | SS | MS | F |
|---|---|---|---|---|
| Version | ? | | | |
| Person | ? | | | |
| Version × Person | ? | | | |
| Error | ? | | | |
| Total | ? | | | |

  1. True or False: In this scenario, the person-version interaction is completely confounded with error, so the only way to carry out statistical tests of the version effect and the person effect is to exclude the interaction term from the model.

Questions 4 and 5: A researcher is interested in how distractions affect students as they work on mathematical tasks, so she plans to assign 150 students to work under different conditions (with or without distractions) for a fixed period of time, recording the number of math problems that each student answers correctly. The study will be carried out over the course of a week, and the researcher worries that student motivation may vary day to day, so she decides to use day as a blocking variable. Each day (Monday through Friday), she will assign 15 students to work with distractions and 15 to work without distractions.

  1. Suppose the researcher wants to test for a condition-day interaction as part of the analysis. Fill in the degrees of freedom in the ANOVA table below.

| Source | DF | SS | MS | F |
|---|---|---|---|---|
| Distraction Condition | ? | | | |
| Day | ? | | | |
| Condition × Day | ? | | | |
| Error | ? | | | |
| Total | ? | | | |
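The degrees-of-freedom bookkeeping for the two designs in this section (the Concentration game and the distraction study) can be sketched with one small helper. The function name `anova_df` and its arguments are hypothetical; the formulas are the standard ones for a two-variable model with an interaction term and an equal number of replications in every cell.

```python
def anova_df(levels_a, levels_b, reps):
    """Degrees of freedom for a two-variable model with interaction.

    levels_a, levels_b: number of levels of each variable (factor or block).
    reps: number of observations in each combination of levels (balanced design).
    """
    n = levels_a * levels_b * reps
    df_a = levels_a - 1
    df_b = levels_b - 1
    df_ab = df_a * df_b
    df_total = n - 1
    # Whatever is left over goes to error; this equals levels_a * levels_b * (reps - 1).
    df_error = df_total - df_a - df_b - df_ab
    return {"A": df_a, "B": df_b, "A x B": df_ab, "Error": df_error, "Total": df_total}

# Concentration game: 3 versions, 8 persons, each person plays each version once.
concentration = anova_df(levels_a=3, levels_b=8, reps=1)

# Distraction study: 2 conditions, 5 days, 15 students per condition per day.
distraction = anova_df(levels_a=2, levels_b=5, reps=15)

print("Concentration game:", concentration)
print("Distraction study:", distraction)
```

When `reps` is 1 the error line has zero degrees of freedom, which is why an interaction cannot be separated from random error without replication.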

  1. True or False: In this scenario, the condition-day interaction is completely confounded with error, so the only way to carry out statistical tests of the condition effect and the day effect is to exclude the interaction term from the model.
  2. Researchers used a paired design to investigate whether cell phone use impairs drivers’ reaction times. 64 students participated in a simulation of driving situations, pressing a brake button as soon as they saw a red light. A device recorded their reaction times (in milliseconds). Each student completed the simulation under two different conditions: once while talking on a cell phone and once while listening to music.

The study design described above requires researchers to assume that the effect of the driving condition (cell phone or music) is the same for every student. How could you modify the study design in order to estimate and test for a statistically significant interaction?

    1. Increase the sample size by recruiting more students to participate.
    2. Ask each student to complete the simulation more than once under each condition.
    3. Add a control condition where students are not distracted by cell phones or music.
    4. Collect data on a potential confounding variable (how confident students feel as drivers, how much sleep they got last night, etc.).
  1. Which of the following is the best description of replication?
    1. Replication means there are at least 20 observational units in the study (or more if the data distribution is not symmetrical).
    2. Replication means the study employs random assignment and a placebo control group for comparison.
    3. Replication means each observational unit is measured more than once under different experimental conditions.
    4. Replication means each set of experimental conditions (or factor-block combination) occurs more than once in the design.
  2. Which of the following is an advantage of replication in a study design?
    1. Replication reduces unexplained variation within groups, which leads to more powerful study designs.
    2. Replication makes more efficient use of resources, which reduces the cost and/or time required to complete a study.
    3. Replication makes it possible to estimate treatment and interaction effects separately from random error.
    4. Replication ensures that there is no association between experimental factors, which prevents confounding.

Questions 9 and 10: Ten male and ten female personnel officers were shown a front view photograph of a job applicant’s face and asked to rate the likely job success of the applicant on a scale of 0 to 20. Half of the officers of each gender were chosen at random to receive a version of the photograph in which the applicant made eye contact with the camera.

  1. How would you describe this design?
    1. Randomized complete block design with repeated measures
    2. Generalized block design with replication
    3. Full factorial design with independent groups
    4. Within-block factorial design with two experimental factors
  2. The plot below shows the mean success rating for each combination of gender and eye contact. Is there a substantial interaction between gender and eye contact in this sample?

An interaction plot describes the interaction between gender and eye contact. The horizontal axis has the following markings: No and Yes, in order from left to right. The vertical axis is labeled Mean success rating and has markings from 5.0 to 20.0 in increments of 2.5. A red line denotes female and a blue line denotes male. The dots plotted are as follows: For no: a blue dot is at 9 and a red dot is at 13.0. For yes: a blue dot is at 12.5 and a red dot is at 17. A blue line increasing upward to the right connects the blue dots and a red line increasing upward to the right connects the red dots. All values are approximate.

    1. Yes, because the means for all four combinations of gender and eye contact are different from each other.
    2. Yes, because female officers tend to give higher ratings and officers who see the photo with eye contact tend to give higher ratings.
    3. No, because the effect of eye contact is roughly the same regardless of officer gender.
    4. No, because male and female officers are equally likely to see a photo with eye contact, since this is a balanced design.

Questions 11 through 14: A waitress works part-time at two different restaurants: one is a casual deli and the other is a more upscale Italian restaurant. She decides to conduct an experiment to investigate whether having a conversation with her customers or writing “Thank you!” on the check will affect her tip percentages. At each restaurant, she will assign 32 tables of customers to one of four treatments: conversation and message on check, no conversation and message on check, conversation and no message on check, or no conversation and no message on check.

  1. Identify the components of this study.

Observational unit(s): ________

Blocking variable(s): ________

Experimental factor(s): ________

Response variable(s): ________

A. Conversation

B. Message on check

C. Restaurant

D. Tables of customers

E. Tip percentage

  1. The graphs below show the interaction between conversation and message for each of the two restaurants. Do these graphs indicate a three-way interaction between restaurant, conversation, and message in this sample?

An interaction plot describes the interaction between conversation and message at an Italian restaurant on the mean tip percentage. The horizontal axis has the following markings: No and Yes, in order from left to right. The vertical axis is labeled Mean tip percentage and has markings from 12.5 to 27.5 in increments of 2.5. A blue line denotes yes and a red line denotes no. The dots plotted are as follows: For no: a red dot is at 17.6 and a blue dot is at 22.0. For yes: a red dot is at 18.0 and a blue dot is at 24.5. A red line increasing to the right connects the red dots and a blue line increasing upward to the right connects the blue dots. All values are approximate.

An interaction plot describes the interaction between conversation and message at a deli restaurant on the mean tip percentage. The horizontal axis is labeled Message and has the following markings: No and Yes, in order from left to right. The vertical axis is labeled Mean tip percentage and has markings from 12.5 to 27.5 in increments of 2.5. A blue line denotes yes and a red line denotes no. The dots plotted are as follows: For no: a red dot is at 15.5 and a blue dot is at 17.5. For yes: a red dot is at 16.25 and a blue dot is at 21.25. A red line increasing upward to the right connects the red dots and a blue line increasing upward to the right connects the blue dots. All values are approximate.

    1. Yes, because although the shapes of the graphs are similar, the tips at the Italian restaurant tend to be higher than tips at the deli.
    2. Yes, because having a conversation with the customers and writing a message on the check tend to have a positive effect on the tip percentage at both restaurants.
    3. No, the message effect is larger when the waitress has a conversation with the customers, and this interaction is roughly the same at both restaurants.
    4. No, because the lines do not intersect on either of these graphs, and we are not given a graph that shows all three variables.
  1. The p-value for the conversation-message interaction is 0.0273. After adjusting for restaurant, does this data provide sufficient evidence of an interaction between conversation and message? You may use .
    1. No, at the significance level, this study provides insufficient evidence of an interaction between conversation and message.
    2. Yes, at the significance level, this study provides sufficient evidence of an interaction between conversation and message.
    3. It is not appropriate to interpret the theory-based p-value for testing interaction, because the study design lacks replication.
    4. It is not appropriate to interpret the theory-based p-value for testing interaction, because the customers were not randomly assigned to restaurants.
  2. The p-value for the conversation-message interaction is 0.0273. Is it appropriate to interpret the main effects of conversation and message separately?
    1. Yes, when the p-value for an interaction is small, it is appropriate to interpret the main effects separately.
    2. No, when the p-value for an interaction is small, it is not appropriate to interpret the main effects separately.
    3. It depends on the p-value for the three-way interaction. As long as the p-value for the three-way interaction is small, it is appropriate to interpret the main effect separately.
    4. It depends on the p-value for the three-way interaction. As long as the p-value for the three-way interaction is large, it is appropriate to interpret the main effect separately.
  3. True or False: A matched pairs design is a special case of a within-blocks factorial design with one binary factor.

Section 3.4: Interactions in Observational Studies

LO3.4-1: Interpret interactions with observational data.

LO3.4-2: Sketch and interpret interaction plots.

LO3.4-3: Run a two-variable ANOVA including an interaction term for an observational study with 2 two-level variables.

Questions 1 through 6: In 2018, a sample of 628 academic faculty from universities across the country were surveyed about their salaries (in US dollars). The results were classified according to each faculty member’s academic rank (instructor, assistant professor, associate professor, and full professor) and gender (male, female).

  1. The plot below shows the interaction between rank and gender. Which of the following are statements about main effects and which are statements about interaction?

An interaction plot describes the interaction between rank and gender. The horizontal axis has the following markings: Female and Male, in order from left to right. The vertical axis is labeled Mean salary (in 1000 dollars) and has markings from 40 to 180 in increments of 20. A red line denotes Instructor, a green line denotes Assistant, a blue line denotes Associate, and a brown line denotes Professor. The dots plotted are as follows: For female: a red dot is at 60, a green dot is at 87, a blue dot is at 100, and a brown dot is at 130. For male: a red dot is at 62, a green dot is at 92, a blue dot is at 100, and a brown dot is at 158. A red line increasing to the right connects the red dots, a green line increasing to the right connects the green dots, a horizontal blue line connects the blue dots, and a brown line increasing upward to the right connects the brown dots. All values are approximate.

Statements about main effects:

Statements about interaction:

    1. At each academic rank, male faculty earn higher salaries than female faculty, on average.
    2. The salary “gap” between male and female faculty is highest at the academic rank of professor.
    3. Gender modifies the association between rank and salary.
  1. Which of the following is the appropriate null hypothesis for testing the statistical significance of the interaction between rank and gender?
    1. There is no interaction between rank and gender on salaries in this population.
    2. There is an interaction between rank and gender on salaries in this population.
    3. Neither rank nor gender has an interaction with salary in this population.
    4. Both rank and gender have an interaction with salary in this population.
  2. The partially filled-in ANOVA table below can be used to test the statistical significance of the interaction between rank and gender. Calculate the F-statistic for the interaction.

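| Source | DF | SS | MS | F | p-value |
|---|---|---|---|---|---|
| Rank | | 390,047 | | | <0.0001 |
| Gender | | 11,767 | | | 0.0065 |
| Rank × Gender | | 13,133 | | | 0.0404 |
| Error | | 976,624 | | | |
| Total | | 1,674,691 | | | |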

Sol:
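
One way to check this calculation is sketched below. It is not the official solution. The degrees of freedom are not printed in the table, so the code assumes the usual counts for a 4-level rank variable, a 2-level gender variable, and n = 628 faculty with every rank-gender combination represented. The variable names are illustrative.

```python
# Sketch: F-statistic for the rank-by-gender interaction.
# Assumption: degrees of freedom follow from 4 ranks, 2 genders, n = 628,
# because the DF column of the ANOVA table is not filled in.
n = 628
rank_levels, gender_levels = 4, 2

ss_interaction = 13_133
ss_error = 976_624

df_interaction = (rank_levels - 1) * (gender_levels - 1)  # (4 - 1)(2 - 1)
df_error = n - rank_levels * gender_levels                # n minus number of groups

ms_interaction = ss_interaction / df_interaction
ms_error = ss_error / df_error

# F is the ratio of the interaction mean square to the error mean square.
f_interaction = ms_interaction / ms_error
print(f"F = {f_interaction:.2f} on ({df_interaction}, {df_error}) df")
```

Under these assumptions the script prints `F = 2.78 on (3, 620) df`.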

  1. What would happen if the rank-gender interaction term were removed from the model?

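| Source | DF | SS | MS | F | p-value |
|---|---|---|---|---|---|
| Rank | | 390,047 | | | <0.0001 |
| Gender | | 11,767 | | | 0.0065 |
| Rank × Gender | | 13,133 | | | 0.0404 |
| Error | | 976,624 | | | |
| Total | | 1,674,691 | | | |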

    1. SSTotal would increase to 1,674,691 + 13,133 = 1,687,824.
    2. SSTotal would increase, but we can’t say what the new value would be, because of potential covariation between the interaction and the main effects.
    3. SSError would increase to 976,624 + 13,133 = 989,757.
    4. SSError would increase, but we can’t say what the new value would be, because of potential covariation between the interaction and the main effects.
  1. The table below shows 95% confidence intervals for the difference in mean salaries for male and female faculty members at each academic rank.

| Male – Female | 95% confidence interval |
|---|---|
| Instructor | (-13955, 23165) |
| Assistant professor | (-6734, 17991) |
| Associate professor | (-10345, 16890) |
| Professor | (14101, 40181) |

At which academic rank(s) are the differences in mean salaries for male and female faculty members statistically significant? Select all that apply. Note: This question refers to statistical significance not practical importance.

    1. Instructor
    2. Assistant professor
    3. Associate professor
    4. Professor
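
The check behind this question can be sketched in a few lines: a 95% confidence interval for a difference in means that excludes zero corresponds to a statistically significant difference at the 5% level. The helper name `excludes_zero` is illustrative only.

```python
# Sketch: flag the ranks whose (male - female) interval excludes zero.
intervals = {
    "Instructor": (-13955, 23165),
    "Assistant professor": (-6734, 17991),
    "Associate professor": (-10345, 16890),
    "Professor": (14101, 40181),
}

def excludes_zero(interval):
    """Return True when both endpoints lie on the same side of zero."""
    low, high = interval
    return low > 0 or high < 0

significant = [rank for rank, ci in intervals.items() if excludes_zero(ci)]
print(significant)
```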
  1. Does the residual plot below indicate any violation of conditions that might cause us to question the validity of theory-based p-values or confidence intervals in this context?

A dotplot shows the residuals plotted against the fitted values. The horizontal axis is labeled Fitted values and ranges from 50 to 200 in increments of 50. The vertical axis is labeled Residuals and has markings from negative 150 to 150 in increments of 50. A horizontal line starts from 0 on the vertical axis and extends to the right along the length of the horizontal axis. The dots are plotted vertically at certain values on the horizontal axis. For fitted value 60, a series of overlapping dots are plotted from negative 25 to 25. For fitted value 62.5, a series of overlapping dots are plotted from negative 37.5 to 37.5. For fitted value 87.5, a series of overlapping dots are plotted from negative 25 to 27. For fitted value 90, a series of overlapping dots are plotted from negative 30 to 30. For fitted value 105, a series of overlapping dots are plotted from negative 45 to 48. For fitted value 106, a series of overlapping dots are plotted from negative 65 to 75. For fitted value 130, a series of overlapping dots are plotted from negative 98 to 98. For fitted value 160, a series of overlapping dots are plotted from negative 130 to 155. All values are approximate.

    1. Yes, the independence condition is violated.
    2. Yes, the equal variance condition is violated.
    3. Yes, the normality condition is violated.
    4. No, this graph does not indicate any major violation of the validity conditions.

Questions 7 through 11: Students in a statistics class wanted to know how customers rate one of their favorite coffee shops. The students administered surveys and asked customers to rate their experience in the shop on a scale of 1-10. They also asked whether the survey respondent was a college student and how often they visit coffee shops (at least once a week or less than once a week).

  1. How would you describe this design?
    1. Observational study with two explanatory variables
    2. Observational study with four explanatory variables
    3. Randomized complete block design with repeated measures
    4. Generalized block design with replication
  2. What can you say about the relationships among the variables based on the plot below? Select all that apply.

An interaction plot describes the interaction between frequency and student. The horizontal axis has the following markings: No and Yes, in order from left to right. The vertical axis is labeled Mean rating and has markings from 7.0 to 10.0 in increments of 0.5. A red line denotes At least once per week and a blue line denotes Less than once per week. The dots plotted are as follows: For no: a blue dot is at 7.8 and a red dot is at 8.6. For yes: a blue dot is at 8.25 and a red dot is at 9.1. A blue line increasing upward to the right connects the blue dots and a red line increasing upward to the right connects the red dots. All values are approximate.

    1. There is a substantial association between student and frequency.
    2. There is a substantial interaction between student and frequency.
    3. The student variable has a main effect on ratings.
    4. The frequency variable has a main effect on ratings.
  1. Fill in the blank with a numerical value:

In an ANOVA table, the interaction term would have ___ degree(s) of freedom.

  1. Using the adjusted sums of squares below, calculate R² for this model.

| Source | Adjusted SS |
|---|---|
| Frequency | 6.954 |
| Student | 1.879 |
| Frequency × Student | 0.005 |
| Error | 128.290 |
| Total | 141.556 |

Sol:
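
A hedged sketch of the arithmetic, not the official solution: because the adjusted sums of squares for the model terms and error add up to 137.128 rather than the listed total of 141.556, the code uses the form R² = 1 − SSError/SSTotal instead of summing the model terms. Variable names are illustrative.

```python
# Sketch: R-squared from the error and total sums of squares.
ss_error = 128.290
ss_total = 141.556

# Proportion of total variation in ratings not left in the residuals.
r_squared = 1 - ss_error / ss_total
print(f"R^2 = {r_squared:.4f}")
```

This prints `R^2 = 0.0937`, so under this calculation the model explains roughly 9% of the variation in ratings.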

  1. Suppose the interaction term were removed from the model. How would the R² value change?

| Source | Adjusted SS |
|---|---|
| Frequency | 6.954 |
| Student | 1.879 |
| Frequency × Student | 0.005 |
| Error | 128.290 |
| Total | 141.556 |

    1. Removing the interaction term would cause R² to increase dramatically.
    2. Removing the interaction term would cause R² to decrease dramatically.
    3. Removing the interaction term would cause R² to increase slightly.
    4. Removing the interaction term would cause R² to decrease slightly.
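
The size of the change can be sketched numerically. The code rests on one assumption: the adjusted sum of squares for the interaction is the amount by which SSError grows when that term is dropped from the model, while SSTotal stays fixed. Variable names are illustrative.

```python
# Sketch: R-squared with and without the interaction term.
# Assumption: dropping the interaction moves its adjusted SS into SSError.
ss_total = 141.556
ss_error_full = 128.290
ss_interaction_adj = 0.005

ss_error_reduced = ss_error_full + ss_interaction_adj

r2_full = 1 - ss_error_full / ss_total
r2_reduced = 1 - ss_error_reduced / ss_total
change = r2_reduced - r2_full

print(f"full: {r2_full:.5f}, reduced: {r2_reduced:.5f}, change: {change:.6f}")
```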

Questions 12 and 13: In an observational study of rheumatoid arthritis, researchers recorded several indicators of disease activity as well as information about treatment. The table below shows the mean CDAI value (an indicator of disease activity) for patients receiving different types of treatments.

| N | Steroids | Biologics | Mean CDAI |
|---|---|---|---|
| 71 | No | No | 10.51 |
| 76 | No | Yes | 10.78 |
| 31 | Yes | No | 15.94 |
| 28 | Yes | Yes | 22.82 |

  1. Does this table of means suggest a substantial interaction between treatment with steroids and treatment with biologics?
    1. Yes, because the sample sizes for the four groups are all different.
    2. Yes, because treatment with steroids and treatment with biologics are both associated with higher CDAI values.
    3. Yes, because the effect of biologics is much larger for patients who are taking steroid treatment than for those who are not taking steroid treatment.
    4. No, because 64.5% of patients who take steroids take biologics and 62.2% of patients who don’t take steroids take biologics, so the relationship is weak.
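
A quick way to look for an interaction in a 2×2 table of means is a difference of differences, sketched here with the four group means from the table. The dictionary layout and function name are illustrative, and the result describes this sample only; it says nothing about statistical significance or causation.

```python
# Sketch: difference of differences for the 2x2 table of mean CDAI values.
# Keys are (steroids, biologics).
mean_cdai = {
    ("No", "No"): 10.51,
    ("No", "Yes"): 10.78,
    ("Yes", "No"): 15.94,
    ("Yes", "Yes"): 22.82,
}

def biologics_difference(steroids):
    """Mean CDAI with biologics minus without, at a given steroid status."""
    return mean_cdai[(steroids, "Yes")] - mean_cdai[(steroids, "No")]

diff_without_steroids = biologics_difference("No")
diff_with_steroids = biologics_difference("Yes")

# A value far from zero suggests the biologics difference depends on steroid use.
diff_in_diff = diff_with_steroids - diff_without_steroids

print(f"{diff_without_steroids:.2f}, {diff_with_steroids:.2f}, {diff_in_diff:.2f}")
```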
  2. Fill in the blanks:

It is plausible that patients who are being treated more aggressively had more severe symptoms before treatment began; thus the higher CDAI scores among patients taking steroids and biologics could be due to _________ (confounding/covariation). In order to draw causal conclusions, we would need a study that involves random __________ (selection/assignment).

  1. Suppose a study is done where the response variable is fuel efficiency (miles per gallon) and the two explanatory variables are horsepower (low, medium, high) and weight (heavy, light). A table with two of the interaction effects filled in is given below.

Low Horsepower

Medium Horsepower

High Horsepower

Heavy

0.23

2.97

Light

Fill in the missing interaction effects.
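
The fill-in logic can be sketched as follows, under two loudly stated assumptions. First, the interaction effects sum to zero across every row and every column, as in the usual effects model for a balanced design. Second, the two given values sit in the Heavy row under Low and Medium horsepower; the cell positions are not recoverable from the table as shown here, so this placement is hypothetical. The function name is illustrative.

```python
# Sketch: complete a table of interaction effects using zero row/column sums.
def fill_interaction_effects(grid):
    """Fill None cells so that every row and every column sums to zero."""
    grid = [row[:] for row in grid]  # work on a copy
    changed = True
    while changed:
        changed = False
        # Any row with exactly one missing cell is determined by the others.
        for row in grid:
            if row.count(None) == 1:
                j = row.index(None)
                row[j] = -sum(v for v in row if v is not None)
                changed = True
        # Likewise for any column with exactly one missing cell.
        for j in range(len(grid[0])):
            col = [row[j] for row in grid]
            if col.count(None) == 1:
                i = col.index(None)
                grid[i][j] = -sum(v for v in col if v is not None)
                changed = True
    return grid

# Rows: Heavy, Light. Columns: Low, Medium, High horsepower.
# ASSUMPTION: 0.23 and 2.97 are the Heavy/Low and Heavy/Medium cells.
partial = [
    [0.23, 2.97, None],
    [None, None, None],
]
filled = fill_interaction_effects(partial)
for label, row in zip(["Heavy", "Light"], filled):
    print(label, [round(v, 2) for v in row])
```

If the given values belong to different cells, only the `partial` grid needs to change; the same zero-sum reasoning applies.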

  1. A real estate agent collected information on 100 recent home sales in their town. In addition to selling prices (in $1000s), the agent recorded information about the home’s location (north side of town or south side of town) and the number of bedrooms (at least 3 or less than 3). Match the terms with their descriptions in this context.

Covariation: A. Houses on the south side of town are more likely to have at least three bedrooms than houses on the north side.

Association: B. The relationship between number of bedrooms and price changes based on the home’s location.

Interaction: C. Some of the variability in prices cannot be attributed exclusively to one source (location, bedrooms, or interaction).

Document Information

Document Type:
DOCX
Chapter Number:
3
Created Date:
Aug 21, 2025
Chapter Name:
Chapter 3 Intermediate Statistical Investigations Test Bank
Author:
Nathan Tintle
