Preliminaries

Intermediate Statistical Investigations Test Bank

Question types: FIB = Fill in the blank; Calc = Calculation; Ma = Matching; MS = Multiple select; MC = Multiple choice; TF = True-false

PRELIMINARIES TERMINAL LEARNING OUTCOMES

TLOP-A: Explore how the relationship between two variables can be impacted by additional variables.

TLOP-B: Express the relationship between two variables using a statistical model, and calculate and report the standard error of the residuals, a measure of prediction error.

PRELIMINARIES ENABLING LEARNING OUTCOMES

LOP-1: Identify and apply basic terminology of statistical studies: observational units, response variable, explanatory variable, association, confounding variable.

LOP-2: Identify potential sources and measures of variation in a response variable.

LOP-3: Produce and describe some basic visualizations and numerical summaries to compare groups and explore relationships (e.g., bar graphs, dotplots/histograms/boxplots, scatterplots, means, medians, standard deviation).

LOP-4: Explore how those comparisons and relationships can be impacted by additional variables.

LOP-5: Calculate a residual and relate it to typical prediction error.

Section P.A

Questions 1 and 2: In order for students to participate in remote learning, they need access to a computer or tablet for school work. An advocacy group conducted a survey of U.S. families and analyzed the relationship between household income (less than $25,000, between $25,000 and $50,000, or greater than $50,000) and access to a computer or tablet (no access, sometimes, always).

Access to a computer/tablet is the __________ (explanatory/response) variable.

We can classify this variable as __________ (categorical/quantitative).

Income is the __________ (explanatory/response) variable.

We can classify this variable as __________ (categorical/quantitative).

Which type of graph is most appropriate for displaying the relationship between income and access to a computer/tablet?
1. A mosaic plot
2. A histogram
3. Stacked boxplots (one boxplot for each group)
4. A scatterplot
Which of the hypothetical mosaic plots below shows an association between class (first, second, third, or crew) and survival on the Titanic? Select all that apply.

1. Graph A
2. Graph B
3. Graph C

Questions 4 through 7: Two hospitals, Memorial Hospital and Fairbanks Medical Center, both perform the same procedure to alleviate joint pain. Researchers surveyed a random sample of 100 patients from each hospital who had undergone this procedure and asked whether they had made a full recovery in the six-month period after surgery. The table below shows the results of the survey. The mosaic plot further breaks down the results based on socioeconomic status (SES).

	Recovery
Hospital	Full	No	Total
Fairbanks	60	40	100
Memorial	66	34	100
Total	126	74	200

Based on the table, is there an association between hospital and recovery in this sample? Note: This question refers to the sample, so there is no need to consider whether this data reflects a genuine tendency in the population.
1. There is no association. More than 50% of patients recover, regardless of hospital.
2. There is no association. The recovery rate is not the same at the two hospitals.
3. There is an association. More than 50% of patients recover, regardless of hospital.
4. There is an association. The recovery rate is not the same at the two hospitals.
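One way to check for an association in the sample is to compare the conditional proportion of full recoveries at each hospital. The sketch below uses the counts from the table; the variable names are illustrative and not part of the original question.

```python
# Counts from the table above: full recovery vs. no full recovery, per hospital.
counts = {
    "Fairbanks": {"full": 60, "no": 40},
    "Memorial": {"full": 66, "no": 34},
}

# Conditional proportion of full recovery within each hospital.
recovery_rate = {
    hospital: c["full"] / (c["full"] + c["no"])
    for hospital, c in counts.items()
}

for hospital, rate in recovery_rate.items():
    print(f"{hospital}: {rate:.2f}")

# In the sample, the variables are associated if the conditional proportions differ.
associated_in_sample = recovery_rate["Fairbanks"] != recovery_rate["Memorial"]
print("Association in sample:", associated_in_sample)
```

The comparison is between the two conditional proportions, not between each proportion and 50%.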
Suppose you are primarily interested in the potential association between hospital and recovery. Based on the mosaic plot, does SES satisfy the definition of a confounding variable in this scenario?

_______ (Yes/No), because Memorial Hospital is more likely to serve _______ (high/low) SES patients, compared to Fairbanks, and _______ (high/low) SES patients are more likely to make a full recovery.

Does this scenario satisfy the definition of Simpson’s paradox?

_______ (Yes/No), because overall, the recovery rates are higher at _______ (Memorial/Fairbanks), but after adjusting for SES, the conditional recovery rates are higher at _______ (Memorial/Fairbanks).
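The mosaic plot is not reproduced here, so the sketch below uses hypothetical SES counts, invented only to show how overall and conditional recovery rates can point in opposite directions. The counts are chosen to agree with the table's totals (60 of 100 at Fairbanks, 66 of 100 at Memorial); they are an assumption, not values read from the plot.

```python
# HYPOTHETICAL counts by SES, invented for illustration only.
# Each entry is (full recoveries, patients). The per-hospital totals match
# the table (Fairbanks 60/100, Memorial 66/100), but the SES split is assumed.
by_ses = {
    "Fairbanks": {"high": (16, 20), "low": (44, 80)},
    "Memorial": {"high": (60, 80), "low": (6, 20)},
}

# Overall recovery rate per hospital (ignoring SES).
overall = {}
for hospital, groups in by_ses.items():
    recovered = sum(r for r, _ in groups.values())
    patients = sum(n for _, n in groups.values())
    overall[hospital] = recovered / patients

# Conditional recovery rate per hospital within each SES group.
conditional = {
    hospital: {ses: r / n for ses, (r, n) in groups.items()}
    for hospital, groups in by_ses.items()
}

print("Overall:", overall)
print("Conditional on SES:", conditional)

# Simpson's paradox: the direction of the comparison reverses after adjusting.
reverses = overall["Memorial"] > overall["Fairbanks"] and all(
    conditional["Fairbanks"][ses] > conditional["Memorial"][ses]
    for ses in ("high", "low")
)
print("Direction reverses after adjusting for SES:", reverses)
```

With these assumed counts, Memorial has the higher overall rate while Fairbanks has the higher rate within each SES group, which is the pattern the question asks you to look for in the actual plot.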

Put the components of the study into the correct boxes in the Sources of Variation Diagram. Note: Some boxes will include more than one answer.

Observed variation in: _____
Sources of explained variation: _____
Sources of unexplained variation: _____
Inclusion criteria: _____

Components:
Hospital (Memorial or Fairbanks)
Has undergone medical procedure for joint pain
Lifestyle factors (physicality of occupation, opportunities for rest, etc.)
Recovery (full or no)
Socioeconomic status (low SES or high SES)
Physical therapy (as recommended, less than recommended, none)

Questions 8 through 13: In 2018, a sample of academic faculty from universities across the country were surveyed about their salaries (in US dollars). The results were classified according to each faculty member’s academic rank (instructor, assistant professor, associate professor, and full professor) and gender (male, female).

Identify the observational units and variables. Note: Some blanks will include more than one answer. One of the answer choices will not be used.

Observational units: _____
Explanatory variables: _____
Response variable: _____

A. Academic faculty members
B. Universities
C. Salary (in US dollars)
D. Academic rank
E. Gender

Name the graph types being used to display the distribution of salaries.

1. Bar graph
2. Mosaic plot
3. Histogram
4. Boxplot
5. Scatterplot

Describe the relationship between the mean salary and the median salary.

The mean salary is ______ (greater/less) than the median salary, because the distribution of salaries is skewed ______ (right/left).

True or False: There is an association between gender and salary in this sample.

Use the mosaic plot to match the percentages below to the appropriate statement describing the data.

Statement 1: _____% of the faculty in this sample are male.

Statement 2: _____ % of the faculty who have reached the rank of professor are male.

Statement 3: _____% of female faculty members are at the rank of instructor.

Statement 4: _____% of male faculty members are at the rank of instructor.

7%
22%
68%
81%
Using the tables below, compare the distribution of salaries for male and female faculty members, before and after taking rank into account.

Salary
Gender	Mean	Median	Std Dev	IQR
Female	95670.78	89004.99	36198.22	40259.92
Male	123953.16	107470.67	55281.05	60642.28

Which of the following statements are true? Select all that apply.

1. Before taking rank into account, there is an association between gender and salary in this sample.
2. After taking rank into account, there is an association between gender and salary in this sample.
3. The nature of the relationship between gender and salary changes after adjusting for rank; thus rank satisfies the definition of a confounding variable in this scenario.
4. The direction of the relationship between gender and salary changes after adjusting for rank; thus this scenario satisfies the definition of Simpson’s paradox.
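As a rough check on the unadjusted comparison, the sketch below computes the male minus female gap in mean and median salary from the summary table, and checks whether each group's mean exceeds its median. Only the overall table is available here, so this covers the comparison before taking rank into account; the variable names are illustrative.

```python
# Summary statistics from the salary table above (US dollars).
salary = {
    "female": {"mean": 95670.78, "median": 89004.99},
    "male": {"mean": 123953.16, "median": 107470.67},
}

# Male minus female difference in each measure of center, rounded to cents.
mean_gap = round(salary["male"]["mean"] - salary["female"]["mean"], 2)
median_gap = round(salary["male"]["median"] - salary["female"]["median"], 2)
print("Difference in means:", mean_gap)
print("Difference in medians:", median_gap)

# A mean above the median is consistent with a right-skewed distribution.
mean_above_median = {g: s["mean"] > s["median"] for g, s in salary.items()}
print("Mean above median:", mean_above_median)
```

A nonzero gap indicates an association in the sample before adjusting for rank. Judging the adjusted comparison requires the by-rank summaries, which are not reproduced above.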

Section P.B

Questions 14 through 16: A sample of caregivers was selected to be representative of the U.S. population. If there was more than one child in the caregiver’s household, the child asked about was determined randomly. Each caregiver was asked to estimate how much sleep their child typically gets at night. To predict sleep times, we can use the following statistical model:

Model 1: Predicted sleep = 8.26 hours, standard deviation = 1.33 hours

Interpret the standard deviation.
1. The range of the middle 50% of the data is 1.33 hours.
2. A typical sleep time lies about 1.33 hours from the mean sleep time.
3. 95% of the observations in this sample are in the range 8.26 ± 1.33.
4. 95% of the observations in this sample are higher than 1.33 hours.
One caregiver estimates that their child gets 7 hours of sleep per night. Calculate the residual for this observation.

Solution: 7 – 8.26 = -1.26
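The same calculation as a minimal sketch, using the convention residual = observed value minus predicted value. The function name is illustrative.

```python
def residual(observed, predicted):
    """Residual = observed value minus predicted value."""
    return observed - predicted

# Observed sleep of 7 hours against Model 1's prediction of 8.26 hours.
r = residual(observed=7.0, predicted=8.26)
print(round(r, 2))  # -1.26
```

The negative sign means the child's reported sleep is below the model's prediction.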

Suppose we add a new variable, Age, which explains a substantial amount of the variability in sleep times.

Model 2:

Predicted sleep = 7.63 hours for older children (age 12-17), 8.89 hours for younger children (age 6-11)

SE of the residuals = ?

Since Age explains a substantial amount of the variability in sleep times, we would expect the standard error of the residuals in Model 2 to be _______ (greater than/less than/equal to) the standard deviation in Model 1.
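To see why a useful grouping variable shrinks the typical prediction error, the sketch below compares the two models on a small hypothetical data set. The sleep times are invented for illustration and are not the survey data. Dividing the sum of squared residuals by n minus the number of groups is also an assumption about the convention used.

```python
import math
import statistics

# HYPOTHETICAL sleep times in hours, invented for illustration only.
older = [7.0, 7.5, 8.0, 7.5, 8.0]      # stand-in for ages 12-17
younger = [8.5, 9.0, 9.5, 8.5, 9.0]    # stand-in for ages 6-11
all_times = older + younger

# Single-mean model: every child is predicted to sleep the overall mean,
# so the typical prediction error is the sample standard deviation.
sd_single_mean = statistics.stdev(all_times)

# Group-means model: each child is predicted to sleep their age group's mean.
sse = 0.0
for group in (older, younger):
    group_mean = statistics.mean(group)
    sse += sum((x - group_mean) ** 2 for x in group)

# Assumed convention: divide by n minus the number of group means estimated.
n_groups = 2
se_residuals = math.sqrt(sse / (len(all_times) - n_groups))

print(f"SD (single mean): {sd_single_mean:.3f}")
print(f"SE of residuals (group means): {se_residuals:.3f}")
```

Because the group means sit closer to the observations than the overall mean does, the residuals are smaller and so is their standard error.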

Questions 17 and 18: For a sample of 100 colleges, four different models were used to predict the average salary in the year after graduating. The graphs below show the residuals from these models.

Graph 1 shows the residuals from a model that uses a school’s admittance rate to predict average salary. Graph 2 shows the residuals from a model that uses a school’s total cost per year (out-of-state) to predict average salary. Which explanatory variable do you prefer?
1. Admittance rate (Model 1), because more of the variability in average salaries is explained by the model.
2. Total cost (Model 2), because there is less unexplained variability in average salaries after applying the model.
3. Both models are equally good, because both have a distribution of residuals that is roughly bell-shaped.
4. Neither of these models is useful for predicting average salary, because both have a distribution of residuals that is centered at 0.
Suppose we used a model with two explanatory variables. What would the graph of the residuals look like if both admittance rate and cost were used to predict average salary?
1. Graph 3 is reasonable, because the SD of the residuals is less than 5768.
2. Graph 4 is reasonable, because the SD of the residuals is greater than 6329.
3. Neither Graph 3 nor Graph 4 is reasonable, because the SD of the residuals should be between 5768 and 6329.

Questions 19 and 20: A botanist collected data to measure the variation of Iris flowers of two related species. The data set consists of 50 flowers from two species of Iris – Iris setosa and Iris versicolor. Various features were measured including the length and width of the sepals, in centimeters. (Sepals are a part of the flower.)

The graphs below show the relationship between sepal length and sepal width. Graph A shows the line of best fit for the full sample. Graph B shows a separate line of fit for each species: red represents setosa and blue represents versicolor.

Consider two models for predicting sepal width. Note that Model 2 accounts for Species. Some values are intentionally missing in Model 2.

Model 1: Predicted sepal width = 3.94 – 0.15(Sepal length), SE of residuals = 0.47

Model 2: Predicted sepal width = , SE of residuals =
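Model 1 can be applied directly to obtain a prediction and a residual. The sketch below uses the stated Model 1 coefficients; the example flower (sepal length 5.0 cm, sepal width 3.4 cm) is hypothetical and not taken from the data set. Model 2 is left out because its values are intentionally missing.

```python
def predict_sepal_width(sepal_length_cm):
    """Model 1: predicted sepal width = 3.94 - 0.15 * (sepal length)."""
    return 3.94 - 0.15 * sepal_length_cm

# HYPOTHETICAL flower, invented for illustration only.
length_cm, width_cm = 5.0, 3.4

predicted = predict_sepal_width(length_cm)
resid = width_cm - predicted  # observed minus predicted

print(f"Predicted width: {predicted:.2f} cm")
print(f"Residual: {resid:.2f} cm")

# Compare the size of this residual with Model 1's SE of residuals (0.47).
print("Within one SE of the prediction:", abs(resid) < 0.47)
```

The negative slope in Model 1 means that, ignoring species, longer sepals are predicted to be slightly narrower.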

The letter B is a placeholder for a missing value in Model 2. Would you expect this value to be positive, negative, or close to 0?
1. Positive
2. Negative
3. Close to 0
We would expect the SE of the residuals for Model 2 to be _______ (>, <, =) the SE of the residuals for Model 1.

Document Information

Document Type: DOCX
Chapter Number: All in one
Created Date: Aug 21, 2025
Chapter Name: Preliminaries Test Bank 1e
Author: Nathan Tintle
Connected Book: Intermediate Statistical Investigations 1st Ed - Exam Bank, by Nathan Tintle
