Chapter 10 Two Quantitative Variables Exam Questions - Test Bank + Answers | Statistical Investigations 2e by Nathan Tintle
Chapter 10
Introduction to Statistical Investigations Test Bank
Note: TE = Text entry; TE-N = Text entry - Numeric
Ma = Matching; MS = Multiple select
MC = Multiple choice; TF = True-False
DD = Drop-down
CHAPTER 10 LEARNING OBJECTIVES
10-1: Explore scatterplots and comment on the direction, form, and strength of the association between two quantitative variables.
10-2: Carry out a simulation-based analysis for a correlation coefficient.
10-3: Understand how to find the least squares regression line and what the slope of this line conveys.
10-4: Carry out a simulation-based analysis for a slope coefficient.
10-5: Carry out a theory-based analysis for a slope coefficient.
Section 10.1: Two Quantitative Variables: Scatterplots and Correlation
10.1-1: Recognize that a scatterplot is the appropriate graph for displaying the relationship between two quantitative variables, and create a scatterplot from raw data.
10.1-2: Summarize the characteristics of a scatterplot by describing its direction, form, and strength, as well as whether there are any unusual observations.
10.1-3: Recognize that a correlation coefficient of 0 means that there is no linear association between the two variables and that a correlation coefficient of –1 or 1 means that the scatterplot is exactly a straight line.
10.1-4: Estimate the value of the correlation coefficient within ± 0.3 by looking at a scatterplot.
10.1-5: Recognize that the correlation coefficient is appropriate only for summarizing the strength and direction of a scatterplot that has linear form.
10.1-6: Understand that the correlation coefficient is not resistant to extreme observations.
10.1-7: Recognize how the association between two variables may change when data are split into smaller groups.
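Objectives 10.1-3 and 10.1-6 can be illustrated with a short computation. The sketch below uses made-up data (the arrays `x` and `y` and the added extreme point are hypothetical and do not come from any study in this test bank) and NumPy's `corrcoef` to show how much a single extreme observation can change r.

```python
import numpy as np

# Hypothetical data: eight points that fall close to an increasing line.
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 7.0, 7.9, 9.2])

def correlation(a, b):
    """Pearson correlation coefficient r between two equal-length arrays."""
    return np.corrcoef(a, b)[0, 1]

r_original = correlation(x, y)

# Add one extreme observation far from the pattern (large x, very small y).
x_out = np.append(x, 20.0)
y_out = np.append(y, 1.0)
r_with_outlier = correlation(x_out, y_out)

print(f"r without the extreme point: {r_original:.3f}")
print(f"r with the extreme point:    {r_with_outlier:.3f}")
```

With these made-up values, r is close to 1 for the original eight points and falls to near 0 once the extreme point is included, which is what "not resistant" means.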
Questions 1 through 6: Babies born with low birth weights (less than 2500 grams) are at an increased risk for many infant diseases. Researchers in North Carolina collected data to see what variables may influence the birth weight (in grams) of a child, including whether the mother drank alcohol during pregnancy, whether the mother smoked during pregnancy, the mother’s age (years), the gestation of the pregnancy (number of weeks from conception until birth), the mother’s race, the length of the birth (hours), and several others. The plot below summarizes some of the variables measured.
- For each of the three variables displayed in the plot, state whether they are categorical or quantitative.
Weight:
Gestation:
Smoke:
Weight: quantitative
Gestation: quantitative
Smoke: categorical
LO: 10.1-1; Difficulty: Easy; Type: TE
- Based on the plot, does there appear to be an association between gestation period and birth weight?
- Yes, because as one variable increases, the other tends to increase as well.
- No, because as one variable increases, the other tends to increase as well.
- Yes, because most of the circles on the plot appear to be higher than the triangles.
- No, because one variable does not appear to change the other variable.
- Is whether the mother smoked during pregnancy a confounding variable in describing the relationship between gestation period and birth weight?
- No, because smoking status is not associated with gestation period.
- No, because smoking status is not associated with birth weight.
- Yes, because mothers who smoke tend to have lighter babies, and mothers who smoke also tend to have shorter gestation periods.
- Yes, because mothers who smoke tend to have heavier babies, and mothers who smoke also tend to have longer gestation periods.
- We cannot determine whether smoking status is a confounding variable based on this plot.
- Does the association between gestation period and birthweight appear to depend on smoking status?
- Yes, since the regression line for non-smokers is higher than the regression line for smokers.
- Yes, since the regression line for non-smokers would have a higher y-intercept than the regression line for smokers.
- No, since the regression line for non-smokers and the regression line for smokers have similar slopes.
- No, since the sample sizes of smokers and non-smokers are similar.
- How would the correlation coefficient between birth weight and gestation period computed from all women in the sample compare to the correlation coefficient between birth weight and gestation period computed from only non-smoking women?
- The correlation coefficient for the entire sample would be closer to 1 than the correlation coefficient for the non-smoking group.
- The correlation coefficient for the entire sample would be closer to 0 than the correlation coefficient for the non-smoking group.
- The correlation coefficient for the entire sample would be the same as the correlation coefficient for the non-smoking group.
- We cannot determine how the two correlation coefficients compare based on this plot.
- What type of plot would be appropriate for examining the relationship between a mother’s age and her baby’s birth weight?
- Scatterplot
- Side-by-side boxplots
- Segmented bar graph
- Dotplot
Questions 7 through 10: The following scatterplot displays the finish time (in minutes) and age (in years) for the male racers at the 2018 Strawberry Stampede (a 10k race through Arroyo Grande).
- What is the form of this scatterplot?
- Linear
- Non-linear
- What is the direction of the association between finish time and age?
- Positive
- Negative
- Approximate the value of the correlation coefficient for these data.
- 0
- 0.25
- 0.50
- 0.80
- If a 70-year-old male with a finishing time of 35 minutes were added to the data set, would the correlation coefficient increase, decrease, or remain the same?
- Increase
- Decrease
- Remain the same
- Unable to determine with the information provided
- Which of the following plots has the strongest correlation between the two variables plotted?
A. B. C. D.
- Estimate the value of the correlation coefficient between the two variables shown in the following scatterplot.
- 0.90
- 0.80
- -0.80
- 0.09
- True or False: If the correlation coefficient between variables x and y is equal to zero, then we can say that x and y are not associated.
- The graph below shows a scatter plot of medical expenses in the past year by age for a sample of Americans.
Which one of the following is a true statement about the data shown in the graph?
- The correlation must be close to one because there is a strong relationship between age and medical expenses.
- Using correlation on the data shown in the graph above is not appropriate because the relationship shown in the graph is not linear.
- Both A and B are true statements.
- Neither A nor B is a true statement.
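The idea behind the two preceding items (a correlation of zero does not rule out an association, and r is only appropriate for linear form) can be checked numerically. The sketch below is an illustration with hypothetical data, not the medical expenses data: y is completely determined by x, yet the pattern is a parabola rather than a line.

```python
import numpy as np

# Hypothetical data: a perfect but curved (U-shaped) relationship.
x = np.arange(-5, 6, dtype=float)   # -5, -4, ..., 5, symmetric about 0
y = x ** 2

r = np.corrcoef(x, y)[0, 1]
print(f"r = {r:.3f}")
```

The computed r is zero (up to rounding) even though knowing x tells you y exactly, so r = 0 indicates only that there is no linear association.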
- Which of the following correlation coefficient values describes the weakest linear association between two variables?
- -0.99
- -0.23
- 0.12
- 0.38
Section 10.2: Inference for the Correlation Coefficient: Simulation-Based Approach
10.2-1: Apply the 3S strategy when evaluating the hypothesis of linear association using the correlation coefficient as the statistic.
10.2-2: Articulate how to conduct a tactile simulation to implement the 3S strategy for testing a correlation coefficient.
10.2-3: Define the p-value in the context of the 3S strategy using simulated correlation coefficients under the null hypothesis of no association.
Questions 16 through 20: Are people with bigger brains more intelligent? Forty college students volunteered to participate in a study that examined brain size (measured in thousands of pixels counted in a brain scan) and IQ score (measured in points). A scatterplot of the data is shown below.
- Approximate the value of the correlation coefficient for these data.
- 0.10
- 0.40
- 0.80
- 0
- State the null and alternative hypotheses using proper notation.
versus
versus
versus
versus
- Select the best explanation for how one sample would be simulated in order to generate the null distribution.
- Flip a coin to decide whether to swap the values for brain size and IQ or not. Plot the correlation coefficient of the randomized points on the null distribution.
- Holding the order of brain size values constant, randomize the order of the IQs. Plot the correlation coefficient of the shuffled data on the null distribution.
- Put each pair of (brain size, IQ) on a piece of paper. Draw with replacement 40 times. Plot the correlation coefficient of the resampled data on the null distribution.
- Add or subtract the appropriate value from each brain size and IQ in order to force the null hypothesis to be true. Plot the correlation coefficient of the shifted data on the null distribution.
- Below is a picture of a simulated null distribution of correlation coefficients created using the Corr/Regression applet. How would you use this distribution to calculate the p-value?
- Find the proportion of simulated correlation coefficients greater than zero.
- Find the proportion of simulated correlation coefficients as far away from zero or further than the one observed.
- Find the proportion of simulated correlation coefficients as small or smaller than the one observed.
- Find the proportion of simulated correlation coefficients as large or larger than the one observed.
- The p-value for this test is 0.008. What can we conclude?
- We have strong evidence that an increase in brain size will increase IQ.
- We have strong evidence that an increase in brain size is associated with an increase in IQ.
- We have strong evidence that an increase in IQ will increase brain size.
- We have strong evidence that brain size and IQ are not associated.
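The shuffling procedure described in the items above (hold one variable in place, randomize the order of the other, and record the correlation each time) can be sketched in a few lines. This is a minimal illustration, not the Corr/Regression applet: the function name `shuffle_test_correlation`, the number of shuffles, and the data are all assumptions made for the example, and the data are randomly generated rather than the actual brain size and IQ measurements.

```python
import numpy as np

def shuffle_test_correlation(x, y, n_shuffles=5000, seed=0):
    """Two-sided simulation-based p-value for the correlation coefficient."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r_obs = np.corrcoef(x, y)[0, 1]

    simulated = np.empty(n_shuffles)
    for i in range(n_shuffles):
        # Keep x in its original order and shuffle y, which breaks any association.
        simulated[i] = np.corrcoef(x, rng.permutation(y))[0, 1]

    # Proportion of shuffled correlations at least as far from zero as the observed one.
    p_value = np.mean(np.abs(simulated) >= abs(r_obs))
    return r_obs, simulated, p_value

# Hypothetical data for 40 units (not the real brain size / IQ values).
rng = np.random.default_rng(1)
x = rng.normal(900, 70, size=40)
y = 100 + 0.1 * (x - 900) + rng.normal(0, 15, size=40)

r_obs, null_rs, p_value = shuffle_test_correlation(x, y)
print(f"observed r = {r_obs:.3f}, two-sided p-value = {p_value:.4f}")
```

The sketch counts shuffled correlations in both tails. For a one-sided alternative of positive association, the count would instead use only the shuffled correlations as large or larger than the observed one (`simulated >= r_obs`).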
Questions 21 through 26: Data from gapminder.org on 184 countries were used to examine whether there is an association between (average) female life expectancy (that is, the average lifespan of women in the country) and the average number of children women give birth to for the year 2019. A scatterplot of the data follows.
- What are the observational units?
- Women
- Babies
- Countries
- Years
- Approximate the correlation coefficient for these data.
- State the null and alternative hypotheses using proper notation.
versus
versus
versus
versus
- Select the best explanation for how one sample would be simulated in order to generate the null distribution.
- Holding the average number of children constant, randomize the order of the female life expectancies. Plot the correlation coefficient of the shuffled data on the null distribution.
- Add or subtract the appropriate value from each average number of children and female life expectancy in order to force the null hypothesis to be true. Plot the correlation coefficient of the shifted data on the null distribution.
- Flip a coin to decide whether to swap the values for average number of children and female life expectancy or not. Plot the correlation coefficient of the randomized points on the null distribution.
- Put each pair of (average number of children, female life expectancy) on a piece of paper. Draw with replacement 40 times. Plot the correlation coefficient of the resampled data on the null distribution.
- Below is a picture of a simulated null distribution of correlation coefficients created using the Corr/Regression applet. How would you use this distribution to calculate the p-value?
- Find the proportion of simulated correlation coefficients greater than zero.
- Find the proportion of simulated correlation coefficients as far away from zero or further than the one observed.
- Find the proportion of simulated correlation coefficients as small or smaller than the one observed.
- Find the proportion of simulated correlation coefficients as large or larger than the one observed.
- The p-value for this test is less than 0.001. What can we conclude?
- We have strong evidence that average number of children per woman is associated with female life expectancy.
- We have strong evidence that an increase in average number of children per woman will decrease female life expectancy.
- We have strong evidence that an increase in female life expectancy will decrease the average number of children per woman.
- We have strong evidence that average number of children per woman is not associated with female life expectancy.
Questions 27 through 30: The following scatterplot displays the finish time (in minutes) and age (in years) for the male racers at the 2018 Strawberry Stampede (a 10k race through Arroyo Grande).
Below are the same data for the female racers in this year’s race.
- Do you think the correlation coefficient for the females will be larger than, smaller than, or the same as the correlation coefficient for the males?
- Larger
- Smaller
- Remain the same
- Select the best explanation for how one sample would be simulated in order to generate the null distribution for the females.
- Holding the ages constant, randomize the order of the race finish times. Plot the correlation coefficient of the shuffled data on the null distribution.
- Add or subtract the appropriate value from each age and race finish time in order to force the null hypothesis to be true. Plot the correlation coefficient of the shifted data on the null distribution.
- Flip a coin to decide whether to swap the values for age and race finish time or not. Plot the correlation coefficient of the randomized points on the null distribution.
- Put each pair of (age, race finish time) on a piece of paper. Draw with replacement 40 times. Plot the correlation coefficient of the resampled data on the null distribution.
- What is the null hypothesis for a simulation-based test of the correlation coefficient for the females?
- There is an association between female ages and female race finish times.
- There is no association between female ages and female race finish times.
- The p-value for a simulation-based test of the correlation coefficient for the females is 0.213. True or False: We have evidence that there is no association between female ages and female race finish times.
Section 10.3: Least Squares Regression
10.3-1: Understand that one way a scatterplot can be summarized is by fitting the best-fit (least squares regression) line and interpreting both the slope and intercept of the best-fit line in the context of the two variables on the scatterplot.
10.3-2: Find the predicted value of the response variable for a given value of the explanatory variable.
10.3-3: Understand that slope = 0 means no association, slope < 0 means negative association, and slope > 0 means positive association. Further, the sign of the slope will be the same as the sign of the correlation coefficient.
10.3-4: Understand that extrapolation is use of a regression line to predict values outside of the range of observed values for the explanatory variable, including the special case of y = 0 when applicable.
10.3-5: Understand the concept of residual and find and interpret the residual for an observational unit, given the raw data and the equation of the best-fit (regression) line.
10.3-6: Understand the relationship between residuals and strength of association and that the best-fit (regression) line minimizes the sum of the squared residuals.
10.3-7: Find and interpret the coefficient of determination (R²) as the squared correlation and as the proportion of total variation in the response variable that is accounted for by changes (variation) in the explanatory variable.
10.3-8: Understand that influential points can substantially change the equation of the best-fit line and that observations with extreme values of the explanatory variable may potentially be influential.
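Several of these objectives (slope and intercept of the best-fit line, residuals, the sum of squared residuals, and R² as the squared correlation) can be seen together in one small computation. The sketch below uses hypothetical data and the standard formulas b = r·s_y/s_x and a = ȳ − b·x̄. The helper `sum_sq_resid` is a name introduced only for this illustration.

```python
import numpy as np

# Hypothetical data, for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.8, 12.1])

r = np.corrcoef(x, y)[0, 1]
slope = r * np.std(y, ddof=1) / np.std(x, ddof=1)   # b = r * s_y / s_x
intercept = np.mean(y) - slope * np.mean(x)         # line passes through (x-bar, y-bar)

predicted = intercept + slope * x
residuals = y - predicted                           # observed minus predicted
sse = np.sum(residuals ** 2)

def sum_sq_resid(a, b):
    """Sum of squared residuals for the candidate line y-hat = a + b*x."""
    return np.sum((y - (a + b * x)) ** 2)

# R^2 as the proportion of variation in y explained by the line.
r_squared = 1 - sse / np.sum((y - np.mean(y)) ** 2)

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"R^2 = {r_squared:.4f}, r^2 = {r ** 2:.4f}")
print("nearby lines have larger SSE:",
      sum_sq_resid(intercept, slope + 0.1) > sse,
      sum_sq_resid(intercept + 0.5, slope) > sse)
```

Changing either coefficient away from the least squares values increases the sum of squared residuals, and the sign of the slope always matches the sign of r.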
- True or False: If you fit a least squares line to two quantitative variables x and y, and the slope of the line differs from zero, then you know the correlation coefficient also differs from zero.
Questions 32 through 38: Annual measurements of the number of powerboat registrations (in thousands) and the number of manatees killed by powerboats in Florida were collected over the 14 years 1977–1990. A scatterplot of the data, least squares regression line, and correlation coefficient follow.
Correlation:
r = 0.943
Regression line:
- How would you interpret the slope of the regression line in the context of the problem? Select all that apply.
- Every 8,000 powerboat registrations is associated with a predicted increase of one manatee death.
- We predict an additional 0.12 manatee death for each single powerboat registration.
- We predict an additional 0.12 manatee death for every 1,000 powerboats registered.
- We predict a decrease of 41.43 manatee deaths for every 1,000 powerboats registered.
- Fill in the blanks with the appropriate values to interpret the y-intercept:
We predict ___(1)____ manatee deaths when there are ____(2)____ powerboat registrations.
LO: 10.3-1; Difficulty: Medium; Type: TE-N
- The y-intercept is not a valid prediction of manatee deaths. Why?
- It is an example of extrapolation.
- We cannot observe a negative number of manatee deaths.
- Both A and B.
- Neither A nor B.
- What is the predicted number of manatee deaths for a year with 600,000 powerboat registrations?
LO: 10.3-2; Difficulty: Medium; Type: TE-N
- The year 1984 had 559,000 powerboat registrations and 34 manatee deaths. Calculate the residual for this observation.
LO: 10.3-5; Difficulty: Medium; Type: TE-N
- The year 1984 had 559,000 powerboat registrations and 34 manatee deaths. Did the least squares regression line underestimate, overestimate, or accurately estimate the number of manatee deaths for the year 1984?
- Underestimate
- Overestimate
- Accurately estimate
- Which of the following is a correct interpretation of the coefficient of determination?
- About 94.3% of the variation in manatee deaths can be explained by the number of powerboat registrations.
- About 88.9% of the variation in manatee deaths can be explained by the number of powerboat registrations.
- An increase of 1,000 powerboat registrations is associated with a predicted increase of 0.943 manatee deaths.
- An increase of 1,000 powerboat registrations is associated with a predicted increase of 0.889 manatee deaths.
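The arithmetic behind the prediction, residual, and coefficient of determination items can be sketched as follows. The regression equation itself is not printed in this extract, so the coefficients below are assumptions inferred from the answer options (an intercept of −41.43 and a slope of 0.125, that is, one predicted death per 8 thousand registrations). Only r = 0.943 and the 1984 observation are taken directly from the text.

```python
# Assumed fitted line (not printed in this extract): y-hat = -41.43 + 0.125 * x,
# with x = powerboat registrations in thousands and y = manatee deaths.
INTERCEPT = -41.43
SLOPE = 0.125   # assumption: one predicted death per 8 (thousand) registrations
R = 0.943       # correlation reported with the scatterplot

def predict_deaths(registrations_thousands):
    """Predicted manatee deaths for a given number of registrations (in thousands)."""
    return INTERCEPT + SLOPE * registrations_thousands

pred_600 = predict_deaths(600)        # 600,000 registrations
pred_1984 = predict_deaths(559)       # 1984: 559,000 registrations
residual_1984 = 34 - pred_1984        # observed minus predicted
r_squared = R ** 2

print(f"prediction at 600 thousand: {pred_600:.2f}")
print(f"1984 residual: {residual_1984:.2f}")
print(f"R^2 = {r_squared:.3f}")
```

Under these assumed coefficients, the 1984 residual is positive, meaning the observed count is above the line and the line underestimates that year. Squaring r gives about 0.889 regardless of the assumed coefficients.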
Questions 39 through 41: Data from the World Bank for 25 Western Hemisphere countries were used to examine the association between (average) female life expectancy (that is, the average lifespan of women in the country) and the average number of children women give birth to. Given below is the scatterplot for the data.
The regression equation for this context is found to be:
ŷ = 84.5 − 4.4x
where ŷ is the predicted female life expectancy in years, and
x is the average number of births per woman.
- Interpret the slope in the context of the study.
- We expect to see an increase of 84.5 years in female life expectancy when the average number of births per woman in a country increases by one child.
- When a country has zero births per woman on average, we predict a female life expectancy of 84.5 years.
- We expect to see a decrease of 4.4 years in female life expectancy when the average number of births per woman in a country increases by one child.
- When a country has zero births per woman on average, we predict a female life expectancy of 4.4 years.
- Interpret the y-intercept in the context of the study.
- We expect to see an increase of 84.5 years in female life expectancy when the average number of births per woman in a country increases by one child.
- When a country has zero births per woman on average, we predict a female life expectancy of 84.5 years.
- We expect to see a decrease of 4.4 years in female life expectancy when the average number of births per woman in a country increases by one child.
- When a country has zero births per woman on average, we predict a female life expectancy of 4.4 years.
- Is the interpretation of the y-intercept meaningful in the context? Why?
- No, since it is an example of extrapolation.
- No, since the lowest value for average births per woman in the data set was 1.5.
- Both A and B.
- Neither A nor B.
- In the scatterplot shown, which labeled point has the largest residual?
- A
- B
- C
- D
- E
LO: 10.3-5; Difficulty: Easy; Type: MC
- True or False: Observations with values of the explanatory variable near the mean of the explanatory variable may potentially be influential.
- True or False: The least squares regression line minimizes the absolute value of the residuals.
- True or False: The correlation coefficient is the proportion of total variation in the response variable that is accounted for by changes in the explanatory variable.
Section 10.4: Inference for the Regression Slope: Simulation-Based Approach
10.4-1: Apply the 3S strategy when evaluating the hypothesis of association using the slope as the statistic.
10.4-2: Articulate how to conduct a tactile simulation to implement the 3S strategy for testing a slope.
10.4-3: Define the p-value in the context of the 3S strategy using simulated slopes under the null hypothesis of no association.
10.4-4: Know that a test of association based on slope is equivalent to a test of association based on a correlation coefficient.
- For a given dataset, a test of association based on a slope is equivalent to a test of association based on a correlation coefficient. Being equivalent means which of the following is true?
- The confidence intervals for the population correlation and population slope will have the same center.
- The p-value will be the same whether you use correlation as the statistic or the slope of the regression line as the statistic.
- The observed correlation will be the same as the observed slope of the regression line.
- The confidence intervals for the population correlation and population slope will have the same width.
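One way to see why the two tests are equivalent: shuffling the response does not change s_x or s_y, so every shuffled slope is the shuffled correlation multiplied by the same positive constant s_y/s_x. The sketch below checks this numerically on hypothetical, randomly generated data. The helper names `slope` and `corr` and the number of shuffles are choices made for this illustration.

```python
import numpy as np

# Hypothetical data, for illustration only.
rng = np.random.default_rng(7)
x = rng.uniform(0, 10, size=30)
y = 3 + 0.4 * x + rng.normal(0, 2, size=30)

def slope(a, b):
    """Least squares slope for predicting b from a."""
    return np.polyfit(a, b, 1)[0]

def corr(a, b):
    """Pearson correlation coefficient between a and b."""
    return np.corrcoef(a, b)[0, 1]

obs_slope, obs_r = slope(x, y), corr(x, y)

n_shuffles = 2000
count_slope = count_r = 0
for _ in range(n_shuffles):
    y_shuffled = rng.permutation(y)      # the same shuffle feeds both statistics
    count_slope += abs(slope(x, y_shuffled)) >= abs(obs_slope)
    count_r += abs(corr(x, y_shuffled)) >= abs(obs_r)

p_slope = count_slope / n_shuffles
p_r = count_r / n_shuffles
print(f"p-value using slope: {p_slope:.4f}; using correlation: {p_r:.4f}")
```

Because each shuffle is at least as extreme for the slope exactly when it is at least as extreme for the correlation, the two proportions match. The statistics themselves, and any confidence intervals built from them, are on different scales.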
Questions 47 through 52: Social warmth is a term referring to the feeling of being connected to others. A study published in PLoS One in 2016 looked at a potential relationship between physical warmth (body temperature) and social warmth among a group of 54 volunteers (Inagaki et al.). A registered nurse took each volunteer’s oral temperature, and each volunteer then rated twelve items related to feelings of social connection on a scale of 1 to 5; the average of the twelve ratings was recorded. Higher average scores indicated higher levels of social warmth. The theory was that the thermoregulatory system, which helps maintain a relatively warm internal body temperature, may also help people assess feelings of social connection. Below are a scatterplot and the least-squares regression line for the data.
- How would you interpret the slope of the regression line in the context of the study?
- The correlation coefficient between body temperature and social warmth score is 0.461.
- The predicted social warmth score when body temperature is zero degrees Celsius is -12.773.
- A one degree Celsius increase in body temperature is associated with a predicted 0.461 increase in social warmth score.
- About 46.1% of variability in social warmth scores can be explained by body temperature.
- Which of the following is the correlation coefficient for these data?
- 0.348
- 0.721
- -0.032
- -0.213
- State the null and alternative hypotheses for a simulation-based test of the slope using proper notation.
versus
versus
versus
versus
- Select the best explanation for how one sample would be simulated in order to generate the null distribution.
- Holding the body temperatures constant, randomize the order of the social warmth scores. Plot the slope of the regression line of the shuffled data on the null distribution.
- Add or subtract the appropriate value from each body temperature and social warmth score in order to force the null hypothesis to be true. Plot the slope of the regression line of the shifted data on the null distribution.
- Flip a coin to decide whether to swap the values for body temperature and social warmth score or not. Plot the slope of the regression line of the randomized points on the null distribution.
- Put each pair of (body temperature, social warmth score) on a piece of paper. Draw with replacement 40 times. Plot the slope of the regression line of the resampled data on the null distribution.
- Below is a picture of a simulated null distribution of slopes created using the Corr/Regression applet. How is this distribution used to calculate the p-value?
- Find the proportion of simulated slopes greater than zero.
- Find the proportion of simulated slopes as far away from zero or further than 0.461.
- Find the proportion of simulated slopes as small or smaller than 0.461.
- Find the proportion of simulated slopes as large or larger than 0.461.
- Based on the simulated null distribution in question 51, what is the strength of evidence against the null hypothesis?
- We have strong evidence against the null hypothesis.
- We have moderate evidence against the null hypothesis.
- We have weak evidence against the null hypothesis.
- We have no evidence against the null hypothesis.
Questions 53 through 60: It is commonly expected that as a person ages, their muscle mass decreases. To further examine this relationship in women, a nutritionist randomly selected 60 female patients from her clinic, 15 women from each 10-year age group beginning with age 40 and ending with age 80. For each patient, her age and current muscle mass were recorded. A scatterplot, least squares regression line, and coefficient of determination are as follows.
- Write a sentence interpreting the value of the slope in the context of the study.
- A one year increase in age is associated with a 1.19 lb increase in predicted muscle mass.
- A one year increase in age is associated with a 1.19 lb decrease in predicted muscle mass.
- A one year increase in muscle mass is associated with a 1.19 lb increase in predicted age.
- A one year increase in muscle mass is associated with a 1.19 lb decrease in predicted age.
- Which of the following is a correct interpretation of the coefficient of determination?
- When the age of a woman is equal to zero, her predicted muscle mass is 75.01 lbs.
- The correlation coefficient between age and muscle mass is equal to 0.7501.
- Each additional year in age is associated with a 75.01% decrease in predicted muscle mass.
- Approximately 75.01% of the variation in muscle mass can be explained by changes in age among these women.
- What is the value of the correlation coefficient between age and muscle mass for these data?
- 0.7501
- -0.7501
- 0.8661
- -0.8661
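The step from the coefficient of determination to the correlation coefficient is a square root plus a sign check, sketched below. The value 0.7501 is the coefficient of determination quoted in the answer options. The negative sign is an assumption based on the study's premise that muscle mass decreases with age, since the fitted line itself is not reproduced in this extract.

```python
import math

r_squared = 0.7501        # coefficient of determination given with the scatterplot
slope_is_negative = True  # assumption: muscle mass decreases as age increases

r = math.sqrt(r_squared)
if slope_is_negative:
    r = -r                # r always has the same sign as the slope

print(round(r, 4))
```

The square root alone gives only the magnitude of r; the direction of the association has to come from the slope or the scatterplot.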
- Write the null and alternative hypotheses of interest for testing if there is a negative linear relationship between age and muscle mass using proper notation for a test of slope.
versus
versus
versus
versus
- How would you simulate one sample, assuming the null hypothesis is true?
- Label cards with muscle mass values from the original data. Mix cards together; shuffle into two new groups of age 40 and age 80.
- Label cards with muscle mass values from the original data. Mix cards together, and deal one muscle mass value to each of the age values in the data.
- Flip a coin for each woman in the sample; if heads, swap the difference between age and muscle mass.
- Label cards with age values from the original data. Mix cards together; shuffle into two new groups of low and high muscle mass.
- The p-value for these data was less than 0.0001. Write a conclusion of the test in the context of the study.
- We have strong evidence that an increase in age causes a decrease in muscle mass.
- We have strong evidence that age is negatively correlated with muscle mass.
- We do not have strong evidence that an increase in age causes a decrease in muscle mass.
- We do not have strong evidence that age is negatively correlated with muscle mass.
- If you were to conduct a simulation-based test using the correlation coefficient as your statistic, would the p-value be larger, smaller, or remain the same as the p-value reported in question 58?
- Larger
- Smaller
- Remain the same
- Not enough information provided
- Can these results be generalized to the population of all patients at this clinic?
- Yes, since it was a random sample.
- No, since the sample size was too small.
- No, since only female patients were selected.
- No, since age was not randomly assigned to patients.
Section 10.5: Inference for the Regression Slope: Theory-Based Approach
10.5-1: Realize that both simulation-based approaches to testing correlation coefficients and slopes can, under certain conditions, be well predicted by the theory-based approach known as a t-test.
10.5-2: Evaluate a scatterplot for the two validity conditions for a theory-based test of correlation coefficients/slopes: symmetry and consistent variability around the regression line.
10.5-3: State hypotheses in terms of population slopes and correlations.
10.5-4: Interpret a confidence interval for the population slope.
Questions 61 through 68: Data from gapminder.org on 184 countries were used to examine whether there is an association between (average) female life expectancy (that is, the average lifespan of women in the country) and the average number of children women give birth to for the year 2019. A scatterplot of the data and a regression table from the Corr/Regression applet follow.
Term | Coeff | SE | t-stat | p-value |
Intercept | 88.91 | 0.72 | 123.78 | <0.0001 |
Fertility | -5.20 | 0.24 | -21.60 | <0.0001 |
- Which of the following validity conditions does not need to be checked in order to conduct a theory-based test for a regression slope?
- The number of countries in the data set is larger than 20.
- The variability in female life expectancy around the regression line should be similar regardless of the value of average number of babies per woman.
- There is approximately the same distribution of points above the regression line as below the regression line.
- The general pattern of the points on the scatterplot has a linear trend.
- Which of the approaches to a test of the regression slope are valid?
- Simulation-based test
- Theory-based test
- Both A and B
- Neither A nor B
- State the null and alternative hypotheses to examine if there is an association between female life expectancy and the average number of children women give birth to for the year 2019.
versus
versus
versus
versus
- Both A and C
- Both B and D
- Using the regression table output, state the equation of the regression line.
- Using the regression table output, what is the standardized statistic for a test of the regression slope?
- 123.78
- -21.60
- 88.91
- -5.20
- How would you interpret the standardized statistic for a test of the regression slope?
- The observed sample slope of -5.20 is 21.6 standard errors below the hypothesized value of zero.
- The observed sample intercept of 88.91 is 21.6 standard errors below the hypothesized value of zero.
- Zero is 21.6 standard errors below the observed sample slope of -5.20.
- Zero is 21.6 standard errors below the observed sample intercept of 88.91.
- Is there significant evidence of an association between female life expectancy and the average number of children women give birth to for the year 2019?
- Yes, since the slope of the regression line is negative.
- Yes, since the intercept of the regression line is positive.
- Yes, since the correlation is negative.
- Yes, since the p-value for the intercept is less than 0.01.
- Yes, since the p-value for the slope is less than 0.01.
- A 95% confidence interval for the population slope is (-5.68, -4.73). How would you interpret this interval in the context of the study?
- If we repeated this study many times, 95% of the regression slopes would fall between -5.68 and -4.73.
- There is a 95% probability that the population slope is between -5.68 and -4.73.
- We are 95% confident that the population slope is between -5.68 and -4.73.
- A one baby increase in average number of babies per woman is associated with between a 4.73 and 5.68 year decrease in female life expectancy, with 95% confidence.
- Both A and B.
- Both B and C.
- Both C and D.
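The link between the regression table, the standardized statistic, and the confidence interval can be checked with two lines of arithmetic. The sketch below uses the rounded slope and standard error from the table and the rough "statistic ± 2 SE" rule for a 95% interval. That rule is an approximation chosen for this illustration, not necessarily the exact multiplier used by the applet.

```python
# Values copied from the regression table (rounded to two decimals there).
coeff = -5.20   # slope for Fertility
se = 0.24       # standard error of the slope

t_stat = coeff / se                              # standardized statistic
lower, upper = coeff - 2 * se, coeff + 2 * se    # rough 95% interval: slope +/- 2 SE

print(f"t = {t_stat:.2f}")
print(f"approximate 95% CI: ({lower:.2f}, {upper:.2f})")
```

The recomputed statistic is about −21.7, slightly different from the table's −21.60 because the table entries are rounded, and the rough interval is close to the reported (−5.68, −4.73). The whole interval lies below zero, which agrees with the very small p-value for the slope.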
Questions 69 through 72: How is the number of pages in a textbook related to the price of the textbook? To find out, two Cal Poly freshmen (2006) randomly selected 30 textbooks at the campus bookstore and recorded the price ($) and number of pages for each book. Here's output from analyzing the data collected in the Corr/Regression applet. Assume, for now, that the normal approximation-based method is valid, and thus, the p-value from the Regression Table below is valid.
- Using the information available, fill in the blanks to write the equation of the regression line as estimated from the data.
_____(1)_____ + ____(2)_____
LO: 10.5-1; Difficulty: Easy; Type: TE-N
- What is the most appropriate null hypothesis for testing the regression slope using theory-based methods?
- There is no association between number of pages in a textbook and price of the textbook.
- There is an association between number of pages in a textbook and price of the textbook.
- There is no linear association between number of pages in a textbook and price of the textbook.
- There is a linear association between number of pages in a textbook and price of the textbook.
- True or False: We can conclude that the price of a textbook will increase if we add more pages.
LO: 10.5-1; Difficulty: Hard; Type: TF
- A 95% confidence interval for the population slope is (0.108, 0.187). Which of the following statements are valid based on this interval?
- If we repeated this study many times, 95% of the regression slopes would fall between 0.108 and 0.187.
- There is a 95% probability that the population slope is between 0.108 and 0.187.
- We are 95% confident that the population slope is between 0.108 and 0.187.
- We have significant evidence that the population slope is greater than zero.
- Both A and B.
- Both B and C.
- Both C and D.
Questions 73 through 75: A student in an AP Statistics class decided to conduct a study to determine whether you could predict the number of followers a teen has on Instagram based on the number of people he or she is following. To do this, she randomly selected fifty students from her high school who had Instagram accounts and, for each student, recorded the number of people they were following and the number of followers they had. A scatterplot of the data is shown.
The regression line is
.
- We want to test: H0: β = 0, Ha: β ≠ 0. Use results from the null distribution of simulated slopes shown to determine the standardized statistic.
LO: 10.5-1; Difficulty: Medium; Type: TE-N
- Based on the standardized statistic, is there strong evidence of an association between the number of followers and the number of people following an Instagram account?
- No, since the mean of the null distribution is zero.
- Yes, since the regression slope is positive.
- Yes, since the standardized statistic is greater than 3.
- Yes, since the correlation is positive.
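A standardized statistic computed from a simulated null distribution is the observed statistic minus the mean of the simulated values, divided by their standard deviation. The sketch below shows that calculation. The null distribution and the observed slope are hypothetical stand-ins (the applet output is not reproduced in this extract), and `standardized_statistic` is a name introduced only for this illustration.

```python
import numpy as np

def standardized_statistic(observed, simulated):
    """(observed statistic - mean of null distribution) / SD of null distribution."""
    simulated = np.asarray(simulated, dtype=float)
    return (observed - simulated.mean()) / simulated.std(ddof=1)

# Hypothetical null distribution: 1000 shuffled slopes centered near zero.
rng = np.random.default_rng(3)
simulated_slopes = rng.normal(loc=0.0, scale=0.25, size=1000)
observed_slope = 1.2   # hypothetical value, not read from the applet output

z = standardized_statistic(observed_slope, simulated_slopes)
print(f"standardized statistic = {z:.2f}")
```

A standardized statistic larger than 3 in absolute value places the observed slope far in the tail of the null distribution, which is typically read as very strong evidence against the null hypothesis.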
- A 95% confidence interval for the population slope is (1.07, 1.35). How would you interpret this in the context of the study?
- We are 95% confident that the population slope is between 1.07 and 1.35.
- If we repeated this study many times, 95% of the regression slopes would fall between 1.07 and 1.35.
- There is a 95% probability that the population slope is between 1.07 and 1.35.
- We are 95% confident that a one person increase in the number of people one is following on Instagram is associated with between a 1.07 and 1.35 person increase in the number of followers.
- Both A and D.
- Both A and B.
- Both B and C.