Statistics - Unlocking the Power of Data, 3e (Lock)
Chapter 10 Multiple Regression
10.1 Multiple Predictors
Use the following to answer the questions below:
The ANOVA table from a multiple regression analysis is provided.
Source | DF | SS | MS | F | P |
Regression | 4 | 10723 | 2680.75 | 3.751 | 0.014 |
Residual Error | 30 | 21442 | 714.73 | | |
Total | 34 | 32165 | | | |
1) How many predictors are in the model?
Diff: 2 Type: SA Var: 1
L.O.: 10.1.0
2) How large is the sample size?
Diff: 2 Type: SA Var: 1
L.O.: 10.1.0
3) Compute R² for this model. Round to three decimal places.
A) 0.333
B) 0.667
C) 0.501
D) 0.083
Diff: 2 Type: BI Var: 1
L.O.: 10.1.4
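The three questions above can all be read off the ANOVA table. As a minimal sketch (the function name `anova_summary` is an illustrative assumption, not part of the test bank): the regression degrees of freedom equal the number of predictors, the total degrees of freedom plus one give the sample size, and R² is the regression sum of squares divided by the total sum of squares.

```python
def anova_summary(df_regression, df_total, ss_regression, ss_total):
    """Return (number of predictors, sample size, R-squared) from ANOVA entries."""
    k = df_regression                     # one regression df per predictor
    n = df_total + 1                      # total df is n - 1
    r_squared = ss_regression / ss_total  # share of variability explained
    return k, n, r_squared

# Values copied from the ANOVA table above.
k, n, r2 = anova_summary(df_regression=4, df_total=34,
                         ss_regression=10723, ss_total=32165)
print(k, n, round(r2, 3))  # prints: 4 35 0.333
```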
10.2 Checking Conditions for a Regression Model
Use the following to answer the questions below:
While many people count calories, some often don't think about calories in the beverages they consume. Starbucks, one of the leading coffeehouse chains, provides nutrition information about all of their beverages on their website. Nutrition information, including number of calories, fat (g), carbohydrates (g), and protein (g), was collected on a random sample of Starbucks' 16 ounce ("Grande") hot espresso drinks. Note that all of the drinks in the sample are made with 2% milk unless the name specifically included the term "Skinny," which is how Starbucks indicated a beverage made with nonfat milk.
The regression equation is
Calories = 6.7 + 9.61 Fat (g) + 3.43 Carbs (g) + 4.42 Protein (g)
Predictor | Coef | SE Coef | T | P |
Constant | 6.68 | 17.64 | 0.38 | 0.715 |
Fat (g) | 9.609 | 1.452 | 6.62 | 0.000 |
Carbs (g) | 3.4350 | 0.3155 | 10.89 | 0.000 |
Protein (g) | 4.418 | 2.231 | 1.98 | 0.083 |
S = 13.2293 R-Sq = 98.8% R-Sq(adj) = 98.4%
Analysis of Variance
Source | DF | SS | MS | F | P |
Regression | 3 | 116867 | 38956 | 222.58 | 0.000 |
Residual Error | 8 | 1400 | 175 | | |
Total | 11 | 118267 | | | |
1) The "Caramel Macchiato" was one of the drinks selected for the sample. When made with 2% milk, a grande Caramel Macchiato has 7 grams of fat, 34 grams of carbohydrates, and 10 grams of protein. Predict the number of calories in a Caramel Macchiato. Round to two decimal places.
A) 234.79 calories
B) 235.00 calories
C) 347.79 calories
D) 241.60 calories
Diff: 2 Type: BI Var: 1
L.O.: 10.1.1
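A prediction from a fitted multiple regression equation is the intercept plus each coefficient times its predictor value. The sketch below uses the rounded coefficients from the printed regression equation; the helper name `predict` and the dictionary layout are illustrative assumptions.

```python
# Coefficients as rounded in the printed regression equation.
intercept = 6.7
coefficients = {"Fat": 9.61, "Carbs": 3.43, "Protein": 4.42}

def predict(intercept, coefficients, values):
    """Intercept plus the sum of coefficient * predictor value."""
    return intercept + sum(coefficients[name] * values[name] for name in coefficients)

# Grande Caramel Macchiato made with 2% milk (grams).
macchiato = {"Fat": 7, "Carbs": 34, "Protein": 10}
calories = predict(intercept, coefficients, macchiato)
print(f"{calories:.2f} calories")
```

Using the unrounded coefficients from the coefficient table instead gives a slightly different value (about 234.91), so the answer choices appear to assume the rounded equation.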
2) Interpret the coefficient of Fat in context.
Diff: 2 Type: ES Var: 1
L.O.: 10.1.1
3) How many drinks were used in this sample?
A) 12
B) 11
C) 10
D) 9
Diff: 2 Type: BI Var: 1
L.O.: 10.1.0
4) Interpret R² for this model.
Diff: 2 Type: ES Var: 1
L.O.: 10.1.4
5) Is the model effective according to the ANOVA test? Use a 5% significance level. Include all details of the test.
H₀: β₁ = β₂ = β₃ = 0 (The model is ineffective and all of the predictors can be omitted.)
Hₐ: At least one βᵢ ≠ 0 (At least one of the predictors in the model is effective.)
F = 222.58
p-value ≈ 0
There is very strong evidence that at least one of the predictors in the model is effective for explaining the number of calories in Starbucks hot espresso drinks.
Diff: 2 Type: ES Var: 1
L.O.: 10.1.3
6) Which predictors are significant at the 5% level?
A) Fat and Carbs
B) Fat
C) Carbs
D) Fat, Carbs, and Protein
Diff: 2 Type: BI Var: 1
L.O.: 10.1.2
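Checking individual predictors at the 5% level amounts to comparing each p-value in the coefficient table with 0.05. A small sketch of that comparison (the variable names are illustrative assumptions):

```python
alpha = 0.05

# p-values copied from the coefficient table (constant omitted).
p_values = {"Fat": 0.000, "Carbs": 0.000, "Protein": 0.083}

# A predictor is significant when its p-value falls below alpha.
significant = [name for name, p in p_values.items() if p < alpha]
print(significant)  # prints: ['Fat', 'Carbs']
```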
7) A dotplot of the residuals and a scatterplot of the residuals versus the predicted values are provided. Discuss whether the conditions for a multiple linear regression are reasonable by referring to the appropriate plots.
There is no obvious pattern in the scatterplot indicating that the linearity and consistent variability conditions are reasonably satisfied.
Overall, we might have some minor concerns about using multiple regression because of the outlier.
Diff: 2 Type: ES Var: 1
L.O.: 10.2.1;10.2.2
8) Which of the following scatterplots of the residuals versus the predicted values does not indicate problems with either the linearity or the consistent variability conditions?
A) A
B) B
C) C
Diff: 2 Type: BI Var: 1
L.O.: 10.2.2
Use the following to answer the questions below:
Output for a model to predict the GPAs of students at a small university based on their Math SAT scores, Verbal SAT scores, and the number of hours spent watching television in a typical week is provided.
The regression equation is
GPA = 1.80 + 0.00104 Math SAT + 0.00142 Verbal SAT - 0.0147 TV
Predictor | Coef | SE Coef | T | P |
Constant | 1.8015 | 0.1842 | 9.78 | 0.000 |
Math SAT | 0.0010442 | 0.0002500 | 4.18 | 0.000 |
Verbal SAT | 0.0014182 | 0.0002398 | 5.91 | 0.000 |
TV | -0.014708 | 0.003269 | -4.50 | 0.000 |
S = 0.366780 R-Sq = ????% R-Sq(adj) = 19.0%
Analysis of Variance
Source | DF | SS | MS | F | P |
Regression | ? | 14.4886 | 4.8295 | 35.90 | 0.000 |
Residual Error | ? | 59.7304 | 0.1345 | | |
Total | 447 | 74.2190 | | | |
9) Predict the GPA of a student at this university with a Math SAT score of 600, a Verbal SAT score of 580, and who watches 5 hours of television in a typical week. Round to three decimal places.
A) 3.174
B) 3.233
C) 3.248
D) 3.142
Diff: 2 Type: BI Var: 1
L.O.: 10.1.1
10) Interpret the coefficient of TV in context.
Diff: 2 Type: ES Var: 1
L.O.: 10.1.1
11) The R² for this model is missing in the provided output. Use the available information to compute R² (round to three decimal places) for this model.
A) 0.195
B) 0.243
Diff: 2 Type: BI Var: 1
L.O.: 10.1.4
12) Use the output to determine how many students were included in the sample.
Diff: 2 Type: SA Var: 1
L.O.: 10.1.0
13) Some of the information in the ANOVA table is missing. How many degrees of freedom should appear in the "Regression" row of the table?
Diff: 2 Type: SA Var: 1
L.O.: 10.1.0
14) Some of the information in the ANOVA table is missing. How many degrees of freedom should be listed in the "Residual Error" row?
Diff: 2 Type: SA Var: 1
L.O.: 10.1.0
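The missing entries in this output can be recovered from the values that are present. The sketch below is one way to do it, assuming the usual ANOVA identities (regression df = number of predictors, residual df = total df minus regression df, MS = SS / df). The recomputed mean squares, F-statistic, and adjusted R² can be checked against the printed output.

```python
k = 3                 # predictors: Math SAT, Verbal SAT, TV
df_total = 447        # given in the Total row

df_regression = k
df_residual = df_total - df_regression
n = df_total + 1      # total df is n - 1

ss_regression, ss_residual, ss_total = 14.4886, 59.7304, 74.2190

ms_regression = ss_regression / df_regression   # output shows 4.8295
ms_residual = ss_residual / df_residual         # output shows 0.1345
f_stat = ms_regression / ms_residual            # output shows 35.90

r_squared = ss_regression / ss_total
adj_r_squared = 1 - ms_residual / (ss_total / df_total)  # output shows 19.0%

print(df_regression, df_residual, n)
print(round(r_squared, 3), round(adj_r_squared, 3), round(f_stat, 2))
```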
15) At the 5% significance level, is the model effective according to the ANOVA test? Include all details of the test.
H₀: β₁ = β₂ = β₃ = 0 (or The model is ineffective and all predictors can be omitted.)
Hₐ: At least one βᵢ ≠ 0 (or At least one of the predictors in the model is effective.)
F = 35.90
p-value ≈ 0
There is very strong evidence that at least one of the predictors in the model is effective for explaining the GPA of students at this university.
Diff: 2 Type: ES Var: 1
L.O.: 10.1.3
16) Which predictors are significant at the 5% level?
A) Math SAT, Verbal SAT, and TV
B) Verbal SAT and TV
C) Math SAT and Verbal SAT
D) Math SAT and TV
Diff: 2 Type: BI Var: 1
L.O.: 10.1.2
17) A dotplot of the residuals and a scatterplot of the residuals versus the predicted values are provided. Discuss whether the conditions for a multiple linear regression are reasonable by referring to the appropriate plots.
The scatterplot of the residuals versus the predicted values shows no signs of violations of either the linearity or consistent variability conditions, thus we have no concerns about those.
Diff: 2 Type: ES Var: 1
L.O.: 10.2.1;10.2.2
10.3 Using Multiple Regression
Use the following to answer the questions below:
Fast food restaurants are required to publish nutrition information about the foods they serve. Nutrition information for a random sample of McDonald's lunch/dinner menu items (excluding sides and drinks) was obtained from their website. Output from a multiple regression analysis is provided.
The regression equation is Calories = 65.2 + 9.46 Total Fat (g) + 0.876 Cholesterol (mg) + 0.131 Sodium (mg)
Predictor | Coef | SE Coef | T | P |
Constant | 65.18 | 31.41 | 2.08 | 0.062 |
Total Fat (g) | 9.464 | 1.710 | 5.53 | 0.000 |
Cholesterol (mg) | 0.8762 | 0.6366 | 1.38 | 0.196 |
Sodium (mg) | 0.13149 | 0.04790 | 2.75 | 0.019 |
S = 39.4529 R-Sq = 95.5% R-Sq(adj) = 94.3%
Analysis of Variance
Source | DF | SS | MS | F | P |
Regression | 3 | 362171 | 120724 | 77.56 | 0.000 |
Residual Error | 11 | 17122 | 1557 | | |
Total | 14 | 379293 | | | |
1) What are the explanatory variables used in this model?
A) Total Fat (g), Cholesterol (mg), and Sodium (mg)
B) Total Fat (g), Cholesterol (mg), Sodium (mg), and Calories
C) Total Fat (g) and Calories
D) Cholesterol (mg), Sodium (mg), and Calories
Diff: 2 Type: BI Var: 1
L.O.: 10.1.0
2) Use the provided output to determine how many menu items were included in the sample.
A) 12
B) 13
C) 14
D) 15
Diff: 2 Type: BI Var: 1
L.O.: 10.1.1
3) One of the menu items in the sample is the "McDouble," which has 390 calories, 12 grams of fat, 65 mg of cholesterol, and 850 mg of sodium. What is the predicted response for the McDouble? Round your answer to two decimal places.
Diff: 2 Type: SA Var: 1
L.O.: 10.1.1
4) One of the menu items in the sample is the "McDouble," which has 390 calories, 12 grams of fat, 65 mg of cholesterol, and 850 mg of sodium. What is the residual for the McDouble? Round your answer to two decimal places.
Diff: 2 Type: SA Var: 1
L.O.: 10.1.0;10.2.0
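A residual is the observed response minus the predicted response. The sketch below works through the McDouble using the rounded coefficients from the printed regression equation; the variable names are illustrative assumptions.

```python
# Rounded coefficients from the printed regression equation.
intercept = 65.2
coefficients = {"Total Fat": 9.46, "Cholesterol": 0.876, "Sodium": 0.131}

# McDouble values from the question.
mcdouble = {"Total Fat": 12, "Cholesterol": 65, "Sodium": 850}
observed_calories = 390

predicted = intercept + sum(coefficients[v] * mcdouble[v] for v in coefficients)
residual = observed_calories - predicted   # observed minus predicted

print(f"predicted = {predicted:.2f}, residual = {residual:.2f}")
```

A positive residual here means the model underestimates the McDouble's calories.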
5) Which predictor appears to be the most important in this model? Explain briefly.
A) Total fat (g)
B) Cholesterol (mg)
C) Sodium (mg)
D) Calories
Diff: 2 Type: BI Var: 1
L.O.: 10.1.2
6) Interpret the coefficient of Sodium in context.
Diff: 2 Type: ES Var: 1
L.O.: 10.1.1
7) Interpret R² for this model.
Diff: 2 Type: ES Var: 1
L.O.: 10.1.4
8) At the 5% significance level, is the model effective according to the ANOVA test? Include all details of the test.
H₀: β₁ = β₂ = β₃ = 0 (or Model is ineffective and all predictors could be omitted.)
Hₐ: At least one βᵢ ≠ 0 (or At least one predictor in the model is effective.)
F = 77.56
p-value ≈ 0 (using 3 and 11 degrees of freedom)
There is very strong evidence that this model (using total fat, cholesterol, and sodium) is effective for predicting the number of calories in McDonald's lunch/dinner menu items.
Diff: 2 Type: ES Var: 1
L.O.: 10.1.3
9) Which predictors are significant at the 5% level? What are their p-values?
A) Total fat and sodium
B) Total fat, cholesterol, and sodium
C) Total fat
D) Cholesterol and sodium
Diff: 2 Type: BI Var: 1
L.O.: 10.1.2
10) A boxplot of the residuals and a scatterplot of the residuals versus the predicted values are provided. Discuss whether the conditions for a multiple linear regression are reasonable by referring to the appropriate plots.
The scatterplot of the residuals versus the predicted values is fairly scattered. No curved or fanning patterns are apparent, indicating that the linearity and consistent variability conditions are reasonably satisfied.
Because all conditions are satisfied, multiple regression with these data is reasonable.
Diff: 1 Type: ES Var: 1
L.O.: 10.2.1;10.2.2
11) Which variable, if any, would you suggest trying to eliminate first to possibly improve this model? Describe one way in which you might determine if the model had been improved by removing that variable. Explain briefly.
1) R²: if removing cholesterol causes a large drop in R², we might prefer the model with cholesterol.
2) The residual standard error: if this is smaller without cholesterol in the model, it was a good idea to remove cholesterol.
3) Overall model p-value: if this is smaller without cholesterol, removing cholesterol was a good idea (this might be hard to compare as p-values are often quite small).
4) F-statistic from ANOVA: if this is larger without cholesterol, removing cholesterol from the model was a good idea.
5) Adjusted R²: if this does not decrease, removing cholesterol from the model was a good idea.
Diff: 2 Type: ES Var: 1
L.O.: 10.3.1
Use the following to answer the questions below:
Data were collected on the age (in years), mileage (in thousands of miles), and price (in thousands of dollars) of a random sample of used Hyundai Elantras. Output from two models is provided.
Single Predictor Model:
The regression equation is Price = 13.8 - 0.0912 Mileage
Predictor | Coef | SE Coef | T | P |
Constant | 13.7751 | 0.6930 | 19.88 | 0.000 |
Mileage | -0.091167 | 0.009718 | -9.38 | 0.000 |
Two Predictor Model:
The regression equation is Price = 15.2 - 0.0101 Mileage - 1.55 Age
Predictor | Coef | SE Coef | T | P |
Constant | 15.2174 | 0.6112 | 24.90 | 0.000 |
Mileage | -0.01005 | 0.01977 | -0.51 | 0.616 |
Age | -1.5466 | 0.3508 | -4.41 | 0.000 |
S = 1.39445 R-Sq = 89.0% R-Sq(adj) = 88.0%
Analysis of Variance
Source | DF | SS | MS | F | P |
Regression | 2 | 346.11 | 173.06 | 89.00 | 0.000 |
Residual Error | 22 | 42.78 | 1.94 | | |
Total | 24 | 388.89 | | | |
12) What is the explanatory variable used in the single predictor model?
Diff: 2 Type: ES Var: 1
L.O.: 10.1.0
13) One of the cars in the sample was a 5-year-old Hyundai Elantra with 87,100 miles being sold for $6,000. What is the predicted price of this car using the single predictor model? Round to three decimal places.
Diff: 2 Type: SA Var: 1
L.O.: 10.1.1
14) One of the cars in the sample was a 5-year-old Hyundai Elantra with 87,100 miles being sold for $6,000. What is the predicted price of the car using the two predictor model? Round to three decimal places.
Diff: 2 Type: SA Var: 1
L.O.: 10.1.0;10.2.0
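Both predictions depend on entering the values in the units the models were fit with: mileage in thousands of miles and price in thousands of dollars. A brief sketch using the rounded coefficients from the two printed equations (variable names are illustrative assumptions):

```python
mileage_thousands = 87_100 / 1000   # 87,100 miles expressed in thousands of miles
age_years = 5
actual_price = 6.0                  # $6,000 expressed in thousands of dollars

# Single predictor model: Price = 13.8 - 0.0912 Mileage
price_single = 13.8 - 0.0912 * mileage_thousands

# Two predictor model: Price = 15.2 - 0.0101 Mileage - 1.55 Age
price_two = 15.2 - 0.0101 * mileage_thousands - 1.55 * age_years

print(f"single: {price_single:.3f}, two-predictor: {price_two:.3f}")
print(f"residuals: {actual_price - price_single:.3f}, {actual_price - price_two:.3f}")
```

Entering 87,100 directly instead of 87.1 would produce a large negative price, which is a quick signal that the units are off.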
15) Is mileage a significant single predictor of the price of used Hyundai Elantras? Use α = 0.05. Include all details of your test.
H₀: β₁ = 0 (or Mileage is not an effective predictor of the price of used Hyundai Elantras.)
Hₐ: β₁ ≠ 0 (or Mileage is an effective predictor of the price of used Hyundai Elantras.)
t = -9.38
p-value ≈ 0
There is very strong evidence that mileage is a significant single predictor of the price of used Hyundai Elantras.
Diff: 2 Type: ES Var: 1
L.O.: 10.1.2
16) Explain why Age is a potential confounding variable in the relationship between Mileage and Price of used Hyundai Elantras.
Diff: 2 Type: ES Var: 1
L.O.: 10.3.3
17) Is the two predictor model effective according to the ANOVA test? Use α = 0.05. Include all details of the test.
H₀: β₁ = β₂ = 0 (The model is ineffective and all predictors could be omitted.)
Hₐ: At least one βᵢ ≠ 0 (At least one predictor in the model is effective.)
F = 89.0
p-value ≈ 0 (using 2 and 22 degrees of freedom)
There is very strong evidence that at least one predictor in the model is effective for explaining the price of used Hyundai Elantras.
Diff: 2 Type: ES Var: 1
L.O.: 10.1.3
18) Is mileage a significant predictor of the price of used Hyundai Elantras, even after accounting for age?
A) Yes
B) No
Diff: 2 Type: MC Var: 1
L.O.: 10.1.2
19) Use the provided output to determine how many cars were in the sample.
A) 22
B) 23
C) 24
D) 25
Diff: 2 Type: BI Var: 1
L.O.: 10.1.0
20) A boxplot of the residuals and a scatterplot of the residuals versus the predicted values from the two predictor model are provided. Discuss whether the conditions for a multiple linear regression are reasonable by referring to the appropriate plots.
There is no curved or fanning pattern in the scatterplot of the residuals versus the predicted values, indicating that both the linearity and consistent variability conditions are reasonably satisfied.
There should be no major concerns about using multiple regression with these data.
Diff: 2 Type: ES Var: 1
L.O.: 10.2.1;10.2.2
21) Regression output for the model that only uses Age as a predictor in the model is provided. Assuming that the residuals for this single predictor model do not indicate any problems, is this model an improvement over the model that uses both Age and Mileage as predictors? Statistically justify your answer by discussing at least two quantitative criteria.
The regression equation is Price = 15.3 - 1.71 Age
Predictor | Coef | SE Coef | T | P |
Constant | 15.2912 | 0.5840 | 26.18 | 0.000 |
Age | -1.7126 | 0.1264 | -13.55 | 0.000 |
S = 1.37179 R-Sq = 88.9% R-Sq(adj) = 88.4%
Analysis of Variance
Source | DF | SS | MS | F | P |
Regression | 1 | 345.61 | 345.61 | 183.66 | 0.000 |
Residual Error | 23 | 43.28 | 1.88 | | |
Total | 24 | 388.89 | | | |
Yes, this model is an improvement over the model that uses both Age and Mileage because
1) There are no insignificant predictors in the model.
2) The R² of this new model has barely changed (88.9% compared to 89.0% for the two-predictor model).
3) The residual standard error is lower for this model (1.372 compared to 1.394 for the two-predictor model).
4) The F-statistic from the ANOVA test is much larger for this model (183.66 compared to 89 for the two-predictor model).
5) The adjusted R² is larger for this model (88.4% compared to 88.0% for the two-predictor model).
Diff: 2 Type: ES Var: 1
L.O.: 10.3.1
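The adjusted R² values quoted in this comparison can be reproduced from the two ANOVA tables. The sketch below assumes the standard definition, adjusted R² = 1 − (SSE / residual df) / (SSTotal / total df); the function name `adjusted_r2` is an illustrative assumption.

```python
def adjusted_r2(ss_residual, df_residual, ss_total, df_total):
    """Adjusted R-squared: 1 - (residual mean square) / (total mean square)."""
    return 1 - (ss_residual / df_residual) / (ss_total / df_total)

# Two-predictor model (Mileage and Age).
adj_two = adjusted_r2(ss_residual=42.78, df_residual=22, ss_total=388.89, df_total=24)

# Single-predictor model (Age only).
adj_age_only = adjusted_r2(ss_residual=43.28, df_residual=23, ss_total=388.89, df_total=24)

print(f"{adj_two:.1%} vs {adj_age_only:.1%}")
```

Dropping Mileage raises the residual sum of squares only slightly while returning one degree of freedom to the residuals, which is why adjusted R² goes up even though R² goes down a little.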
Use the following to answer the questions below:
A quantitatively savvy, young couple is interested in purchasing a home in northern New York. They collected data on houses that had recently sold in the two towns they are considering. The variables they collected are the selling price of the home (in thousands of dollars), the size of the home (in square feet), the age of the home (in years), and the town in which the house is located (coded 1 = Canton and 0 = Potsdam). Output from their multiple regression analysis is provided.
The regression equation is
Price (in thousands) = 69.2 + 0.0627 Size (sq. ft.) - 0.632 Age + 1.6 Town
Predictor | Coef | SE Coef | T | P |
Constant | 69.23 | 25.10 | 2.76 | 0.008 |
Size (sq. ft.) | 0.06267 | 0.01024 | 6.12 | 0.000 |
Age | -0.6319 | 0.1328 | -4.76 | 0.000 |
Town | 1.65 | 12.15 | 0.14 | 0.893 |
S = 40.0763 R-Sq = 59.3% R-Sq(adj) = 56.5%
Analysis of Variance
Source | DF | SS | MS | F | P |
Regression | 3 | 102936 | 34312 | 21.36 | 0.000 |
Residual Error | 44 | 70669 | 1606 | | |
Total | 47 | 173605 | | | |
22) One of the houses they are considering is a 92-year-old, 1,742 square foot house in Canton. What is the predicted selling price of this house? Round to three decimal places.
Diff: 2 Type: SA Var: 1
L.O.: 10.1.1
23) One of the houses they are considering is a 62-year-old, 1,865 square foot house in Potsdam. What is the predicted selling price of this house? Round to three decimal places.
Diff: 2 Type: SA Var: 1
L.O.: 10.1.1
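For the two house predictions above, the only extra step is translating the town into its 0/1 code before plugging it into the equation. A minimal sketch using the rounded coefficients from the printed equation (the function name `predict_price` is an illustrative assumption):

```python
def predict_price(size_sqft, age_years, in_canton):
    """Predicted selling price in thousands of dollars."""
    town = 1 if in_canton else 0   # coded 1 = Canton, 0 = Potsdam
    return 69.2 + 0.0627 * size_sqft - 0.632 * age_years + 1.6 * town

canton_house = predict_price(size_sqft=1742, age_years=92, in_canton=True)
potsdam_house = predict_price(size_sqft=1865, age_years=62, in_canton=False)

print(f"Canton: {canton_house:.3f}, Potsdam: {potsdam_house:.3f}")
```

For a Potsdam house the Town term contributes nothing, so the prediction depends only on size and age.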
24) Interpret the coefficient of Age in context.
Diff: 2 Type: ES Var: 1
L.O.: 10.1.1
25) Interpret the coefficient of Town in context.
Diff: 2 Type: ES Var: 1
L.O.: 10.1.1;10.3.2
26) How many houses are used in this dataset?
A) 48
B) 47
C) 46
D) 45
Diff: 2 Type: BI Var: 1
L.O.: 10.1.0
27) Interpret R² for this model.
Diff: 2 Type: ES Var: 1
L.O.: 10.1.4
28) Using α = 0.05, is the model effective according to the ANOVA test? Include all details of the test.
H₀: β₁ = β₂ = β₃ = 0 (The model is ineffective and all predictors could be omitted.)
Hₐ: At least one βᵢ ≠ 0 (At least one of the predictors in the model is effective.)
F = 21.36
p-value ≈ 0
There is very strong evidence that at least one of the predictors used in this model is effective for explaining the selling prices of homes in this area.
Diff: 2 Type: ES Var: 1
L.O.: 10.1.3
29) Which predictors are significant at the 5% level?
A) Size and Age
B) Size
C) Age
D) Size, Age, and Town
Diff: 2 Type: BI Var: 1
L.O.: 10.1.2
30) A dotplot of the residuals and a scatterplot of the residuals versus the predicted values are provided. Discuss whether the conditions for a multiple linear regression are reasonable by referring to the appropriate plots.
The scatterplot of the residuals versus the predicted values displays no patterns (curve or fanning) indicating that there are no problems with the linearity or consistent variability conditions.
Diff: 2 Type: ES Var: 1
L.O.: 10.2.1;10.2.2
31) Regression output for a model that omits Town as a predictor is provided. Assuming that the residuals for this reduced model do not indicate any problems with using multiple regression, is this model an improvement over the model that uses Size, Age, and Town as predictors? Statistically justify your answer by discussing at least two quantitative criteria.
The regression equation is
Price (in thousands) = 70.6 + 0.0624 Size (sq. ft.) - 0.635 Age
Predictor | Coef | SE Coef | T | P |
Constant | 70.56 | 22.84 | 3.09 | 0.003 |
Size (sq. ft.) | 0.062440 | 0.009994 | 6.25 | 0.000 |
Age | -0.6350 | 0.1294 | -4.91 | 0.000 |
S = 39.6368 R-Sq = 59.3% R-Sq(adj) = 57.5%
Analysis of Variance
Source | DF | SS | MS | F | P |
Regression | 2 | 102907 | 51453 | 32.75 | 0.000 |
Residual Error | 45 | 70698 | 1571 | | |
Total | 47 | 173605 | | | |
1) There are no insignificant predictors in the model.
2) The R² does not change at all (59.3% in both models).
3) The residual standard error is lower in the two-predictor model (39.64 versus 40.08 in the three-predictor model).
4) The F-statistic is larger in the two-predictor model (32.75 versus 21.36 in the three-predictor model).
5) The adjusted R² is larger in the two-predictor model (57.5% versus 56.5% in the three-predictor model).
Diff: 2 Type: ES Var: 1
L.O.: 10.3.1
Use the following to answer the questions below:
A small university is concerned with monitoring the electricity usage in its Student Center, and its officials want to better understand what influences the amount of electricity used on a given day. They collected data on the amount of electricity used in the Student Center each day and the daily high temperature for nearly a year. They also made note of whether each day was a weekend or not (1 = Saturday/Sunday and 0 = Monday - Friday). Regression output is provided.
Helpful notes: 1) Electricity usage is measured in kilowatt hours, 2) During the cold months, the Student Center is heated by gas, not electricity, and 3) Air conditioning the building during the warm months does use electricity.
The regression equation is Electricity = 83.6 + 0.529 High Temp - 25.2 Weekend
Predictor | Coef | SE Coef | T | P |
Constant | 83.560 | 4.238 | 19.72 | 0.000 |
High Temp | 0.52918 | 0.07020 | 7.54 | 0.000 |
Weekend | -25.168 | 3.724 | -6.76 | 0.000 |
S = 29.8162 R-Sq = 24.7% R-Sq(adj) = 24.2%
Analysis of Variance
Source | DF | SS | MS | F | P |
Regression | 2 | 90481 | 45241 | 50.89 | 0.000 |
Residual Error | 310 | 275592 | 889 | | |
Total | 312 | 366073 | | | |
32) Predict the amount of electricity used on a Monday with a high temperature of 62°F. Use one decimal place in your answer.
A) 116.4 kilowatt hours
B) 91.2 kilowatt hours
C) 32.8 kilowatt hours
D) 141.6 kilowatt hours
Diff: 2 Type: BI Var: 1
L.O.: 10.1.1
33) Predict the amount of electricity used on a Saturday with a high temperature of 68°F. Use one decimal place in your answer.
A) 94.4 kilowatt hours
B) 119.6 kilowatt hours
C) 58.9 kilowatt hours
D) 92.4 kilowatt hours
Diff: 2 Type: BI Var: 1
L.O.: 10.1.1
34) Interpret the coefficient of High Temp in context.
Diff: 2 Type: ES Var: 1
L.O.: 10.1.1
35) Interpret the coefficient of Weekend in context.
Diff: 2 Type: ES Var: 1
L.O.: 10.1.1;10.3.2
36) How many days are included in the sample?
A) 365
B) 311
C) 312
D) 313
Diff: 2 Type: BI Var: 1
L.O.: 10.1.0
37) Interpret R² for this model.
Diff: 2 Type: ES Var: 1
L.O.: 10.1.4
38) Is the model effective according to the ANOVA test? Use α = 0.05. Include all details of the test.
H₀: β₁ = β₂ = 0 (or The model is ineffective and all predictors could be removed.)
Hₐ: At least one βᵢ ≠ 0 (or At least one of the predictors in the model is effective.)
F = 50.89
p-value ≈ 0
There is very strong evidence that at least one predictor in the model is effective for explaining electricity use at the university's Student Center.
Diff: 2 Type: ES Var: 1
L.O.: 10.1.3
39) Which predictors are significant at the 5% level? What are their p-values?
Diff: 2 Type: ES Var: 1
L.O.: 10.1.2
40) Another possible predictor they recorded was the average temperature over the course of each day. Regression output for the model that uses High Temp, Weekend, and Avg. Temp is provided. Explain why these results differ so drastically from those for the two-predictor model.
The regression equation is
Sullivan Student Center = 81.9 + 0.839 High Temp - 25.1 Weekend - 0.337 Avg. Temp
Predictor | Coef | SE Coef | T | P |
Constant | 81.881 | 4.837 | 16.93 | 0.000 |
High Temp | 0.8389 | 0.4351 | 1.93 | 0.055 |
Weekend | -25.053 | 3.730 | -6.72 | 0.000 |
Avg. Temp | -0.3372 | 0.4673 | -0.72 | 0.471 |
S = 29.8393 R-Sq = 24.8% R-Sq(adj) = 24.1%
Analysis of Variance
Source | DF | SS | MS | F | P |
Regression | 3 | 90945 | 30315 | 34.05 | 0.000 |
Residual Error | 309 | 275129 | 890 | | |
Total | 312 | 366073 | | | |
Diff: 3 Type: ES Var: 1
L.O.: 10.3.3
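One way to quantify how the results differ is to compare the High Temp row across the two outputs. This is only a sketch of that comparison, not the test bank's answer; the variable names are illustrative assumptions.

```python
# High Temp row in the two-predictor model (High Temp, Weekend).
coef_two, se_two = 0.52918, 0.07020

# High Temp row in the three-predictor model (adds Avg. Temp).
coef_three, se_three = 0.8389, 0.4351

t_two = coef_two / se_two          # test statistic = coefficient / standard error
t_three = coef_three / se_three
se_ratio = se_three / se_two       # how much the standard error grew

print(f"t: {t_two:.2f} -> {t_three:.2f}, SE grew by a factor of {se_ratio:.1f}")
```

The coefficient itself does not shrink, but its standard error grows roughly sixfold, which is what pulls the t-statistic down. That pattern is what one would expect if High Temp and Avg. Temp carry overlapping information, although the output alone does not report their correlation.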
41) A histogram of the residuals and a scatterplot of the residuals versus the predicted values are provided. Discuss whether the conditions for a multiple linear regression are reasonable by referring to the appropriate plots.
However, there appears to be a "bend" in the scatterplot of the residuals versus the predicted values, indicating that the linearity condition is not satisfied. There doesn't seem to be an obvious problem with the consistent variability condition. Overall, we should be concerned that this multiple regression model is not appropriate.
Diff: 2 Type: ES Var: 1
L.O.: 10.2.1;10.2.2
Use the following to answer the questions below:
Is there such a thing as a "home court/field advantage"? The number of points scored and whether or not it was a home game are available for a sample of games played by the Boston Celtics during the regular season. The Home variable is coded as 1 = home game and 0 = away game.
The regression equation is Points Scored = 102 - 8.76 Home
Predictor | Coef | SE Coef | T | P |
Constant | 102.091 | 3.842 | 26.57 | 0.000 |
Home | -8.758 | 5.728 | -1.53 | 0.144 |
S = 12.7430 R-Sq = 11.5% R-Sq(adj) = 6.6%
Analysis of Variance
Source | DF | SS | MS | F | P |
Regression | 1 | 379.6 | 379.6 | 2.34 | 0.144 |
Residual Error | 18 | 2922.9 | 162.4 | | |
Total | 19 | 3302.5 | | | |
42) How many points are the Celtics predicted to score in a home game? Round to one decimal place.
A) 93.2 points
B) 110.8 points
C) 94.0 points
D) 111.8 points
Diff: 2 Type: BI Var: 1
L.O.: 10.1.1;10.3.2
43) How many points are the Celtics predicted to score in an away game? Round to one decimal place.
A) 102.0 points
B) 101.0 points
C) 93.2 points
D) 110.8 points
Diff: 2 Type: BI Var: 1
L.O.: 10.1.1;10.3.2
44) Interpret the R² for this model.
Diff: 2 Type: ES Var: 1
L.O.: 10.1.4
45) Using α = 0.05, is there a difference in the number of points scored for home and away games? Include all details of the test.
H₀: β₁ = 0 (or Home/away is not effective for explaining the number of points scored by the Boston Celtics.)
Hₐ: β₁ ≠ 0 (or Home/away is effective for explaining the number of points scored by the Boston Celtics.)
t = -1.53
p-value = 0.144
There is no evidence that home/away is effective for explaining the number of points scored by the Boston Celtics.
Since there is only one predictor in the model, the ANOVA F test could have been used instead of the t-test.
Diff: 2 Type: ES Var: 1
L.O.: 10.1.2;10.1.3
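The remark that the F-test could replace the t-test rests on the fact that, with a single predictor, the ANOVA F-statistic equals the square of the slope's t-statistic, so the two tests share the same p-value. A quick numerical check using the values in the output (variable names are illustrative assumptions):

```python
# Slope row: coefficient and standard error for Home.
coef_home, se_home = -8.758, 5.728
t_stat = coef_home / se_home

# ANOVA table: F = MS(Regression) / MS(Residual Error).
f_stat = 379.6 / 162.4

print(f"t = {t_stat:.2f}, t^2 = {t_stat**2:.2f}, F = {f_stat:.2f}")
```

Both routes give about 2.34, matching the F value in the table up to rounding.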
Use the following to answer the questions below:
Does the price of used cars depend upon the model? Data were collected on the selling price and age of used Hyundai Elantras (coded as Model = 1) and Toyota Camrys (coded as Model = 0). Output from the multiple regression analysis is provided.
The regression equation is Price = 14.5 - 0.619 Age - 3.63 Model
Predictor | Coef | SE Coef | T | P |
Constant | 14.4648 | 0.7059 | 20.49 | 0.000 |
Age | -0.61922 | 0.04903 | -12.63 | 0.000 |
Model | -3.6343 | 0.7584 | -4.79 | 0.000 |
S = 2.63465 R-Sq = 69.3% R-Sq(adj) = 68.4%
Analysis of Variance
Source | DF | SS | MS | F | P |
Regression | 2 | 1142.13 | 571.06 | 82.27 | 0.000 |
Residual Error | 73 | 506.72 | 6.94 | | |
Total | 75 | 1648.85 | | | |
46) What is the predicted price of a 6-year-old Hyundai Elantra? Round to three decimal places.
Diff: 2 Type: SA Var: 1
L.O.: 10.1.1
47) What is the predicted price of a 6-year-old Toyota Camry? Round to three decimal places.
Diff: 2 Type: SA Var: 1
L.O.: 10.1.1
48) Interpret the coefficient of Model in context.
Diff: 2 Type: ES Var: 1
L.O.: 10.1.1;10.3.2
49) Interpret R² for this model.
Diff: 2 Type: ES Var: 1
L.O.: 10.1.4
50) Is the model effective according to the ANOVA test? Use α = 0.05. Include all details of the test.
H₀: β₁ = β₂ = 0 (or The model is ineffective and all predictors could be omitted.)
Hₐ: At least one βᵢ ≠ 0 (or At least one of the predictors in the model is effective.)
F = 82.27
p-value ≈ 0
There is very strong evidence that at least one of the predictors is effective for explaining the price of used cars.
Diff: 2 Type: ES Var: 1
L.O.: 10.1.3
51) Which predictors are significant at the 5% level? What are their p-values?
Diff: 2 Type: ES Var: 1
L.O.: 10.1.2
52) A histogram of the residuals and a scatterplot of the residuals versus the predicted values are provided. Discuss whether the conditions for a multiple linear regression are reasonable by referring to the appropriate plots.
There is a clear curved pattern in the residuals (and possible fanning pattern), indicating that the linearity condition, and possibly the consistent variability condition, are problematic.
Diff: 2 Type: ES Var: 1
L.O.: 10.2.1;10.2.2
© 2021 John Wiley & Sons, Inc. All rights reserved. Instructors who are authorized users of this course are permitted to download these materials and use them in connection with the course. Except as permitted herein or by law, no part of these materials should be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise.