Ch6 Exam Questions Correlation Vs. Causality In Regression - Predictive Analytics 1e Complete Test Bank by Jeff Prince

Predictive Analytics for Business Strategy, 1e (Prince)

Chapter 6 Correlation vs. Causality in Regression Analysis

1) The part of the outcome that we can explicitly determine, f_i(X_{1i}, X_{2i}, ..., X_{ki}), is known as the:

A) unbiased estimate.

B) determining function.

C) covariance.

D) partial correlation.

2) In the determining function for Sales given by Sales_i = α_0 + α_1 Price_i + U_i, what is the role of U_i?

A) Factors other than price that impact sales

B) Factors correlated with price that do not impact sales

C) Factors uncorrelated with price that do not impact sales

D) Pure measurement error in price

3) In the determining function for labor productivity given by LaborProd_i = α_0 + α_1 Age_i + α_2 Education_i + U_i, what is the role of U_i?

A) Factors other than age and education that affect labor productivity

B) Factors correlated with age but not education that do not impact labor productivity

C) Factors uncorrelated with age and education that do not impact labor productivity

D) The joint impact of age and education on labor productivity

4) Consider the following proposed determining function for the winning percentage of baseball teams: WinPct_i = 0.500 - 0.1 TeamERA_i + 0.2 TeamBA_i + U_i, where TeamERA is the team earned run average and TeamBA is the team batting average. What is the effect (change) on winning percentage from an increase in TeamERA from 2.3 to 3.3?

A) 0.510

B) 0.600

C) A decrease of 0.1 in winning percentage

D) There is not enough information
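
The change asked about in question 4 can be checked numerically. The following is a minimal sketch, assuming a hypothetical batting average of 0.260 and an error term of zero, both held fixed; these values are not part of the question and only serve to show that they cancel out of the difference.

```python
def win_pct(team_era, team_ba, u=0.0):
    """Proposed determining function from question 4."""
    return 0.500 - 0.1 * team_era + 0.2 * team_ba + u

# team_ba = 0.260 and u = 0 are hypothetical values, held fixed in both calls.
before = win_pct(team_era=2.3, team_ba=0.260)
after = win_pct(team_era=3.3, team_ba=0.260)

# Everything except the ERA term cancels, so the change is -0.1 * (3.3 - 2.3).
effect = after - before
print(round(effect, 3))
```

Because the function is linear in TeamERA, the same difference results for any fixed batting average and error term.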

5) Consider the following proposed determining function for the number of views on the New York Times home webpage on day i: NumViews_i = 1400 - 700 WeekendDay_i + 200 ElectionYear_i + U_i, where WeekendDay is a binary variable indicating whether day i is a Saturday or Sunday, and ElectionYear is a binary variable indicating whether day i falls in an election year. Derive the formula for the change in views when going from a weekday to a weekend day (holding ElectionYear constant).

A) Views increase by 200.

B) Views decrease by 200.

C) Views increase by 1400.

D) Views increase by 700.

6) Consider the following proposed determining function for a student's grade on the econometrics final: FinalGrade_i = 68 + 4 HoursStudied_i - 2 NumOtherFinals_i + U_i, where HoursStudied is the number of hours studied during finals week by student i, and NumOtherFinals is the number of other finals student i has during finals week. Derive the formula for the change in a student's final grade with respect to a unit change in the number of other finals the student has in finals week.

A) Final Grade equals 66.

B) Final Grade increases by 4.

C) Final Grade decreases by 2.

D) Final Grade increases by 2.

7) Consider the following proposed determining function for a student's grade on the econometrics final: FinalGrade_i = 68 + 4 HoursStudied_i - 0.5 NumOtherFinals_i^2 + U_i, where HoursStudied is the number of hours studied during finals week by student i, and NumOtherFinals^2 is the square of the number of other finals student i has during finals week. Derive the formula for the change in a student's final grade with respect to a unit change in the number of other finals the student has in finals week.

A) Final Grade decreases by 0.5.

B) Final Grade increases by 4.

C) Final Grade increases by 0.5 × (Num Other Finals).

D) Final Grade decreases by Num Other Finals.
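
With a squared term, the effect of one more final depends on how many finals the student already has. The sketch below approximates the marginal (derivative) effect in question 7 by a central difference; the 10 hours studied, the step size, and the function names are illustrative assumptions, not part of the question.

```python
def final_grade(hours_studied, num_other_finals, u=0.0):
    """Proposed determining function from question 7."""
    return 68 + 4 * hours_studied - 0.5 * num_other_finals ** 2 + u

def marginal_effect_of_finals(num_other_finals, hours_studied=10.0, eps=1e-4):
    """Central-difference approximation to d(FinalGrade)/d(NumOtherFinals)."""
    up = final_grade(hours_studied, num_other_finals + eps)
    down = final_grade(hours_studied, num_other_finals - eps)
    return (up - down) / (2 * eps)

# Derivative of -0.5 * n^2 is -n, so the slope at n = 3 is close to -3.
slope_at_3 = marginal_effect_of_finals(3)

# The discrete change from 3 to 4 finals is -0.5 * (16 - 9) = -3.5.
discrete_change = final_grade(10.0, 4) - final_grade(10.0, 3)
print(round(slope_at_3, 4), discrete_change)
```

The derivative and the one-unit discrete change differ slightly for a quadratic; the answer choices are phrased in terms of the derivative.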

8) If the determining function for Sales is given by Sales_i = α_0 + α_1 Price_i + U_i, what will the correlation between Sales and Price be?

A) α_0 + α_1

B) α_1

C) Cor(Price_i, U_i)

D) There is not enough information.

9) The distinction between causality and correlation is best described as:

A) correlation implies a change in one variable creates a change in another; causality measures co-movement.

B) causality implies positive amounts of co-movement; correlation implies negative amounts of co-movement.

C) causality implies a change in one variable creates a change in another; correlation implies variables move together.

D) None of the answers is correct.

10) Which of the following formulas is the correct calculation for the unconditional correlation between variables X and Y?

A)

B)

C)

D)

11) The correlation between X and Y holding at least one other variable constant is known as:

A) partial correlation.

B) unconditional correlation.

C) semi-partial correlation.

D) linear regression.

12) The correlation between X and Y holding at least one other variable constant for only X or only Y is known as:

A) partial correlation.

B) unconditional correlation.

C) semi-partial correlation.

D) linear regression.

13) To calculate the partial correlation of Coconut Milk household purchases CM_i and Salami Deli Meat household purchases S_i, controlling for household income Y_i (i.e., pCorr(CM_i, S_i; Y_i)), which of the following regressions will be run?

A) CM_i = b_0 + b_1 S_i + e_i

B) CM_i = b_0 + b_1 Y_i + e_i

C) S_i = b_0 + b_1 CM_i + e_i

D) Y_i = b_0 + b_1 S_i + e_i

14) To calculate the semi-partial correlation of Coconut Milk household purchases CM_i and Salami Deli Meat household purchases S_i, controlling for household income Y_i (i.e., pCorr(CM_i, S_i(Y_i))), which of the following regressions will be run?

A) CM_i = b_0 + b_1 S_i + e_i

B) CM_i = b_0 + b_1 Y_i + e_i

C) S_i = b_0 + b_1 CM_i + e_i

D) S_i = b_0 + b_1 Y_i + e_i

15) In calculating the partial correlation between two variables holding one other variable constant, how many sets of residuals will be created?

A) 1

B) 2

C) 3

D) None of the answers is correct.
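
The residual-based calculations referred to in questions 13 through 15 can be sketched in a few lines. This is an illustrative sketch only: the six households of data are made up, the helper names are hypothetical, and the semi-partial version residualizes only the salami variable, following the S_i(Y_i) notation in question 14.

```python
from math import sqrt

def mean(v):
    return sum(v) / len(v)

def residuals(y, x):
    """Residuals from the simple OLS regression y = b0 + b1*x + e."""
    mx, my = mean(x), mean(y)
    b1 = (sum((a - mx) * (b - my) for a, b in zip(x, y))
          / sum((a - mx) ** 2 for a in x))
    b0 = my - b1 * mx
    return [b - (b0 + b1 * a) for a, b in zip(x, y)]

def corr(a, b):
    """Sample (unconditional) correlation between two lists."""
    ma, mb = mean(a), mean(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    return cov / sqrt(sum((p - ma) ** 2 for p in a)
                      * sum((q - mb) ** 2 for q in b))

# Hypothetical data for six households (not from the test bank).
income = [20, 35, 50, 65, 80, 95]
coconut_milk = [1, 3, 2, 5, 4, 6]
salami = [2, 2, 4, 3, 6, 5]

# Partial correlation: remove income from BOTH variables (two sets of residuals).
e_cm = residuals(coconut_milk, income)
e_s = residuals(salami, income)
partial = corr(e_cm, e_s)

# Semi-partial correlation: remove income from salami only (one set of residuals).
semi_partial = corr(coconut_milk, e_s)

print(round(partial, 3), round(semi_partial, 3))
```

Each regression on income produces one set of residuals; correlating residuals rather than raw values is what "holding income constant" means in this calculation.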

16) Which of the following objects must be the same sign as the covariance of variables X and Y?

A) Unconditional correlation of X and Y.

B) Partial correlation of X and Y holding Z constant.

C) Semi-partial correlation of X and Y holding Z constant.

D) Semi-partial correlation of Y and X holding Z constant.

17) Suppose one runs the regression of Y on X1 and X2 and the coefficient on X2 is positive. Which of the following correlation conditions must hold in the sample?

A) Y and X1 have a positive unconditional correlation.

B) Y and X2 have a positive unconditional correlation.

C) The partial correlation between Y and X2 holding X1 constant must be positive.

D) The semi-partial correlation between Y and X2 holding X1 constant must be positive.

18) Suppose one runs the regression of Y on X1 and X2 and both coefficients on X1 and X2 are positive. All of the following correlation conditions must hold in the sample except for which one?

A) The semi-partial correlation between Y and X2 holding X1 constant must be positive.

B) The semi-partial correlation between Y and X1 holding X2 constant must be positive.

C) The partial correlation between Y and X2 holding X1 constant must be positive.

D) None of the answers is correct.

19) A consistent estimator is an estimator whose:

A) realized value gets close to the population parameter as sample size grows.

B) standard errors are small.

C) expected value is an unbiased estimate of the true population parameter.

D) efficiency is achieved in small samples.

20) If the residuals of a regression model, Y_i = B + M X_i + E_i, satisfy the condition that their variances are constant across all values of X, then they are said to be:

A) heteroscedastic.

B) homoscedastic.

C) exogenous.

D) endogenous.

21) If the residuals of a regression model, Y_i = B + M X_i + E_i, are such that their variances vary across all values of X, then they are said to be:

A) heteroscedastic.

B) homoscedastic.

C) exogenous.

D) endogenous.

22) Suppose that the following regression equation best describes the co-movement between Sales, Price and Number of Competitors: Sales_i = B + M_1 Price_i + M_2 NumComp_i. Which moment condition would not be used to yield a consistent estimate of B, M_1, M_2?

A) = 0

B) = 0

C) = 0

D) = 0

23) Which of the following conditions is necessary for the sample estimates of the population coefficients that best describe the co-movement amongst the variables to be consistent?

A) The sample is a random sample.

B) The errors are distributed normally.

C) The errors are homoscedastic.

D) The errors are heteroscedastic.

24) Which of the following conditions ensures that the estimates of the coefficients for the population regression equation are distributed normally?

A) The residuals are distributed normally.

B) The residuals are homoscedastic.

C) The residuals are heteroscedastic.

D) The sample is "large" enough.

25) If the t-stat for the sample estimate of a coefficient, M1 calculated as the following, t = , where m1 is the estimated coefficient and S is the properly estimated standard error for the coefficient, comes out to be 2.7, what is the appropriate conclusion?

A) Reject the null hypothesis that the population coefficient M1 = 0 at a 99% confidence level.

B) Fail to reject the null hypothesis that the population coefficient M1 = 1 at a 95% confidence level.

C) Reject the null hypothesis that the population coefficient M1 = 2 at a 99% confidence level.

D) None of these choices are correct.

26) If the t-stat for the sample estimate of a coefficient, M1 calculated as the following, t = , where m1 is the estimated coefficient and S is the properly estimated standard error for the coefficient, comes out to be 2.7, what is the appropriate conclusion?

A) Reject the null hypothesis that the population coefficient M1 = 1 at a 99% confidence level.

B) Fail to reject the null hypothesis that the population coefficient M1 = 1 at a 95% confidence level.

C) Reject the null hypothesis that the population coefficient M1 = 2 at a 99% confidence level.

D) None of these choices are correct.

27) Suppose the regression of Output per Hour on employee Age yields a coefficient of -4.3 with a standard error of 1.2. Which of the following equations would properly report the 95% confidence interval for the population coefficient?

A) -4.3 ± 1.96

B) 0 ± 1.96(1.2)

C) -4.3 ± 1.96(1.2)

D) -4.3 ± 1.65(1.2)
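
The interval in question 27 follows the usual form, estimate plus or minus a critical value times the standard error. Below is a small sketch of that arithmetic; the function name is a hypothetical helper, and 1.96 is the large-sample normal critical value for 95% confidence.

```python
def confidence_interval(coef, se, critical_value=1.96):
    """Return (lower, upper) for coef +/- critical_value * se."""
    margin = critical_value * se
    return coef - margin, coef + margin

# Coefficient and standard error from question 27.
lower, upper = confidence_interval(-4.3, 1.2)
print(round(lower, 3), round(upper, 3))  # -6.652 -1.948
```

The whole interval lies below zero, which is consistent with rejecting a null of zero at the same confidence level.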

28) In making passive predictions, it is usually sufficient to (consistently) estimate what sort of relationships?

A) Causal relationships

B) Determining functions

C) Co-movement or partial correlations

D) None of the answers is correct.

29) When making passive predictions, which type of relationship must you believe continues to hold when you move your estimation sample to your forecasting sample?

A) The (semi) partial correlations of your dependent and independent variables

B) The determining function

C) Random treatment assignment

D) Selection bias

30) When making passive predictions, it is not important to be able to conclude that:

A) your estimate of the coefficients is a consistent estimate.

B) your estimate describes a causal relationship between treatments and the outcome.

C) your estimates can be evaluated against a range of null hypotheses.

D) you are using a random sample of the target population.

31) Which of the following settings best describes a setting in which one would be making a passive prediction?

A) Predicting the rise in sales following an increase in the promotional activities of your firm.

B) Predicting the increase in labor productivity following the rollout of a mandatory HR workshop.

C) Predicting vehicle sales as a function of the daily max temperature and total rainfall.

D) Predicting the change in click-through rates following the change in banner size.

32) Suppose you estimate the following regression of a firm's Sales and number of employees at each location across the country: Sales = 95,342 + 0.76 Number of Employees. You are willing to believe you have a random sample of store locations, and a sufficiently large sample size. Which statements are not yet justified by the regression results?

A) On average, a store with 100 employees is likely to have Sales of about 95,342 + 0.76(100) = 95,418.

B) On average, stores with more sales have more employees.

C) Increasing the number of employees at a store by 2 will raise sales by 0.76 × 2 = 1.52

D) None of the answers is correct.

33) Suppose you estimate the following regression of a movie's ticket sales on the season of its initial release (Summer = 1 if released in May, June, July, or August; = 0 otherwise): Sales = 295,342 + 40.24 Summer. You are willing to believe you have a random sample of movies, and a sufficiently large sample size. Which statements are well justified by the regression results?

A) Movies released in the summer tend to have higher ticket sales.

B) Moving the scheduled release of a movie from the winter to the summer will increase its sales by 40.24.

C) Moving the Jurassic World release date to Christmas instead of 4th of July weekend, would have resulted in ticket sales of 295,342.

D) Promotional activities increase the number of movie tickets sold holding other factors fixed.

34) Why can't the regression coefficients in correlation analysis be interpreted as the causal effect of a treatment on the outcome?

A) Co-movement/correlation amongst two variables could be generated by their relationship to other variables.

B) Regression coefficients in correlation analysis cannot be consistent.

C) Regression coefficients in correlation analysis cannot be unbiased.

D) You cannot conduct hypothesis tests on coefficients in correlation analysis.

35) Suppose you estimate a regression of your firm's store sales on the number of local competitors as follows: Sales = 321,752 + 70.35 Number of Competitors. You are willing to believe you have a random sample of store locations, and a sufficiently large sample size. What is wrong with the following logic: "We need more competitors to enter the markets we're in, so that our sales will rise"?

A) The positive coefficient on number of competitors is not a causal estimate.

B) The positive coefficient on number of competitors isn't a consistent estimate.

C) The number of competitors might be measured with error.

D) You're making an active prediction.

36) Suppose you estimate a regression of each employee's number of contracts sold on their tenure (in years) at the company as follows: Contracts = 30.5 + 4.5 Tenure. You are willing to believe you have a random sample of employees, and a sufficiently large sample size. Given these results, is it appropriate to make the following claim: "Our more tenured employees at the company on average get awarded more contracts"?

A) No, this is an active prediction.

B) Yes, you're making a passive prediction.

C) Yes, you're making an active prediction.

D) No, you're making a passive prediction.

37) In characterizing determining functions, the "error term" represents:

A) unobserved factors that determine the outcome.

B) factors that correlate with the treatment.

C) deviations from random sampling.

D) deviations from random treatment assignment.

38) In the following determining function, Earnings_i = α_0 + α_1 Education_i + U_i, what might be a factor contained in U_i?

A) Predictions of earnings

B) Innate ability, which might be correlated with education and earnings

C) Proximity to a 4-year college, which is known to not impact earnings

D) Heteroscedasticity

39) If the determining function for the likelihood of a mortgage application receiving a loan is given by Loan Successful = 0.60 + 0.01 Credit Score, what would be the causal effect of increasing your credit score by 10 points?

A) Your expected probability of getting a loan would be 0.60 + 0.01(10) = 0.70.

B) Your expected probability of getting a loan would be 0.60

C) Your expected probability of getting a loan would increase by 0.01(10) = 0.10

D) Your expected probability of getting a loan would increase by 10.

40) Which of the following is not a condition to estimate a model for causality?

A) E[U_i] = E[U_i X_{1i}] = ... = E[U_i X_{ki}] = 0.

B) The sample is a random sample from the target population.

C) The error terms are distributed normally.

D) The data generating process can be described by a determining function such as Y_i = β_0 + β_1 X_{1i} + ... + β_k X_{ki} + U_i.

41) Beyond the conditions required for consistent estimation of a model for causality, in order to conduct inference what additional assumption is required?

A) The errors are distributed normally.

B) The errors are distributed according to the Student's t distribution.

C) The size of the sample is sufficiently large (e.g., 30(K + 1), where K is the number of coefficients).

D) None of the answers is correct.

42) In estimating the causal relationship between Sales and Price in the following determining function, Sales_i = β_0 + β_1 Price_i + U_i, what assumption, in addition to E[U_i] = 0, justifies interpreting the estimated coefficient on Price as an estimate of the causal effect of Price on Sales?

A) E[Price_i U_i] = 0

B) E[Sales_i U_i] = 0

C) E[Price_i U_i] = β_0

D) E[Price_i U_i] = β_1

43) The assumption of no correlation between the error term (U) and the treatment(s) (X) is similar to which aspect of the scientific method?

A) Random sample from the target population

B) Stating a question

C) Making sure to have a control group

D) Random assignment of the treatment

44) What are the two key assumptions necessary to establish causality from a regression model?

A) Random sample of participants and targeted assignment of the treatment

B) Random sample of participants and random assignment of the treatment

C) Targeted sample of participants and random assignment of the treatment

D) Targeted sample of participants and targeted assignment of the treatment

45) Suppose you've run a regression relating Revenues to TV Ads and Online Ads. You are willing to make the necessary assumptions to deduce causality and run hypothesis tests. Your results are as follows:

 

              Coefficients    Standard Error    t-Stat         P-value
Intercept     5988.043107     2202.136765       2.719196738    0.006881952
TV Ads        199.6320212     27.82658673       7.174146912    4.63154E-12
Online Ads    53.52092429     41.0818115        1.302788809    0.193533758

If you tested the null hypothesis that Online Ads have no impact on Revenues at the 90% confidence level (i.e., 90% degree of support), you would:

A) Reject, and conclude Online Ads do impact Revenues

B) Fail to reject, and conclude Online Ads do impact Revenues

C) Fail to reject, and conclude there is insufficient evidence to establish that Online Ads impact Revenues

D) None of the answers is correct
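
The t-Stat column in question 45 is the coefficient divided by its standard error, and the test compares it with a critical value. The sketch below reproduces that calculation for the Online Ads row. It uses the standard normal distribution as a large-sample approximation, which is an assumption made here: the reported P-value comes from a t distribution with degrees of freedom not given in the question, so the two differ slightly. The function names are hypothetical.

```python
from math import erfc, sqrt

def t_stat(coef, se, null_value=0.0):
    """t-statistic for the null hypothesis coef == null_value."""
    return (coef - null_value) / se

def two_sided_p_normal(t):
    """Two-sided p-value under a standard normal approximation."""
    return erfc(abs(t) / sqrt(2))

# Online Ads row from the regression output in question 45.
t_online = t_stat(53.52092429, 41.0818115)
p_online = two_sided_p_normal(t_online)

# 1.645 is the two-sided normal critical value for 90% confidence.
reject_at_90 = abs(t_online) > 1.645
print(round(t_online, 4), round(p_online, 3), reject_at_90)
```

Comparing the t-statistic with the critical value and comparing the p-value with 0.10 are two routes to the same decision.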

46) Which of the following settings best describes a setting in which one would be making an active prediction?

A) Predicting the wholesale price of electricity as a function of the average daily temperature.

B) Predicting the rate of defaults amongst mortgages and how it relates to local business conditions.

C) Predicting vehicle sales as a function of the daily max temperature and total rainfall.

D) Predicting the change in click-through rates following the change in banner size.

47) When making active predictions, it is important to be able to conclude that:

A) your estimate of the coefficients is a causal estimate.

B) your estimate describes a linear relationship between treatments and the outcome.

C) your estimates can be evaluated against a range of null hypotheses.

D) you are using a random sample of the target population.

48) Suppose you are comfortable with the assumptions required for causal analysis and you have estimated the relationship between Sales and running a promotion together with a price discount to be Sales_i = 140.3 (60) + 4.3 (0.7) Promo_i, with standard errors reported in parentheses. What would you predict to occur in the event of running a discount next week?

A) That it will increase sales, but you cannot reject the null hypothesis that it won't have any effect at the 90% confidence level.

B) That it will decrease sales, and that this has support at the 90% confidence level.

C) That it will increase sales by 4.3, and you are 90% confident that Promotions have some effect on Sales.

D) You are 90% confident that it will increase sales by 4.3.

49) The total sum of squares is given by the sum of:

A) the squares for the outcome (or dependent variable of your model).

B) squares for the treatment (or independent variable of your model).

C) squared residuals.

D) the squared difference between each observation Y and the average value for Y.

50) The R-squared of a regression is 1 - X, where X is the:

A) sum of squared residuals divided by the total sum of squares.

B) sum of squared predictions divided by the total sum of squares.

C) sum of squared residuals divided by the sum of squared predictions.

D) sum of squared predictions divided by the sum of squared residuals.
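
The quantities in questions 49 and 50 can be computed directly from observed outcomes and fitted values. The following sketch uses four made-up observations and made-up predictions (they do not come from any regression in the test bank) to show how the total sum of squares, the sum of squared residuals, and R-squared relate.

```python
def r_squared(y, y_hat):
    """R-squared = 1 - (sum of squared residuals / total sum of squares)."""
    y_bar = sum(y) / len(y)
    # Total sum of squares: squared deviations of each Y from the mean of Y.
    tss = sum((yi - y_bar) ** 2 for yi in y)
    # Sum of squared residuals: squared gaps between observed and predicted Y.
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    return 1 - ssr / tss

# Hypothetical outcomes and predictions.
y = [1.0, 2.0, 3.0, 4.0]
y_hat = [1.1, 1.9, 3.2, 3.8]

r2 = r_squared(y, y_hat)
print(round(r2, 2))  # 0.98
```

Here the total sum of squares is 5.0 and the sum of squared residuals is 0.10, so R-squared is 1 - 0.10/5.0 = 0.98.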

51) Under which sort of prediction is R-squared particularly informative of the value of the regression results?

A) Active prediction

B) Causal analysis

C) Treatment effects

D) Passive prediction

52) Suppose you are given a set of regression results from an experiment with random treatment assignment, measuring the effect of an ad campaign on consumers' willingness to buy your product. The regression results report an R-squared of 0.04. How should you value the regression results?

A) Not highly, the R-squared is far too low.

B) Highly, despite the low R-squared it is likely the effect estimated is causal and valuable for active prediction.

C) Highly, the R-squared value is still high enough to justify that the effect is causal.

D) Not highly, the implied correlation of the R-squared is too low.

Document Information

Document Type: DOCX
Chapter Number: 6
Created Date: Aug 21, 2025
Chapter Name: Chapter 6 Correlation Vs. Causality In Regression Analysis
Author: Jeff Prince
