Testbank

Chapter 1: Research and statistics

1. When employing inferential statistics, which tradition within the theory of science do we adhere to?

a) Constructivism

b) Positivism

c) Hermeneutics

d) Interpretivism

2. What is the positivist assumption?

a) Observations and experience depend on the perspective of the observer

b) The patterns of interest are a product of our own making

c) The world consists of regularities that can be measured and explained

d) You cannot acquire knowledge from studying the world

3. In a normal distribution, what percentage of the observations fall within 1.96 standard deviations from the mean?

a) 90%

b) 95%

c) 97.5%

d) 99%
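
As a quick numerical check for question 3, the share of a normal distribution lying within a given number of standard deviations of the mean can be computed from the standard normal cumulative distribution function. The sketch below uses Python's standard library rather than Stata, and the function name `share_within` is only illustrative.

```python
from statistics import NormalDist


def share_within(z: float) -> float:
    """Share of a normal distribution within z standard deviations of the mean."""
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    # Area between -z and +z under the standard normal curve.
    return std_normal.cdf(z) - std_normal.cdf(-z)


# Print the share for a few commonly used cut-offs.
for z in (1.0, 1.645, 1.96, 2.576):
    print(f"within {z} SD of the mean: {share_within(z):.4f}")
```

Because the normal distribution is symmetric, the share outside the interval is split equally between the two tails.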

4. Within probability theory, what do probability (or p-) values tell us?

a) The probability of being wrong when we confirm a null hypothesis

b) The probability of being correct when we reject a null hypothesis

c) The probability of being mistaken when we reject a null hypothesis

d) The probability of being right when we confirm a null hypothesis

5. When you are investigating a full population, you are generalizing within?

a) Stochastic model theory

b) Probability theory

c) Statistical theory

d) The law of large numbers

Chapter 2: Introduction to Stata

1. Which of the following do we use to type in commands in Stata?

a) Command window

b) Review window

c) Variables window

d) Do-file editor

2. Which of the following is the command to open a dataset?

a) pwd

b) describe

c) use

d) open

3. Which of the following is the command that will give us the mean of a variable?

a) describe

b) codebook

c) sum

d) mean

4. Which of the following lines of code are wrong?

a) gen age2==age*age

b) gen age2=age*age

c) keep if gender==2

d) keep if gender=2

5. Which of the following commands is used for combining datasets based on observations?

a) merge

b) list

c) reshape

d) append

Chapter 3: Simple (bivariate) regression

1. What does a simple linear regression analysis examine?

a) The relationship between only two variables

b) The relationship between one dependent and one independent variable

c) The relationship between many variables

d) The relationship between two dependent and one independent variable

2. Which of the following are correct?

a) The intercept/constant (β₀) is the mean-Y when X = 0

b) The intercept is the amount of change in mean-Y when X = 0

c) The coefficient is the mean-Y at a certain value of X

d) The coefficient (β₁) is the amount of change in mean-Y for every unit increase in X

3. What does the least squares method do exactly?

a) Minimizes the distance between data points

b) Finds the least problematic regression line

c) Finds those (best) values of the intercept and slope that provide us with the smallest value of the residual sum of squares

d) Finds those (best) values of the intercept and slope that provide us with the smallest value of the sum of residuals
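
To make the idea behind question 3 concrete, the following sketch fits a bivariate regression line with the closed-form least squares formulas and then compares its residual sum of squares (RSS) with that of a few slightly shifted lines. It is written in Python with made-up data, and the helper names `ols_fit` and `rss` are assumptions introduced for illustration only.

```python
def ols_fit(x, y):
    """Closed-form least squares estimates for the line y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx              # slope
    b0 = mean_y - b1 * mean_x   # intercept
    return b0, b1


def rss(x, y, b0, b1):
    """Residual sum of squares for a candidate line."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


# Made-up data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = ols_fit(x, y)
best_rss = rss(x, y, b0, b1)

# RSS of lines whose intercept or slope is nudged away from the estimates.
shifts = [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.05), (0.0, -0.05)]
shifted_rss = [rss(x, y, b0 + d0, b1 + d1) for d0, d1 in shifts]

print(f"intercept={b0:.3f}, slope={b1:.3f}, RSS={best_rss:.4f}")
print("RSS of shifted lines:", [round(v, 4) for v in shifted_rss])
```

Every shifted line has a larger RSS than the fitted line, which is what it means for the least squares estimates to be the "best" values of the intercept and slope.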

4. Which of the following measures is optimal for comparing the goodness of fit of competing regression models involving the same dependent variable?

a) The intercept

b) The coefficient

c) R-square

d) Standard deviation of the residuals

5. What does the following expression () mean?

a) Mean-Y changes as a result of a change in X

b) Mean-Y does not change as a result of change in X

c) Mean-Y value becomes 0 as a result of a change in X

d) Mean-Y value is equal to 0 when X = 0

Chapter 4: Multiple regression

1. What does a multiple linear regression analysis examine?

a) The relationship between more than one dependent and only one independent variable

b) The relationship between one or more than one dependent and only one independent variable

c) The relationship between one dependent and more than one independent variables

d) The relationship between more than one independent variables

2. What does the following expression () mean?

a) One of the independent variables is useful in predicting the dependent variable

b) Both the independent variables are useful in predicting the dependent variable

c) None of the independent variables are useful in predicting the dependent variable

d) There is a third independent variable predicting the dependent variable

3. Which of the following criteria is best suited for assessing the goodness of fit of a multiple regression model?

a) Adjusted R-squared

b) R-squared

c) The intercept

d) The coefficient

4. In which cases are the standardized coefficients suggested to be used to identify the relative importance of the independent variables in a multiple regression model?

a) When all the independent variables are measured using the same metric

b) When not all the independent variables are measured using the same metric

c) When the independent variables are measured using different metrics

d) When all the independent variables are measured using an ordinal scale ranging from 1 to 6

5. What is the post-estimation command that you can use after the regress command in Stata to compute the predicted mean-Y values of interest?

a) pcorr

b) esttab

c) margins

d) marginsplot

Chapter 5: Dummy-variable regression

1. What does a dummy-variable regression analysis examine?

a) The relationship between one continuous dependent and one continuous variable

b) The relationship between one categorical dependent and one continuous independent variable

c) The relationship between one continuous dependent and one categorical independent variable

d) The relationship between one continuous dependent and one dichotomous variable

2. Which of the following is incorrect?

a) Regression with one dummy variable (predictor) corresponds directly to an independent analysis of variance (ANOVA)

b) Regression with more than one dummy variable including a covariate corresponds directly to an independent analysis of covariance (ANCOVA)

c) Regression with more than one dummy variable (predictor) corresponds directly to an independent analysis of variance (ANOVA)

d) Regression with one dummy variable (predictor) corresponds directly to an independent t-test

3. Which of the following procedures can be used to compare the means of the included groups in a dummy-variable regression model?

a) Changing the reference group

b) Linear combination

c) Standardization

d) Not possible

4. Why is the number of dummy variables to be entered into the regression model always equal to the number of groups (g) minus 1 (g − 1)?

a) To avoid model misspecification

b) To increase the R-squared value

c) To avoid the situation of perfect multicollinearity

d) To control other variables in the model

5. How do we interpret a dummy variable coefficient?

a) The difference between two means

b) The difference between two coefficients

c) The difference between two R-squared values

d) None of these

Chapter 6: Interaction/moderation effects using regression

1. What exactly is an interaction/moderation effect?

a) When a third variable (X₁) and an independent variable (X₂) affect the dependent variable (Y) simultaneously

b) When a third variable (X₁) reduces the effect of an independent variable (X₂) on the dependent variable (Y)

c) When a third variable (X₁) affects the relation between an independent variable (X₂) and the dependent variable (Y)

d) When a third variable (X₁) affects an independent variable (X₂) but not the dependent variable (Y)

2. What is a product term the result of?

a) Multiplying two variables

b) Taking the ratio of two variables

c) Subtracting one variable from another

d) Adding two variables

3. What is a simple main (conditional) effect?

a) The effect of X₂ on X₁ when Y is equal to 0

b) The effect of X₁ on X₂ when Y is equal to 0

c) The effect of X₂ on Y when X₁ is equal to 0

d) The effect of X₁ on Y when X₂ is equal to 0

4. When we estimate an interaction model with the mean-centered variables (X₁ and X₂), what will the coefficients on these predictors reflect?

a) The slope on X₁ for those having the mean score on X₂

b) The slope on X₂ for those having the mean score on X₁

c) The slope on X₂ and X₁ for those having the mean score on Y

d) None of these

5. Which of the following commands in Stata can we use to compute the simple main effects after the estimation of an interactive regression model?

a) margins

b) margins, dydx()

c) predict

d) compute dydx()

Chapter 7: Linear regression assumptions and diagnostics

1. Which one of these statements is not a Gauss–Markov assumption?

a) The error term has a conditional mean of 0

b) Influential observations are absent

c) The error term has constant variance

d) The errors are uncorrelated

2. Why should we not include irrelevant variables in our regression analysis?

a) Your R-squared will become too high

b) Because of data limitations

c) It is bad academic fashion not to base your variables on sound theory

d) We increase the risk of producing false significant results

3. How can we deal with the breach of the assumption about linearity?

a) Include a squared term

b) Include an interaction term

c) Use robust regression

d) Use the margins command

4. What is the best way to find the exact top or bottom point of a squared effect?

a) Through differentiation, using the values of the two coefficients

b) Excluding the squared term and predicting

c) Including the squared term and predicting

d) Graphing the results and comparing the top/bottom point with the value on the X-axis
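
For a model with a squared term, Y = b0 + b1·X + b2·X², setting the first derivative b1 + 2·b2·X to zero gives the turning point X* = −b1 / (2·b2). The Python sketch below applies this formula; the coefficient values are hypothetical and do not come from any real model, and `turning_point` is an illustrative name.

```python
def turning_point(b1: float, b2: float) -> float:
    """X-value where b0 + b1*X + b2*X**2 reaches its top or bottom point."""
    if b2 == 0:
        raise ValueError("without a squared term (b2 = 0) there is no turning point")
    return -b1 / (2 * b2)


def predict(b0: float, b1: float, b2: float, x: float) -> float:
    """Predicted Y for a given X in the quadratic model."""
    return b0 + b1 * x + b2 * x ** 2


# Hypothetical coefficients: positive linear term, negative squared term,
# so the curve is an inverted U with a top point.
b0, b1, b2 = 10.0, 0.8, -0.01

x_star = turning_point(b1, b2)
print(f"turning point at X = {x_star:.2f}")
print(f"predicted Y there   = {predict(b0, b1, b2, x_star):.2f}")
```

A negative b2 gives a top point and a positive b2 gives a bottom point. Reading the value off a graph only approximates what the formula gives exactly.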

5. Name another way of modelling nonlinearity

a) Using the linktest command

b) Using an interaction term

c) Using dummy variables

d) Using a bivariate regression model

6. Which statistic(s) can help us detect multicollinearity?

a) Variance inflation factor (VIF)

b) F-statistic

c) Durbin–Watson

d) Tolerance values (1/VIF)

7. What does heteroskedasticity mean?

a) The variance in the residuals is the same regardless of their predicted values

b) There is variance in the residuals

c) We are unable to produce residuals

d) The variance in the residuals differs depending on their predicted values

8. What are the two ways we can check for heteroskedasticity?

a) We can examine a plot of predicted values vs the residuals

b) We can run the Hausman test

c) We can run the hettest command

d) We can compare the F-test of two models

9. What does robust regression do?

a) Performs an OLS regression with more trustworthy standard errors

b) It gives a weight to each unit based on its distance from the mean of Y

c) Performs three types of regression analysis and presents the mean results

d) It gives a weight to each unit based on its total influence on the model

10. Which one is not a measure of influential (or potentially influential) observations?

a) Leverage

b) Cook–Weisberg

c) DFBETA

d) Cook’s distance

Chapter 8: Logistic regression

1. Who is considered to be the ‘inventor’ of logistic regression?

a) Thomas Malthus

b) Alphonse Quetelet

c) Pierre-François Verhulst

d) Karl Pearson

2. The logistic model is estimated by way of?

a) Ordinary least squares

b) Maximum likelihood estimation

c) Poisson distribution

d) Negative binomial distribution

3. In logistic regression, what do we estimate for a unit’s change in X?

a) The change in Y multiplied by Y

b) The change in Y from its mean

c) How much Y changes

d) How much the natural logarithm of the odds for Y = 1 changes

4. A total predicted logit of 0 can be transformed to a probability of?

a) 0

b) 0.5

c) 1

d) 0.75
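
The transformation asked about in question 4 is the inverse logit, p = 1 / (1 + e^(−logit)). A minimal Python sketch of the conversion in both directions follows; the function names are illustrative assumptions and the logit values are arbitrary.

```python
import math


def logit_to_probability(logit: float) -> float:
    """Inverse logit: convert a predicted log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-logit))


def probability_to_logit(p: float) -> float:
    """Logit: natural logarithm of the odds p / (1 - p), for 0 < p < 1."""
    return math.log(p / (1.0 - p))


# Arbitrary logit values, chosen only to show the shape of the mapping.
for logit in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(f"logit {logit:+.1f} -> probability {logit_to_probability(logit):.3f}")
```

Negative logits map to probabilities below one half and positive logits to probabilities above it, with the mapping symmetric around a logit of 0.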

5. The log likelihood is analogous to?

a) The t-test in OLS regression

b) The F-test in OLS regression

c) The standardized coefficient in OLS regression

d) The Wald test

6. With categorical variables, when all, or nearly all, units with a given X-value have the same value on Y, we call this a problem of?

a) Discrimination

b) Multicollinearity

c) Autocorrelation

d) Prediction

7. What makes the interpretation of conditional effects extra challenging in logistic regression?

a) It is not possible to model interaction effects in logistic regression

b) The results must be raised by its natural logarithm

c) The conditional effect is dependent on the values of all X-variables

d) The maximum likelihood estimation makes the results unstable

8. Which variant of logistic regression is recommended when you have a categorical dependent variable with more than two values?

a) Logistic regression

b) Multinomial logistic regression

c) Ordered logit regression

d) Poisson regression

Chapter 9: Survival analysis

1. In survival analysis, which technique do we use to deal with those subjects who do not experience the target event during the period of data collection?

a) Regression

b) Listwise deletion

c) Multiple imputation

d) Censoring

2. What information does the hazard function give us?

a) It shows the conditional probability that a subject will experience the event in a given time period

b) It shows the probability that a subject will survive past a given time period

c) It shows the likelihood that a subject will not experience the event in a given time period

d) It shows the probability that a subject will survive the period of data collection

3. When is a proportional hazard model (Cox) used?

a) When we wish to graph the survival function of our single event data

b) When we wish to investigate the effect of multiple covariates on the time for a certain event to take place

c) When we wish to test both time-invariant and time-varying covariates

d) When we wish to produce the cumulative survivor function

4. What is the PH-assumption?

a) The probability of Y = 0 does not exceed 0.25

b) The hazard ratio increases over time

c) The hazard ratio decreases over time

d) The ratio of the hazard for subjects with different values on covariates is the same across time periods

5. What is it called when there are several possible outcomes when moving from one state to another?

a) Observed event times

b) Multiple events

c) Competing risks

d) Cox paradox

Chapter 10: Multilevel analysis

1. What is the basic assumption of multilevel modelling?

a) The dependent variable is continuous

b) The expected value of Y can be modelled by a combination of unknown parameters

c) The units of analysis are measured over time

d) A unit at the lowest level is nested into a higher level of unit(s)

2. What is a level?

a) A level is a given variable chosen from your theoretical approach

b) A level is a variable that identifies units sampled from a population

c) A level can be any categorical variable in your dataset

d) A level can be any continuous variable in your dataset

3. How can we calculate how much of the variation in Y is situated at each level?

a) By dividing the log likelihood by the number of units in the respective levels

b) By running your full model and then calculating the intraclass correlation coefficient

c) By dividing the level-1 residual by each of the higher-level residuals

d) By running an empty model and then calculating the intraclass correlation coefficient
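
In a two-level model, the intraclass correlation coefficient (ICC) is the level-2 variance divided by the total variance (level-2 plus level-1). The Python sketch below computes it from two variance components; the numbers are hypothetical rather than output from a real model, and `intraclass_correlation` is an illustrative name.

```python
def intraclass_correlation(var_level2: float, var_level1: float) -> float:
    """Share of the total variance in Y that is situated at level 2."""
    total = var_level2 + var_level1
    if var_level2 < 0 or var_level1 < 0 or total == 0:
        raise ValueError("variance components must be non-negative and not both zero")
    return var_level2 / total


# Hypothetical variance components, as an intercept-only model might report them.
var_between_groups = 2.0   # level-2 (between-group) variance
var_within_groups = 6.0    # level-1 (within-group) residual variance

icc = intraclass_correlation(var_between_groups, var_within_groups)
print(f"ICC = {icc:.2f}")  # 2.0 / (2.0 + 6.0) = 0.25
```

With these made-up numbers, a quarter of the variation in Y lies between the level-2 units and the remainder lies within them.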

4. Which of the following are advantages of multilevel modelling?

a) It is the statistical method that comes closest to experiments in establishing causality

b) It takes into account the problem of dependency among observations

c) It allows us to model the influence of variables from all levels on our Y

d) It allows us to model risk development over time of an event taking place

5. What is a random coefficient (slope) model?

a) A multilevel model that allows both the intercept and coefficient(s) to vary

b) A multilevel model that allows the intercept to vary

c) A multilevel model with a fixed intercept and fixed coefficients

d) A multilevel model with statistical interaction

6. What is a cross-level interaction term?

a) A variable made by multiplying two variables situated at level-1

b) A variable made by multiplying two variables situated at level-2

c) A variable made up by log transforming any X-variable

d) A variable made by multiplying two variables situated at different levels

7. How many levels is it possible to model through multilevel modelling?

a) 1

b) 2

c) 3

d) There is no theoretical limit

8. How many identifier variables are required to run a three-level model?

a) 1

b) 2

c) 3

d) 4

9. When is it appropriate to use a cross-classified multilevel model?

a) When it is unclear which level-1 units belong to the different higher-level units

b) When level-1 units can be members of more than one higher-level unit at the same time

c) When the level-1 units can only be nested into level-3 units

d) If your regular model fails to converge

Chapter 11: Panel data analysis

1. What are panel data?

a) Data containing one unit measured at different time points

b) Data where each unit is measured at more than one time point

c) Data containing skewed variable distributions

d) Data measured at one point in time

2. What does it mean when we say that our panel is balanced?

a) When we have more time periods than units

b) When we have a large sample of units

c) When we have an equal number of time periods per unit

d) When we have an unequal number of time periods per unit

3. Which model takes into account the time-varying independent variables, the time-invariant independent variables and the unmeasured time-invariant variables?

a) Pooled OLS

b) Between effects

c) Fixed effects

d) Random effects

4. How are the time-invariant independent variables and the unmeasured time-invariant variables captured in a fixed effects model?

a) By running an ordinary least squares regression

b) By running an ordinary least squares regression with robust standard errors

c) By taking the mean of each variable for each unit across time, and running a regression on the collapsed dataset of means

d) By including unit dummy variables

5. What is the name of the statistical test that can help us determine whether to choose a fixed effects or a random effects model?

a) Hausman test

b) z-test

c) Chi-squared test

d) Link–Wallace test

6. Random effects is a weighted average between?

a) Pooled OLS and fixed effects

b) Pooled OLS and between effects

c) Between effects and fixed effects

d) Fixed effects and time-fixed effects

7. What characterizes time-series cross-section data?

a) When we have a large number of units recorded at few time points

b) When we have a small or medium sized number of units recorded at many time points

c) When we have a large number of units recorded at many time points

d) When we have a small or medium sized number of units recorded at few time points

8. What does non-stationarity mean?

a) Parameters of our data (such as the mean and variance) do not change over time

b) Parameters of our data (such as the mean and variance) do change over time

c) When time-series data are not influenced by their historical values

d) When time-series data are influenced by their historical values

9. What does lagging variables mean?

a) We use previous values of a variable

b) We use future values of a variable

c) We set all values to their mean across time

d) We log-transform the variable
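
To illustrate what lagging does to a series, the Python sketch below shifts the values of a single unit by k time periods, leaving the first k positions missing. The helper `lag` and the data are assumptions made for illustration; with panel data the shift would have to be applied separately within each unit.

```python
def lag(series, k=1):
    """Return the series lagged by k periods; the first k values become None."""
    if k < 0:
        raise ValueError("k must be non-negative")
    values = list(series)
    if k == 0:
        return values
    # Pad with k missing values at the start and drop the last k observations.
    return ([None] * k + values[:-k])[: len(values)]


# Made-up yearly values for one unit.
x = [10, 12, 15, 11]

print("x      :", x)
print("lag 1  :", lag(x, 1))   # [None, 10, 12, 15]
print("lag 2  :", lag(x, 2))   # [None, None, 10, 12]
```

Each lag costs one observation per unit, since the earliest time points have no previous value to draw on.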

Chapter 12: Time-series analysis

1. A long-term movement in the data (a systematic tendency for a series to either increase or decrease) is called a?

a) Trend

b) Cycle

c) Seasonality

d) Window

2. What is the definition of autocorrelation?

a) The data are hierarchically structured

b) There is absence of multicollinearity

c) The variance of the residuals is unequal over a range of measured values

d) The correlation of a series with its own lagged or future values

3. Which statistical test is commonly employed to detect autocorrelation?

a) Cook’s D

b) Durbin–Watson

c) Hausman

d) Huber–White

4. What is a stationary time series?

a) A series where the mean, variance and covariance do not change over time

b) A series where the mean, variance and covariance do change over time

c) A series where the probability distribution does not change over time

d) A series where the probability distribution does change over time

5. Which model can be used to include multiple time series?

a) ARIMA model

b) Vector autoregression

c) Unit root

d) Correlogram

Chapter 13: Exploratory factor analysis

1. Which of the following is not part of the exploratory factor analysis process?

a) Extracting factors

b) Determining the number of factors before the analysis

c) Rotating the factors

d) Refining and interpreting the factors

2. Which of the following statements is true?

a) The correlation matrix will have 1s in the diagonals in PCA and less than 1s in the EFA

b) The correlation matrix will have 1s in the diagonals both in PCA and EFA

c) The correlation matrix will have less than 1s in the diagonals both in PCA and EFA

d) The correlation matrix will have 1s in the diagonals in EFA and less than 1s in PCA

3. Which of the following criteria cannot be used to determine the number of factors in an EFA?

a) Asking a group of researchers before the analysis

b) Eigenvalue rule

c) Scree test

d) Parallel analysis

4. Which of the following is not an oblique rotation technique?

a) Promax

b) Oblimax

c) Varimax

d) Quartimin

5. What will a factor loading in an orthogonal solution represent?

a) Correlation

b) Partial correlation

c) Multiple correlation

d) Eigenvalue

Chapter 14: Structural equation modelling and confirmatory factor analysis

1. Which of the following is not a typical structural equation model?

a) Confirmatory factor analysis

b) Latent path analysis

c) Latent mean analysis

d) Exploratory factor analysis

2. Which of the following statements is wrong?

a) Confirmatory factor analysis (CFA) is a type of SEM

b) In CFA measurement error of indicators is removed during the estimation

c) Both in EFA and CFA we specify the pattern of indicator-factor loadings

d) CFA belongs to the common factor model family

3. Which of the following estimation methods are provided by Stata?

a) Maximum likelihood (ML)

b) Robust weighted least squares (WLSMV)

c) Maximum likelihood with missing values (MLMV)

d) Quasi-maximum likelihood (QML)

4. Which of the following criteria are used to assess the quality of a measurement model in SEM?

a) Fit indices

b) Size and significance of factor loadings

c) Discriminant validity

d) Latent path coefficients

5. Which of the following is not a typical model fit index used in SEM?

a) Root mean squared error of approximation (RMSEA)

b) Adjusted R-square

c) Comparative fit index (CFI)

d) Tucker–Lewis index (TLI)

Chapter 15: Advanced statistical techniques

1. If you have count data where the variance is smaller than the mean, but you have an excess of zero counts, which model should you choose?

a) Poisson regression

b) Zero-inflated Poisson regression

c) Negative binomial regression

d) Zero-inflated binomial regression

2. What is skewness?

a) A pointy distribution

b) A flat distribution

c) Lack of symmetry of a distribution

d) A platykurtic distribution

3. What happens if you raise a variable to a power (p) < 1?

a) We will reduce negative skew

b) The data will stay the same

c) We will reduce positive skew

d) We will make a flat distribution pointy
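
The effect of a power transformation on skewness can be checked numerically. The Python sketch below computes the moment-based skewness coefficient of a small, made-up, right-skewed variable before and after raising it to the power 0.5 (a square root); the data and the helper name `skewness` are assumptions for illustration only.

```python
def skewness(values):
    """Moment-based skewness: third central moment over variance to the power 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Made-up data with a long right tail (positive skew).
raw = [1, 1, 2, 2, 3, 4, 6, 9, 16, 36]

# Power transformation with p = 0.5, i.e. the square root.
transformed = [v ** 0.5 for v in raw]

print(f"skewness before: {skewness(raw):.2f}")
print(f"skewness after : {skewness(transformed):.2f}")
```

A power below 1 compresses large values more than small ones, so the long right tail is pulled in and the skewness coefficient moves towards zero.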

4. Which one of the following methods is recommended to use when running an analysis containing missing data?

a) Pairwise deletion

b) Multiple imputation

c) Single imputation

d) Dummy variable adjustment

5. What does multiple imputation do?

a) Removes any observation that is missing one or more of the variables in the model

b) Uses information from all values from other variables to predict values on variable(s) with low N

c) Calculations are based on all available data pairwise for all pairs of variables

d) Inserts a new value for all missing observations in a variable (e.g., 0 or the mean), as well as including a dummy variable coded 1 if the original data is missing

Chapter 16: Programming and dynamic reporting using Stata

1. Which of the following statements is wrong?

a) Macros can contain a number

b) Macros can contain text

c) Macros can be local or global

d) Local macros can be created in one do-file and used in another

2. Which of the following commands can extract stored objects after an estimation?

a) breturn list

b) return list

c) ereturn list

d) dreturn list

3. Which of the following are the main components of the dyndoc command?

a) Markdown language

b) Put commands

c) Dynamic tags

d) Stata commands

4. Which of the following is not a put command?

a) putdocx

b) putpdf

c) docx2pdf

d) dyntext

5. Which of the following statements is wrong?

a) putdocx is less flexible than dyndoc

b) putdocx is less verbose than dyndoc

c) dyndoc should be used for comprehensive dynamic reports

d) Markdown commands cannot be used by the put commands

Document Information

Document Type: DOCX

Chapter Number: All in one

Created Date: Aug 21, 2025

Chapter Name: Docx Test Bank Applied Stats 2e Test Bank

Author: Mehmet Mehmetoglu

Connected Book

Complete Test Bank | Applied Stats Using Stata 2e | Answers

By Mehmet Mehmetoglu
