Multiple Regression in Practice-SPSS

Variables Entered/Removed (a)
Model  Variables Entered                                        Variables Removed  Method
1      AGE OF RESPONDENT, HIGHEST YEAR OF SCHOOL COMPLETED (b)  .                  Enter
a. Dependent Variable: R’s socioeconomic index (2010)
b. All requested variables entered.
Model Summary (b)
Model  R         R Square  Adjusted R Square  Std. Error of the Estimate  Durbin-Watson
1      .599 (a)  .359      .358               17.9504                     1.955
a. Predictors: (Constant), AGE OF RESPONDENT, HIGHEST YEAR OF SCHOOL COMPLETED
b. Dependent Variable: R’s socioeconomic index (2010)
ANOVA (a)
Model          Sum of Squares  df    Mean Square  F        Sig.
1  Regression  435840.770      2     217920.385   676.314  .000 (b)
   Residual    778800.487      2417  322.218
   Total       1214641.257     2419
a. Dependent Variable: R’s socioeconomic index (2010)
b. Predictors: (Constant), AGE OF RESPONDENT, HIGHEST YEAR OF SCHOOL COMPLETED
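The Model Summary and ANOVA figures are linked by simple arithmetic, and a short check can confirm that the reported values are internally consistent. The sketch below is illustrative only: it uses the sums of squares and degrees of freedom copied from the ANOVA table, and the variable names are my own rather than SPSS output.

```python
import math

# Values copied from the ANOVA table; variable names are illustrative.
ss_regression = 435840.770
ss_residual = 778800.487
df_regression = 2      # number of predictors
df_residual = 2417     # n - predictors - 1

ss_total = ss_regression + ss_residual
n = df_regression + df_residual + 1  # number of cases

# Mean squares are sums of squares divided by their degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# Overall F test, R Square, Adjusted R Square, and Std. Error of the Estimate.
f_statistic = ms_regression / ms_residual
r_square = ss_regression / ss_total
adjusted_r_square = 1 - (1 - r_square) * (n - 1) / df_residual
std_error_estimate = math.sqrt(ms_residual)

print(f"F = {f_statistic:.3f}")
print(f"R Square = {r_square:.3f}")
print(f"Adjusted R Square = {adjusted_r_square:.3f}")
print(f"Std. Error of the Estimate = {std_error_estimate:.4f}")
```

Up to rounding, the results should agree with the R Square, Adjusted R Square, Std. Error of the Estimate, and F values that SPSS reports.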
Coefficients (a)
                                     Unstandardized Coefficients  Standardized Coefficients                 Collinearity Statistics
Model                                B        Std. Error          Beta                       t        Sig.  Tolerance  VIF
1  (Constant)                        -22.552  2.009                                          -11.227  .000
   HIGHEST YEAR OF SCHOOL COMPLETED  4.287    .120                .584                       35.863   .000  .999       1.001
   AGE OF RESPONDENT                 .193     .021                .148                       9.095    .000  .999       1.001
a. Dependent Variable: R’s socioeconomic index (2010)
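The unstandardized coefficients define a prediction equation, and the VIF is the reciprocal of the tolerance. The following is a minimal sketch of both, assuming the coefficients as printed in the Coefficients table. The function names and the example respondent (16 years of school, age 40) are hypothetical and chosen only for illustration.

```python
# Coefficients copied from the Coefficients table.
INTERCEPT = -22.552
B_EDUCATION = 4.287  # change in index per additional year of school completed
B_AGE = 0.193        # change in index per additional year of age


def predict_sei(years_of_school, age):
    """Predicted socioeconomic index from the fitted regression equation."""
    return INTERCEPT + B_EDUCATION * years_of_school + B_AGE * age


def vif_from_tolerance(tolerance):
    """Variance inflation factor is the reciprocal of tolerance."""
    return 1.0 / tolerance


# Hypothetical respondent, not a case from the dataset.
predicted = predict_sei(years_of_school=16, age=40)
print(f"Predicted index: {predicted:.2f}")

# One more year of school at the same age raises the prediction by B_EDUCATION.
difference = predict_sei(17, 40) - predict_sei(16, 40)
print(f"Effect of one more year of school: {difference:.3f}")

print(f"VIF for tolerance .999: {vif_from_tolerance(0.999):.3f}")
```

Holding one predictor fixed and changing the other by one unit recovers that predictor's B value, which is what "holding the other variable constant" means in practice.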
Collinearity Diagnostics (a)
                                               Variance Proportions
Model  Dimension  Eigenvalue  Condition Index  (Constant)  HIGHEST YEAR OF SCHOOL COMPLETED  AGE OF RESPONDENT
1      1          2.894       1.000            .00         .01                               .01
       2          .085        5.830            .02         .17                               .81
       3          .020        11.890           .97         .83                               .18
a. Dependent Variable: R’s socioeconomic index (2010)
Residuals Statistics (a)
                                   Minimum   Maximum  Mean    Std. Deviation  N
Predicted Value                    -18.302   80.195   46.004  13.4229         2420
Std. Predicted Value               -4.791    2.547    .000    1.000           2420
Standard Error of Predicted Value  .366      1.797    .605    .183            2420
Adjusted Predicted Value           -18.932   80.148   46.001  13.4316         2420
Residual                           -48.8160  79.8744  .0000   17.9430         2420
Std. Residual                      -2.719    4.450    .000    1.000           2420
Stud. Residual                     -2.722    4.470    .000    1.000           2420
Deleted Residual                   -48.9185  80.5965  .0027   17.9686         2420
Stud. Deleted Residual             -2.726    4.487    .000    1.001           2420
Mahal. Distance                    .007      23.254   1.999   2.177           2420
Cook’s Distance                    .000      .060     .000    .002            2420
Centered Leverage Value            .000      .010     .001    .001            2420
a. Dependent Variable: R’s socioeconomic index (2010)
  1. What is your research question? Do a respondent’s highest year of school completed and age predict the respondent’s socioeconomic index (2010)?
  2. Interpret the coefficients for the model, specifically commenting on the dummy variable. I estimated a multiple regression model with the respondent’s socioeconomic index (2010) as the dependent variable and two independent variables: highest year of school completed and age of respondent. The model contains no dummy variable, so the interpretation covers these two continuous predictors. The Coefficients table reports the contribution of each predictor. Holding age constant, each additional year of school completed is associated with a 4.287-point increase in the socioeconomic index (Beta = .584, t = 35.863, p < .001). Holding education constant, each additional year of age is associated with a .193-point increase (Beta = .148, t = 9.095, p < .001). The standardized coefficients show that education is the stronger of the two predictors. Another important consideration is the variance inflation factor (VIF), which measures the severity of multicollinearity in an ordinary least squares regression analysis (Warner, 2012). VIF values of 10 or above indicate serious multicollinearity. Here the VIF is 1.001 for both predictors (tolerance = .999), so the predictors are nearly uncorrelated with each other and multicollinearity is not a concern.
  3. Run diagnostics for the regression model. Does the model meet all of the assumptions? Be sure to comment on what assumptions were not met and the possible implications. Is there any possible remedy for one of the assumption violations? After reviewing the tables and plots for these variables, I found no clear violations of the regression assumptions, so the model appears to meet them. In the Model Summary table, the Durbin-Watson statistic, which tests the independence of errors (Laureate Education, 2016j), is 1.955. A value close to 2 indicates little or no correlation between adjacent residuals (Laureate Education, 2016j). The ANOVA table tests the overall statistical significance of the model. In this case, F(2, 2417) = 676.314 with a significance value of .000 (p < .001), so the null hypothesis is rejected at the conventional level of p < .05. Cook’s distance ranges from .000 to .060 in the Residuals Statistics table. A value of 1.0 or greater would signal an unduly influential case, so no single case exerts undue influence on the model. The scatterplot shows a roughly uniform spread of residuals, which is consistent with homoscedasticity, and it also supports a linear relationship (Wagner, 2016). The histogram of the regression standardized residuals shows no meaningful deviation from normality.
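The Durbin-Watson statistic cited in item 3 is the sum of squared differences between successive residuals divided by the sum of squared residuals. The sketch below shows that calculation in plain Python. The function name is my own, and the residual lists are made-up toy values used only to show how the statistic behaves; they are not residuals from this dataset.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals listed in case order."""
    if len(residuals) < 2:
        raise ValueError("at least two residuals are required")
    sum_of_squares = sum(e ** 2 for e in residuals)
    if sum_of_squares == 0:
        raise ValueError("residuals must not all be zero")
    # Squared differences between each residual and the one before it.
    squared_differences = sum(
        (residuals[t] - residuals[t - 1]) ** 2
        for t in range(1, len(residuals))
    )
    return squared_differences / sum_of_squares


# Toy residuals (hypothetical) that stay on one side and then switch once:
# adjacent residuals are alike, so the statistic falls well below 2.
trending = [1, 1, 1, -1, -1, -1]

# Toy residuals (hypothetical) that flip sign on every case:
# adjacent residuals are opposite, so the statistic rises well above 2.
alternating = [1, -1, 1, -1]

print(durbin_watson(trending))
print(durbin_watson(alternating))
```

The statistic ranges from 0 to 4. Values near 2, such as the 1.955 reported in the Model Summary table, suggest that adjacent residuals are not correlated.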

References

Laureate Education (Producer). (2016j). Regression diagnostics, model evaluation, and dummy variables [Video file]. Baltimore, MD: Author.

Wagner, W. E. (2016). Using IBM® SPSS® statistics for research methods and social science statistics (6th ed.). Thousand Oaks, CA: Sage Publications.

Warner, R. M. (2012). Applied statistics: From bivariate through multivariate techniques (2nd ed.). Thousand Oaks, CA: Sage Publications.