
Assignment 4

Statistical Inference
Regressions and Factor Analysis

Submitted to: Dr. Nadeem Shafique Butt
Submitted by: Saad Ullah Khan, FA12-MSMS-016

COMSATS Institute of Information Technology, Lahore

Q#1 Simple Linear Regression


Age      53   43   33   45   46   55   41   55   36   45   55   50   49   47   47
Salary  145  621  262  208  362  424  339  736  291   58  498  643  390  332  862

Age      56   44   46   58   48   38   74   60   32   51   50   40   61   63   62
Salary  204  206  250   21  298  350  800  726  370  536  291  808  543  149  396

Age      69   51   48   62   45   37   50   50   50   58   53   57   53   61   48
Salary  750  368  659  234  396  300  343  536  543  217  298 1103  406  254  572

Age      56   45   61   70   59   57   69   44   56   50   56   43   48   52
Salary  350  242  198  213  296  317  482  155  802  200  282  573  388  250

Model Accuracy:

Model Summary
Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .128a   .016       .000                220.64245
a. Predictors: (Constant), age
Age explains only 1.6% of the variation in salary (R Square = .016).

Significance of Regression Model
H0: Model is insignificant.
H1: Model is significant.
α = 0.05
ANOVA F Test
ANOVA(b)
Model 1      Sum of Squares   df   Mean Square   F      Sig.
Regression   45896.028        1    45896.028     .943   .336a
Residual     2774936.278      57   48683.093
Total        2820832.305      58
a. Predictors: (Constant), age
b. Dependent Variable: salary

From the above ANOVA table, the p-value (.336) is greater than our level of significance, so we fail to reject H0 and conclude that the regression model is insignificant at the 5% level of significance.

Regression Equation
Coefficients(a)
             Unstandardized Coefficients   Standardized Coefficients
Model 1      B          Std. Error         Beta                        t       Sig.
(Constant)   242.702    168.760                                        1.438   .156
age          3.133      3.226              .128                        .971    .336
a. Dependent Variable: salary

Salary (in thousand dollars) = 242.7 + 3.133(age)
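The fitted line above can be cross-checked directly from the listed data. The following is a minimal sketch using the textbook least-squares formulas in plain Python; the function name `fit_simple_ols` is illustrative and not part of the original analysis, which was run in SPSS. If the listing matches the data that were analysed, the fit should land close to the reported intercept, slope and R Square.

```python
# Age and salary pairs transcribed from the Q#1 data listing (n = 59).
age = [
    53, 43, 33, 45, 46, 55, 41, 55, 36, 45, 55, 50, 49, 47, 47,
    56, 44, 46, 58, 48, 38, 74, 60, 32, 51, 50, 40, 61, 63, 62,
    69, 51, 48, 62, 45, 37, 50, 50, 50, 58, 53, 57, 53, 61, 48,
    56, 45, 61, 70, 59, 57, 69, 44, 56, 50, 56, 43, 48, 52,
]
salary = [
    145, 621, 262, 208, 362, 424, 339, 736, 291, 58, 498, 643, 390, 332, 862,
    204, 206, 250, 21, 298, 350, 800, 726, 370, 536, 291, 808, 543, 149, 396,
    750, 368, 659, 234, 396, 300, 343, 536, 543, 217, 298, 1103, 406, 254, 572,
    350, 242, 198, 213, 296, 317, 482, 155, 802, 200, 282, 573, 388, 250,
]


def fit_simple_ols(x, y):
    """Least-squares fit of y = b0 + b1 * x; returns (b0, b1, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx                      # slope
    b0 = mean_y - b1 * mean_x           # intercept
    r_squared = sxy ** 2 / (sxx * syy)  # share of variation explained
    return b0, b1, r_squared


intercept, slope, r2 = fit_simple_ols(age, salary)
print(f"salary = {intercept:.3f} + {slope:.3f} * age, R^2 = {r2:.3f}")
```

A slope of about 3.1 with a standard error of 3.226 is the numerical reason the t and F tests above fail to reject H0.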

Q#2 Multiple Linear Regression


Site   X1       X2        X3      X4      X5   X6    X7    Y
1      2        4         4       1.26    1    6     6     180.23
2      3        1.58      40      1.25    1    5     5     182.61
3      16.6     23.78     40      1       1    13    13    164.38
4      7        2.37      168     1       1    7     8     284.55
5      5.3      1.67      42.5    7.79    3    25    25    199.92
6      16.5     8.25      168     1.12    2    19    19    267.38
7      25.89    3         40      0       3    36    36    999.09
8      44.42    159.75    168     0.6     18   48    48    1103.24
9      39.63    50.86     40      27.37   10   77    77    944.21
10     31.92    40.08     168     5.52    6    47    47    931.84
11     97.33    255.08    168     19      6    165   130   2268.06
12     56.63    373.42    168     6.03    4    36    37    1489.5
13     96.67    206.67    168     17.86   14   120   120   1891.7
14     54.58    207.08    168     7.77    6    66    66    1387.82
15     113.88   981       168     24.48   6    166   179   3559.92
16     149.58   233.83    168     31.07   14   185   202   3115.29
17     134.32   145.82    168     25.99   12   192   192   2227.76
18     188.74   937       168     45.44   26   237   237   4804.24
19     110.24   410       168     20.05   12   115   115   2628.32
20     96.83    677.33    168     20.31   10   302   210   1880.84
21     102.33   288.83    168     21.01   14   131   131   3036.63
22     274.92   695.25    168     46.63   58   363   363   5539.98
23     811.08   714.33    168     22.76   17   242   242   3534.49
24     384.5    1473.66   168     7.36    24   540   453   8266.77
25     95       368       168     30.26   9    292   196   1845.89

Model Accuracy

Model Summary
Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .980a   .961       .945                455.1672686
a. Predictors: (Constant), X7, X3, X4, X1, X5, X2, X6

96.1% of the variation in monthly man hours (R Square = .961) is explained by average daily occupancy (X1), monthly average number of check-ins (X2), weekly hours of service desk operation (X3), square feet of common area use (X4), number of building wings (X5), operational berthing capacity (X6) and number of rooms (X7).

Significance of Overall Regression
H0: Model is insignificant.
HA: Model is significant.
α = 0.05
ANOVA F Test
ANOVA(b)
Model 1      Sum of Squares   df   Mean Square   F        Sig.
Regression   8.739E7          7    1.248E7       60.257   .000a
Residual     3522013.121      17   207177.242
Total        90909201.27      24
a. Predictors: (Constant), X7, X3, X4, X1, X5, X2, X6
b. Dependent Variable: Y
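The model-summary figures and the F statistic are all functions of the sums of squares in the ANOVA table. As a hedged illustration (the helper `anova_summary` is a name introduced here, not SPSS output), the sketch below recomputes them. Because the table prints the regression sum of squares rounded (8.739E7), it is recovered as Total minus Residual.

```python
import math


def anova_summary(ss_regression, ss_residual, df_regression, df_residual):
    """Derive R^2, adjusted R^2, F and the standard error of the estimate."""
    ss_total = ss_regression + ss_residual
    df_total = df_regression + df_residual          # n - 1
    ms_regression = ss_regression / df_regression
    ms_residual = ss_residual / df_residual
    r_squared = ss_regression / ss_total
    adj_r_squared = 1 - (1 - r_squared) * df_total / df_residual
    return {
        "r_squared": r_squared,
        "adj_r_squared": adj_r_squared,
        "f": ms_regression / ms_residual,
        "std_error": math.sqrt(ms_residual),
    }


# Values from the Q#2 ANOVA table.
ss_total = 90909201.27
ss_residual = 3522013.121
stats = anova_summary(ss_total - ss_residual, ss_residual, 7, 17)

for name, value in stats.items():
    print(f"{name:14s} {value:.3f}")
```

The same function applied to the Q#1 table (sums of squares 45896.028 and 2774936.278 with 1 and 57 degrees of freedom) gives the F value of .943 reported there.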

From the above ANOVA table, the p-value (.000) is less than our level of significance, so we reject H0 and conclude that the regression model is significant at the 5% level of significance.

Significance of each Predictor
H0(x1): Average daily occupancy predictor is insignificant.
HA(x1): Average daily occupancy predictor is significant.
H0(x2): Monthly average number of check-ins predictor is insignificant.
HA(x2): Monthly average number of check-ins predictor is significant.
H0(x3): Weekly hours of service desk operation predictor is insignificant.
HA(x3): Weekly hours of service desk operation predictor is significant.
H0(x4): Square feet of common area use predictor is insignificant.
HA(x4): Square feet of common area use predictor is significant.
H0(x5): Number of building wings predictor is insignificant.
HA(x5): Number of building wings predictor is significant.
H0(x6): Operational berthing capacity predictor is insignificant.
HA(x6): Operational berthing capacity predictor is significant.
H0(x7): Number of rooms predictor is insignificant.
HA(x7): Number of rooms predictor is significant.
α = 0.05
T Test for significance of predictors in regression.
Coefficients(a)
             Unstandardized Coefficients   Standardized Coefficients
Model 1      B          Std. Error         Beta                        t        Sig.
(Constant)   134.968    237.814                                        .568     .578
X1           -1.284     .805               -.112                       -1.595   .129
X2           1.804      .516               .355                        3.494    .003
X3           .669       1.846              .020                        .362     .722
X4           -21.423    10.172             -.154                       -2.106   .050
X5           5.619      14.746             .035                        .381     .708
X6           -14.480    4.220              -.998                       -3.431   .003
X7           29.325     6.366              1.755                       4.607    .000
a. Dependent Variable: Y
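Each t value in the coefficients table is the estimate divided by its standard error. The sketch below recomputes them and compares |t| with 2.110, the standard two-sided 5% critical value of Student's t with 17 degrees of freedom (taken from t tables, not from the SPSS output). The names `coefficients` and `T_CRITICAL` are illustrative.

```python
# (B, Std. Error) pairs copied from the Q#2 coefficients table.
coefficients = {
    "(Constant)": (134.968, 237.814),
    "X1": (-1.284, 0.805),
    "X2": (1.804, 0.516),
    "X3": (0.669, 1.846),
    "X4": (-21.423, 10.172),
    "X5": (5.619, 14.746),
    "X6": (-14.480, 4.220),
    "X7": (29.325, 6.366),
}

# Two-sided 5% critical value of t with 17 residual df (table value, assumed).
T_CRITICAL = 2.110

# t statistic = estimate / standard error
t_values = {name: b / se for name, (b, se) in coefficients.items()}

# Predictors whose |t| exceeds the critical value
significant = [name for name, t in t_values.items() if abs(t) > T_CRITICAL]

for name, t in t_values.items():
    flag = "significant" if name in significant else "not significant"
    print(f"{name:11s} t = {t:7.3f}  {flag}")
```

Under this rule X2, X6 and X7 clear the threshold, while X4 (|t| = 2.106) falls just short of it, which is why its Sig. value prints as .050.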

From the above table, we conclude at the 5% level of significance that:
Average daily occupancy is insignificant (p = .129).
Monthly average number of check-ins is significant (p = .003).
Weekly hours of service desk operation is insignificant (p = .722).
Square feet of common area use is borderline (p = .050; |t| = 2.106 is just under the two-sided critical value of about 2.110 for 17 df).
Number of building wings is insignificant (p = .708).
Operational berthing capacity is significant (p = .003).
Number of rooms is significant (p = .000).

Regression Equation
Monthly Man Hours = 134.968 - 1.284(Average daily occupancy) + 1.804(Monthly average number of check-ins) + 0.669(Weekly hours of service desk operation) - 21.423(Square feet of common area use) + 5.619(Number of building wings) - 14.480(Operational berthing capacity) + 29.325(Number of rooms)

Q#3 Factor Analysis: Managerial skills


Suitability of Factor Analysis
KMO and Bartlett's Test
Kaiser-Meyer-Olkin Measure of Sampling Adequacy         .849
Bartlett's Test of Sphericity     Approx. Chi-Square    558.985
                                  df                    55
                                  Sig.                  .000

The KMO value of 0.849 is close to 1, which shows that the sample is adequate for factor analysis.

Bartlett's Test
H0: Factor analysis is not suitable (the correlation matrix is an identity matrix).
HA: Factor analysis is suitable.
α = 0.05
Bartlett's Test of Sphericity

From the above table, the p-value (.000) is less than our level of significance, so we reject H0 and conclude that factor analysis is suitable at the 5% level of significance.

Number of Factors
Total Variance Explained
            Initial Eigenvalues                        Extraction Sums of Squared Loadings
Component   Total   % of Variance   Cumulative %       Total   % of Variance   Cumulative %
1           4.879   44.352          44.352             4.879   44.352          44.352
2           1.606   14.597          58.949             1.606   14.597          58.949
3           1.090   9.913           68.863             1.090   9.913           68.863
4           .641    5.831           74.693
5           .609    5.534           80.228
6           .537    4.883           85.111
7           .434    3.943           89.054
8           .411    3.737           92.792
9           .298    2.714           95.505
10          .268    2.433           97.939
11          .227    2.061           100.000
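The choice of how many components to keep can be read off the eigenvalue column. A common rule, assumed here because the extraction column stops at the third component, is to retain components with eigenvalues greater than 1 (the Kaiser criterion). The function name `kaiser_retained` is illustrative.

```python
# Initial eigenvalues copied from the table above (11 components).
eigenvalues = [4.879, 1.606, 1.090, 0.641, 0.609, 0.537,
               0.434, 0.411, 0.298, 0.268, 0.227]


def kaiser_retained(eigs, threshold=1.0):
    """Return (number of components kept, cumulative % of variance they explain)."""
    total = sum(eigs)  # equals the number of variables for a correlation matrix
    kept = [e for e in eigs if e > threshold]
    return len(kept), 100 * sum(kept) / total


n_factors, cumulative_pct = kaiser_retained(eigenvalues)
print(f"{n_factors} components retained, {cumulative_pct:.2f}% of variance")
```

The eigenvalues sum to 11, the number of items, so each percentage in the table is simply the eigenvalue divided by 11.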

Three components have eigenvalues greater than 1 (4.879, 1.606 and 1.090) and together explain 68.863% of the variance, so the suitable number of factors is 3.

Factor Construction and Factor Names

Rotated Component Matrix
                                                                    Component
                                                                    1       2       3
I show confidence in my staff                                       .239    .848    .270
I let my staff know they are doing well                             .156    .705    .162
I give feedback to staff on how well they are working               .271    .779    .195
I would personally compliment staff if they did outstanding work    -.023   .721    .153
I believe in setting goals and achieving them                       .746    .312    -.079
I achieve the things I want to get done in a day                    .793    .061    .307
I never try to put off until tomorrow what I can finish today       .787    .251    .167
I plan the use of my time well                                      .817    .040    .322
I remain clear headed when too many demands are made upon me        .072    .283    .797
I rarely overlook important factors when plans are made             .226    .375    .677
I handle complex problems efficiently                               .296    .122    .757

From the above rotated component matrix, these 11 variables can be categorized into three factors as follows:

Feedback
I show confidence in my staff
I let my staff know they are doing well
I give feedback to staff on how well they are working
I would personally compliment staff if they did outstanding work

Time Management
I believe in setting goals and achieving them
I achieve the things I want to get done in a day
I never try to put off until tomorrow what I can finish today
I plan the use of my time well

Problem Solving
I remain clear headed when too many demands are made upon me
I rarely overlook important factors when plans are made
I handle complex problems efficiently

Q#4 Factor Analysis: Classroom behavior


Suitability of Factor Analysis
KMO and Bartlett's Test
Kaiser-Meyer-Olkin Measure of Sampling Adequacy         .956
Bartlett's Test of Sphericity     Approx. Chi-Square    19654.155
                                  df                    105
                                  Sig.                  .000

The KMO value of .956 is close to 1, which shows that the sample is adequate for factor analysis.

Bartlett's Test
H0: Factor analysis is not suitable (the correlation matrix is an identity matrix).
HA: Factor analysis is suitable.
α = 0.05
Bartlett's Test of Sphericity


From the above table, the p-value (.000) is less than our level of significance, so we reject H0 and conclude that factor analysis is suitable at the 5% level of significance.

Factor Construction and Factor Names

Rotated Component Matrix
                      Component
                      1      2      3
CONCENTRATES          .778   .373   .237
CURIOUS               .858   .082   .310
PERSEVERES            .861   .158   .288
EVEN-TEMPERED         .240   .530   .662
PLACID                .096   .863   .203
SUSTAINED ATTENTION   .770   .376   .312
COMMUNICATIVE         .405   .396   .622
RELAXED               .422   .756   .295
CALM                  .259   .843   .223
PURPOSEFUL ACTIVITY   .806   .279   .325
COOPERATIVE           .362   .258   .724
CONTENTED             .268   .286   .748
RELATES-WARMLY        .328   .155   .797
COMPLIANT             .234   .648   .526
SELF-CONTROLLED       .398   .593   .510

Task Oriented
CONCENTRATES
CURIOUS
PERSEVERES
SUSTAINED ATTENTION
PURPOSEFUL ACTIVITY

Settledness
PLACID
RELAXED
CALM
COMPLIANT
SELF-CONTROLLED

Sociability
EVEN-TEMPERED
COMMUNICATIVE
COOPERATIVE
CONTENTED
RELATES-WARMLY
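The grouping of the 15 behaviours into three factors follows from assigning each variable to the component on which it loads most strongly. The sketch below applies that rule to the rotated loadings; the function name is illustrative, and the factor labels are the names chosen in this assignment rather than anything SPSS produces.

```python
# Rotated loadings (component 1, 2, 3) copied from the rotated component matrix.
loadings = {
    "CONCENTRATES": (0.778, 0.373, 0.237),
    "CURIOUS": (0.858, 0.082, 0.310),
    "PERSEVERES": (0.861, 0.158, 0.288),
    "EVEN-TEMPERED": (0.240, 0.530, 0.662),
    "PLACID": (0.096, 0.863, 0.203),
    "SUSTAINED ATTENTION": (0.770, 0.376, 0.312),
    "COMMUNICATIVE": (0.405, 0.396, 0.622),
    "RELAXED": (0.422, 0.756, 0.295),
    "CALM": (0.259, 0.843, 0.223),
    "PURPOSEFUL ACTIVITY": (0.806, 0.279, 0.325),
    "COOPERATIVE": (0.362, 0.258, 0.724),
    "CONTENTED": (0.268, 0.286, 0.748),
    "RELATES-WARMLY": (0.328, 0.155, 0.797),
    "COMPLIANT": (0.234, 0.648, 0.526),
    "SELF-CONTROLLED": (0.398, 0.593, 0.510),
}

# Labels given to components 1, 2 and 3 in this assignment.
FACTOR_NAMES = ("Task Oriented", "Settledness", "Sociability")


def group_by_dominant_loading(loadings, factor_names):
    """Assign each variable to the component with its largest absolute loading."""
    groups = {name: [] for name in factor_names}
    for variable, row in loadings.items():
        strongest = max(range(len(row)), key=lambda i: abs(row[i]))
        groups[factor_names[strongest]].append(variable)
    return groups


groups = group_by_dominant_loading(loadings, FACTOR_NAMES)
for factor, variables in groups.items():
    print(f"{factor}: {', '.join(variables)}")
```

Several variables, such as COMPLIANT (.648 and .526) and SELF-CONTROLLED (.593 and .510), load substantially on two components, so their placement is less clear-cut than the rule suggests.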

Q#5 Logistic Regression


Exercise 1:
Model Accuracy
Classification Table
                              Predicted verdict
Observed verdict (Step 1)     Not Guilty   Guilty   Percentage Correct
Not Guilty                    22           13       62.9
Guilty                        8            122      93.8
Overall Percentage                                  87.3
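The percentages in the classification table follow directly from its four cell counts. As a small sketch (the function name `classification_rates` is introduced here for illustration), each row percentage is the correctly predicted count over the row total, and the overall percentage is the diagonal over the grand total.

```python
def classification_rates(table):
    """table[i][j] = count with observed class i and predicted class j.

    Returns (per-class percentage correct, overall percentage correct).
    """
    per_class = [100 * row[i] / sum(row) for i, row in enumerate(table)]
    correct = sum(table[i][i] for i in range(len(table)))
    total = sum(sum(row) for row in table)
    return per_class, 100 * correct / total


# Rows: observed Not Guilty, Guilty. Columns: predicted Not Guilty, Guilty.
verdict_table = [[22, 13],
                 [8, 122]]

per_class, overall = classification_rates(verdict_table)
print([round(p, 1) for p in per_class], round(overall, 1))
```

The overall figure is a hit rate over 165 cases, so it describes classification accuracy rather than a share of variation explained.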

The model correctly classifies 87.3% of the verdicts using the attractiveness, gender, sociable, warmth, kindness, sensitivity and intelligence variables. This is classification accuracy, not explained variation.

Overall Model Significance
H0: Model is insignificant.
HA: Model is significant.
α = 0.05
Chi Square Test

Omnibus Tests of Model Coefficients
Step 1   Chi-square   df   Sig.
Step     69.828       7    .000
Block    69.828       7    .000
Model    69.828       7    .000

From the above table, the p-value (.000) is less than our level of significance, so we conclude that the regression model is significant at the 5% level of significance.

Significance of each Predictor
H0(x1): attractiveness predictor is insignificant.
HA(x1): attractiveness predictor is significant.
H0(x2): gender predictor is insignificant.
HA(x2): gender predictor is significant.
H0(x3): sociable predictor is insignificant.
HA(x3): sociable predictor is significant.
H0(x4): warmth predictor is insignificant.
HA(x4): warmth predictor is significant.
H0(x5): kindness predictor is insignificant.
HA(x5): kindness predictor is significant.
H0(x6): sensitivity predictor is insignificant.
HA(x6): sensitivity predictor is significant.
H0(x7): intelligence predictor is insignificant.
HA(x7): intelligence predictor is significant.
α = 0.05
Wald Test
Variables in the Equation
Step 1(a)   B        S.E.    Wald     df   Sig.   Exp(B)
attract     .323     .520    .385     1    .535   1.381
gender      -1.256   .543    5.343    1    .021   .285
sociable    .254     .210    1.456    1    .228   1.289
warmth      -.140    .207    .457     1    .499   .869
kind        -.446    .207    4.652    1    .031   .640
sensitiv    -.282    .153    3.399    1    .065   .754
intellig    -.628    .225    7.769    1    .005   .534
Constant    9.118    1.792   25.901   1    .000   9118.463
a. Variable(s) entered on step 1: attract, gender, sociable, warmth, kind, sensitiv, intellig.
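The Wald and Exp(B) columns can both be derived from B and S.E.: the Wald statistic is (B / S.E.) squared and Exp(B) is the odds ratio e^B. The sketch below recomputes them for the seven predictors and flags those whose Wald value exceeds 3.841, the standard 5% critical value of chi-square with 1 degree of freedom (a table value, not part of the SPSS output). Small differences from the printed Wald values are expected because B and S.E. are rounded to three decimals.

```python
import math

# (B, S.E.) pairs copied from the Exercise 1 "Variables in the Equation" table.
estimates = {
    "attract": (0.323, 0.520),
    "gender": (-1.256, 0.543),
    "sociable": (0.254, 0.210),
    "warmth": (-0.140, 0.207),
    "kind": (-0.446, 0.207),
    "sensitiv": (-0.282, 0.153),
    "intellig": (-0.628, 0.225),
}

# 5% critical value of chi-square with 1 df (table value, assumed).
CHI2_CRITICAL = 3.841


def wald_and_odds_ratio(b, se):
    """Return the Wald chi-square (B / S.E.)^2 and the odds ratio exp(B)."""
    return (b / se) ** 2, math.exp(b)


significant = []
for name, (b, se) in estimates.items():
    wald, odds_ratio = wald_and_odds_ratio(b, se)
    if wald > CHI2_CRITICAL:
        significant.append(name)
    print(f"{name:9s} Wald = {wald:6.3f}  Exp(B) = {odds_ratio:.3f}")

print("significant at 5%:", significant)
```

An odds ratio below 1, as for intellig (.534), means each one-unit increase in that rating multiplies the odds of the outcome coded 1 by that factor, holding the other predictors fixed.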

From the above table, we conclude at the 5% level of significance that:
Attractiveness is insignificant (p = .535).
Gender is significant (p = .021).
Sociable is insignificant (p = .228).
Warmth is insignificant (p = .499).
Kindness is significant (p = .031).
Sensitivity is insignificant (p = .065).
Intelligence is significant (p = .005).

Regression Model
Log(p/(1-p)) = 9.118 + 0.323(attract) - 1.256(gender) + 0.254(sociable) - 0.140(warmth) - 0.446(kind) - 0.282(sensitivity) - 0.628(intelligence)

Exercise 2: Predicting Whether or Not Sexual Harassment Will Be Reported


Model Accuracy
Classification Table(a)
                              Predicted reported
Observed reported (Step 1)    No     Yes    Percentage Correct
No                            111    63     63.8
Yes                           78     91     53.8
Overall Percentage                          58.9
a. The cut value is .500

The model correctly classifies 58.9% of cases on REPORTED using the AGE, MARSTAT, FEM and OFFENSUV variables. This is classification accuracy, not explained variation.

Overall Model Significance
H0: Model is insignificant.
HA: Model is significant.
α = 0.05
Chi Square Test
Omnibus Tests of Model Coefficients
Step 1   Chi-square   df   Sig.
Step     35.350       4    .000
Block    35.350       4    .000
Model    35.350       4    .000

From the above table, we can see that p-value is less than our level of significance so we conclude that our regression model is significant at 5% level of significance.

Significance of each Predictor

H0(x1): AGE predictor is insignificant. HA(x1): AGE predictor is significant.

H0(x2): MARSTAT predictor is insignificant. HA(x2): MARSTAT predictor is significant.

H0(x3): FEM predictor is insignificant. HA(x3): FEM predictor is significant.

H0(x4): OFFENSUV predictor is insignificant. HA(x4): OFFENSUV predictor is significant.

α = 0.05
Wald Test


Variables in the Equation
Step 1(a)   B        S.E.    Wald     df   Sig.   Exp(B)
age         -.014    .013    1.298    1    .255   .986
marstat     -.073    .234    .098     1    .754   .929
fem         .007     .015    .217     1    .641   1.007
offensuv    .483     .094    26.650   1    .000   1.621
Constant    -1.765   1.425   1.535    1    .215   .171
a. Variable(s) entered on step 1: age, marstat, fem, offensuv.

From the above table, we conclude at the 5% level of significance that:
AGE is insignificant (p = .255).
MARSTAT is insignificant (p = .754).
FEM is insignificant (p = .641).
OFFENSUV is significant (p = .000).

Regression Model
Log(p/(1-p)) = -1.765 - 0.014(AGE) - 0.073(MARSTAT) + 0.007(FEM) + 0.483(OFFENSUV)

Exercise 3: Predicting Who Will Drop-Out of School


Model Accuracy

Classification Table(a)
                             Predicted dropout
Observed dropout (Step 1)    No     Yes    Percentage Correct
No                           210    8      96.3
Yes                          22     10     31.3
Overall Percentage                         88.0
a. The cut value is .500

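The model correctly classifies 88.0% of cases on DROPOUT using the ADDSC, REPEAT and SOCPROB variables, although only 31.3% of the students who actually dropped out are identified. This is classification accuracy, not explained variation.

Overall Model Significance
H0: Model is insignificant.
HA: Model is significant.
α = 0.05
Chi Square Test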
Omnibus Tests of Model Coefficients
Step 1   Chi-square   df   Sig.
Step     59.597       3    .000
Block    59.597       3    .000
Model    59.597       3    .000

From the above table, the p-value (.000) is less than our level of significance, so we conclude that the regression model is significant at the 5% level of significance.

Significance of each Predictor
H0(x1): ADDSC predictor is insignificant.
HA(x1): ADDSC predictor is significant.
H0(x2): REPEAT predictor is insignificant.
HA(x2): REPEAT predictor is significant.
H0(x3): SOCPROB predictor is insignificant.
HA(x3): SOCPROB predictor is significant.
α = 0.05
Wald Test

Variables in the Equation
Step 1(a)   B        S.E.    Wald     df   Sig.   Exp(B)
addsc       .029     .019    2.288    1    .130   1.029
socprob     1.293    .616    4.408    1    .036   3.645
repeat      2.659    .481    30.558   1    .000   14.285
Constant    -4.630   1.071   18.690   1    .000   .010
a. Variable(s) entered on step 1: addsc, socprob, repeat.

From the above table, we conclude at the 5% level of significance that:
ADDSC is insignificant (p = .130).
REPEAT is significant (p = .000).
SOCPROB is significant (p = .036).

Regression Model
Log(p/(1-p)) = -4.630 + 0.029(ADDSC) + 2.659(REPEAT) + 1.293(SOCPROB)
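A fitted logit model is easiest to interpret once the log-odds are converted to a probability with p = 1 / (1 + e^(-log-odds)). The sketch below does this with the B values from the Exercise 3 "Variables in the Equation" table (note that REPEAT enters with a positive coefficient, 2.659). The input values for the example student are hypothetical, and the 0/1 coding of SOCPROB and REPEAT is an assumption, since the coding is not stated here.

```python
import math

# B values from the Exercise 3 "Variables in the Equation" table.
CONSTANT = -4.630
B_ADDSC = 0.029
B_SOCPROB = 1.293
B_REPEAT = 2.659


def dropout_probability(addsc, socprob, repeat):
    """Predicted probability of dropout from the fitted logit model."""
    log_odds = CONSTANT + B_ADDSC * addsc + B_SOCPROB * socprob + B_REPEAT * repeat
    return 1 / (1 + math.exp(-log_odds))


# Hypothetical student: ADDSC score of 50, no social problems, repeated a grade.
p_repeat = dropout_probability(addsc=50, socprob=0, repeat=1)
# Same hypothetical student without a repeated grade.
p_no_repeat = dropout_probability(addsc=50, socprob=0, repeat=0)

print(f"repeated a grade: {p_repeat:.3f}, did not repeat: {p_no_repeat:.3f}")
```

With the .500 cut value used in the classification table, both of these hypothetical students would be classified as not dropping out, which is consistent with the low percentage of actual dropouts the model identifies.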
