Model Building
Y | X1 | X2 | X3 | X4 |
27 | 20 | 50 | 75 | 15 |
23 | 27 | 55 | 60 | 20 |
18 | 22 | 62 | 68 | 16 |
26 | 27 | 55 | 60 | 20 |
23 | 24 | 75 | 72 | 8 |
27 | 30 | 62 | 73 | 18 |
30 | 32 | 79 | 71 | 11 |
23 | 24 | 75 | 72 | 8 |
22 | 22 | 62 | 68 | 16 |
24 | 27 | 55 | 60 | 20 |
16 | 40 | 90 | 78 | 32 |
28 | 32 | 79 | 71 | 11 |
31 | 50 | 84 | 72 | 12 |
22 | 40 | 90 | 78 | 32 |
24 | 20 | 50 | 75 | 15 |
31 | 50 | 84 | 72 | 12 |
29 | 30 | 62 | 73 | 18 |
22 | 27 | 55 | 60 | 20 |
Regression

Notes
Output Created: 30-Nov-2020 14:44:26
Active Dataset: DataSet0
Filter / Weight / Split File: <none>
N of Rows in Working Data File: 18
Definition of Missing: user-defined missing values are treated as missing.
Cases Used: statistics are based on cases with no missing values for any variable used.
Syntax:
REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.06)
  /NOORIGIN
  /DEPENDENT y
  /METHOD=STEPWISE x1 x2 x3 x4.
Resources: processor time 0.125 s; elapsed time 0.108 s; memory required 2524 bytes (0 bytes additional for residual plots).
Variables Entered/Removed(a)
Model | Variables Entered | Variables Removed | Method
1 | x4 | . | Stepwise (criteria: probability-of-F-to-enter <= .050, probability-of-F-to-remove >= .060)
2 | x1 | . | Stepwise (same criteria)
3 | x2 | . | Stepwise (same criteria)
a. Dependent Variable: y
Model Summary
Model | R | R Square | Adjusted R Square | Std. Error of the Estimate
1 | .522(a) | .273 | .227 | 3.699
2 | .711(b) | .506 | .440 | 3.150
3 | .811(c) | .657 | .584 | 2.716
a. Predictors: (Constant), x4
b. Predictors: (Constant), x4, x1
c. Predictors: (Constant), x4, x1, x2
ANOVA(d)
Model | Source | Sum of Squares | df | Mean Square | F | Sig.
1 | Regression | 82.164 | 1 | 82.164 | 6.004 | .026(a)
  | Residual | 218.947 | 16 | 13.684 | |
  | Total | 301.111 | 17 | | |
2 | Regression | 152.289 | 2 | 76.144 | 7.675 | .005(b)
  | Residual | 148.822 | 15 | 9.921 | |
  | Total | 301.111 | 17 | | |
3 | Regression | 197.856 | 3 | 65.952 | 8.942 | .001(c)
  | Residual | 103.255 | 14 | 7.375 | |
  | Total | 301.111 | 17 | | |
a. Predictors: (Constant), x4
b. Predictors: (Constant), x4, x1
c. Predictors: (Constant), x4, x1, x2
d. Dependent Variable: y
Coefficients(a)
Model | Term | B (unstandardized) | Std. Error | Beta (standardized) | t | Sig.
1 | (Constant) | 30.253 | 2.399 | | 12.613 | .000
  | x4 | -.324 | .132 | -.522 | -2.450 | .026
2 | (Constant) | 24.456 | 2.988 | | 8.186 | .000
  | x4 | -.383 | .115 | -.617 | -3.336 | .005
  | x1 | .225 | .084 | .492 | 2.659 | .018
3 | (Constant) | 30.736 | 3.608 | | 8.519 | .000
  | x4 | -.401 | .099 | -.647 | -4.042 | .001
  | x1 | .435 | .112 | .951 | 3.896 | .002
  | x2 | -.181 | .073 | -.598 | -2.486 | .026
a. Dependent Variable: y
Excluded Variables(d)
Model | Variable | Beta In | t | Sig. | Partial Correlation | Tolerance
1 | x1 | .492(a) | 2.659 | .018 | .566 | .963
  | x2 | .112(a) | .510 | .618 | .131 | .990
  | x3 | .079(a) | .360 | .724 | .093 | .996
2 | x2 | -.598(b) | -2.486 | .026 | -.553 | .423
  | x3 | -.062(b) | -.318 | .755 | -.085 | .917
3 | x3 | .224(c) | 1.166 | .265 | .308 | .648
a. Predictors in the Model: (Constant), x4
b. Predictors in the Model: (Constant), x4, x1
c. Predictors in the Model: (Constant), x4, x1, x2
d. Dependent Variable: y
Developing a multiple linear regression model
Table 1 – Correlation matrix between the response variable and the explanatory variables

 | | Y | X1 | X2 | X3 | X4
Y | Pearson Correlation | 1 | .373 | .059 | .048 | -.522*
  | Sig. (2-tailed) | | .127 | .815 | .852 | .026
X1 | Pearson Correlation | .373 | 1 | .758** | .288 | .192
  | Sig. (2-tailed) | .127 | | .000 | .247 | .444
X2 | Pearson Correlation | .059 | .758** | 1 | .555* | .099
  | Sig. (2-tailed) | .815 | .000 | | .017 | .697
X3 | Pearson Correlation | .048 | .288 | .555* | 1 | .060
  | Sig. (2-tailed) | .852 | .247 | .017 | | .813
X4 | Pearson Correlation | -.522* | .192 | .099 | .060 | 1
  | Sig. (2-tailed) | .026 | .444 | .697 | .813 |
*. Correlation is significant at the 0.05 level (2-tailed).
**. Correlation is significant at the 0.01 level (2-tailed).
Results in the table indicate the following:
· Of the four explanatory variables, only the pairs X1 & X2 (r = .758, p < .001) and X2 & X3 (r = .555, p = .017) are significantly correlated.
· The response variable is significantly correlated only with X4 (r = -.522, p = .026).
Table 2 – Useful statistical indicators of the observed variables

 | Minimum | Maximum | Mean | Std. Error of Mean
Y | 16 | 31 | 24.78 | .992
X1 | 20 | 50 | 30.22 | 2.172
X2 | 50 | 90 | 68.00 | 3.278
X3 | 60 | 78 | 69.89 | 1.425
X4 | 8 | 32 | 16.89 | 1.598
· The response variable varies between 16 (min) and 31 (max) with a mean of 24.8 and SE of mean .992; an approximate 95% confidence interval for the mean is 24.8 ± 2 × .992, i.e. (22.8, 26.8).
· X1 varies between 20 (min) and 50 (max) with a mean of 30.2 and SE of mean 2.2
· X2 varies between 50 (min) and 90 (max) with a mean of 68.00 and SE of mean 3.3
· X3 varies between 60 (min) and 78 (max) with a mean of 69.9 and SE of mean 1.4
· X4 varies between 8 (min) and 32 (max) with a mean of 16.9 and SE of mean 1.6
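For reproducibility, here is a minimal sketch of SPSS syntax that produces Tables 1 and 2, assuming the variables are named y, x1, x2, x3 and x4 as in the stepwise run above:

```
* Table 1: Pearson correlation matrix with 2-tailed significance.
CORRELATIONS
  /VARIABLES=y x1 x2 x3 x4
  /PRINT=TWOTAIL NOSIG.

* Table 2: minimum, maximum, mean and SE of mean.
DESCRIPTIVES VARIABLES=y x1 x2 x3 x4
  /STATISTICS=MIN MAX MEAN SEMEAN.
```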
Model Building
(Normally we start with the explanatory variable that has the highest correlation with the response.)
As X4 has the highest correlation with Y, the first model was developed with X4. The ANOVA table is shown below.
Table 3 – ANOVA table for the linear model with X4
Model | Source | Sum of Squares | df | Mean Square | F | Sig.
1 | Regression | 82.164 | 1 | 82.164 | 6.004 | .026
  | Residual | 218.947 | 16 | 13.684 | |
  | Total | 301.111 | 17 | | |
(R² = 27.3%)
The model explains only about 27% of the observed variability; that is, about 73% is left unexplained by the linear model with X4.
(Then we select the variable with the next highest correlation with the response.)
Now we include X1 in the model and fit a linear model. The ANOVA table is shown in Table 4 and the properties of the estimators are shown in Table 5.
Table 4 – ANOVA for the linear model of Y with X4 & X1
Model | Source | Sum of Squares | df | Mean Square | F | Sig.
1 | Regression | 152.289 | 2 | 76.144 | 7.675 | .005
  | Residual | 148.822 | 15 | 9.921 | |
  | Total | 301.111 | 17 | | |
(R² = 50.6%)
Table 5 – Properties of the estimators of the fitted model
Model | Term | B | Std. Error | Beta | t | Sig.
1 | (Constant) | 24.456 | 2.988 | | 8.186 | .000
  | X4 | -.383 | .115 | -.617 | -3.336 | .005
  | X1 | .225 | .084 | .492 | 2.659 | .018
Results in Table 5 indicate that both parameters are significant.

SS(X4) = 82.164
SS(X4, X1) = 152.289
Sequential SS of X1 when the model already has X4 = 152.289 - 82.164 = 70.125

H0: adding X1 to the model having X4 is not significant.
Test statistic: F = 70.125 / MSE of the full model = 70.125 / 9.921 ≈ 7.07 ~ F(1, 15)
This is significant (F(0.05; 1, 15) = 4.54).
It can be concluded with 95% confidence that the inclusion of X1 in the model having X4 is significant.
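The same partial F test can be read directly from SPSS by fitting the two models as hierarchical blocks and requesting the change statistics; the "F Change" reported for block 2 is the partial F computed above. A sketch, using the variable names from the stepwise run:

```
REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA CHA
  /DEPENDENT y
  /METHOD=ENTER x4
  /METHOD=ENTER x1.
```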
Now we include X2. The ANOVA for the model with X4, X1 and X2 is shown in Table 6.
Table 6 – ANOVA for the linear model of Y with X4, X1, X2
Model | Source | Sum of Squares | df | Mean Square | F | Sig.
1 | Regression | 197.856 | 3 | 65.952 | 8.942 | .001
  | Residual | 103.255 | 14 | 7.375 | |
  | Total | 301.111 | 17 | | |
(R² = 65.7%)
Table 7 – Coefficients
Model | Term | B | Std. Error | Beta | t | Sig.
1 | (Constant) | 30.736 | 3.608 | | 8.519 | .000
  | X4 | -.401 | .099 | -.647 | -4.042 | .001
  | X1 | .435 | .112 | .951 | 3.896 | .002
  | X2 | -.181 | .073 | -.598 | -2.486 | .026
The model is significant and all 3 parameters are also significant.
H0: the inclusion of X2 in the model having X4 and X1 is not significant.
Partial F test:
Test statistic = (197.856 - 152.289) / 7.375 = 45.567 / 7.375 ≈ 6.18 ~ F(1, 14)
This one is also significant (F(0.05; 1, 14) = 4.60).
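In general (a standard formula, stated here for reference), when q extra terms are added to a reduced model, the partial F statistic is

$$
F = \frac{\left[\,SS_{\text{reg}}(\text{full}) - SS_{\text{reg}}(\text{reduced})\,\right]/q}{MSE(\text{full})} \;\sim\; F\!\left(q,\; n - p_{\text{full}} - 1\right),
$$

where p_full is the number of explanatory variables in the full model. With q = 1 this reduces to the tests computed here, and F equals the square of the t statistic of the added variable (e.g. 2.486² ≈ 6.18 for X2).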
When we include X3:
Table 8 – ANOVA for the linear model of Y with X4, X1, X2, X3
Model | Source | Sum of Squares | df | Mean Square | F | Sig.
1 | Regression | 207.625 | 4 | 51.906 | 7.218 | .003
  | Residual | 93.486 | 13 | 7.191 | |
  | Total | 301.111 | 17 | | |
(R² = 69%)
The overall model is significant.
Table 9 – Properties of the estimators of the model
Model | Term | B | Std. Error | Beta | t | Sig.
1 | (Constant) | 22.625 | 7.819 | | 2.894 | .013
  | X4 | -.407 | .098 | -.656 | -4.150 | .001
  | X1 | .468 | .114 | 1.024 | 4.112 | .001
  | X2 | -.235 | .086 | -.777 | -2.747 | .017
  | X3 | .156 | .134 | .224 | 1.166 | .265
Partial F test:
H0: the inclusion of X3 in the model having X4, X1 and X2 is not significant.
Test statistic = (207.625 - 197.856) / 7.191 = 9.769 / 7.191 ≈ 1.36 ~ F(1, 13)
This is not significant (F(0.05; 1, 13) = 4.67), so H0 is not rejected.
The model with 4 explanatory variables is therefore not adopted, even though the overall model is significant.
The best model is Y with X4, X1, X2.
The order of inclusion of the variables is immaterial.
Model | Term | B | Std. Error | Beta | t | Sig.
1 | (Constant) | 30.736 | 3.608 | | 8.519 | .000
  | X4 | -.401 | .099 | -.647 | -4.042 | .001
  | X1 | .435 | .112 | .951 | 3.896 | .002
  | X2 | -.181 | .073 | -.598 | -2.486 | .026
(When we fit the model entering the variables in a different order, the result does not differ from our best model.)
Coefficients – variables entered in the order X1, X2, X4
Model | Term | B | Std. Error | Beta | t | Sig.
1 | (Constant) | 30.736 | 3.608 | | 8.519 | .000
  | X1 | .435 | .112 | .951 | 3.896 | .002
  | X2 | -.181 | .073 | -.598 | -2.486 | .026
  | X4 | -.401 | .099 | -.647 | -4.042 | .001

Coefficients – variables entered in the order X2, X4, X1
Model | Term | B | Std. Error | Beta | t | Sig.
1 | (Constant) | 30.736 | 3.608 | | 8.519 | .000
  | X2 | -.181 | .073 | -.598 | -2.486 | .026
  | X4 | -.401 | .099 | -.647 | -4.042 | .001
  | X1 | .435 | .112 | .951 | 3.896 | .002
The same holds for the four-variable model:

Model | Term | B | Std. Error | Beta | t | Sig.
1 | (Constant) | 22.625 | 7.819 | | 2.894 | .013
  | X4 | -.407 | .098 | -.656 | -4.150 | .001
  | X1 | .468 | .114 | 1.024 | 4.112 | .001
  | X2 | -.235 | .086 | -.777 | -2.747 | .017
  | X3 | .156 | .134 | .224 | 1.166 | .265

Coefficients – variables entered in the order X2, X1, X3, X4
Model | Term | B | Std. Error | Beta | t | Sig.
1 | (Constant) | 22.625 | 7.819 | | 2.894 | .013
  | X2 | -.235 | .086 | -.777 | -2.747 | .017
  | X1 | .468 | .114 | 1.024 | 4.112 | .001
  | X3 | .156 | .134 | .224 | 1.166 | .265
  | X4 | -.407 | .098 | -.656 | -4.150 | .001
The final (best-fitted) model is:

Model | Term | B | Std. Error | Beta | t | Sig.
1 | (Constant) | 30.736 | 3.608 | | 8.519 | .000
  | X4 | -.401 | .099 | -.647 | -4.042 | .001
  | X1 | .435 | .112 | .951 | 3.896 | .002
  | X2 | -.181 | .073 | -.598 | -2.486 | .026
Interpretation of the model
Y = 30.736 + 0.435*X1 - 0.181*X2 - 0.401*X4
The three significant coefficients mean the following:
· A unit increase of X1 increases Y by 0.435 units when X2 & X4 are held fixed.
· A unit increase of X4 decreases Y by 0.401 units when X1 & X2 are held fixed.
· A unit increase of X2 decreases Y by 0.181 units when X1 & X4 are held fixed.
(These statements are based on the unstandardized coefficients.)
The standardized coefficients are:
X4 | -.647
X1 | .951
X2 | -.598
We can compare the impact of these variables: comparison of the standardized coefficients confirms that, of the 3 significant variables, X1 is more influential on Y than X4 and X2.
(This comparison is based on the standardized coefficients.)
Model | R | R Square | Adjusted R Square | Std. Error of the Estimate
1 | .811(a) | .657 | .584 | 2.716

R² = 65.7%: the model explains 65.7% of the observed variability.
Adjusted R² = 58.4%.
In a good model, R² should be close to adjusted R².
Note that adjusted R² does not represent the percentage of observed variability explained.
(In multiple regression we use both.)
Model diagnostics
The errors of the fitted model should satisfy four conditions:
· Random
· Constant variance
· Normally distributed
· Mean zero
These can be checked with the SPSS sketch below.
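A minimal sketch of syntax that produces the diagnostics reported below (Durbin-Watson, tolerance/VIF, the residual-vs-predicted plot, and saved residuals for the normality tests); the saved-residual name res_1 is illustrative:

```
REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA TOL
  /DEPENDENT y
  /METHOD=ENTER x4 x1 x2
  /SCATTERPLOT=(*ZRESID ,*ZPRED)
  /RESIDUALS DURBIN
  /SAVE RESID(res_1).

* Kolmogorov-Smirnov and Shapiro-Wilk tests on the saved residuals.
EXAMINE VARIABLES=res_1
  /PLOT NPPLOT
  /STATISTICS DESCRIPTIVES.
```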
Model Summary
Model | R | R Square | Adjusted R Square | Std. Error of the Estimate | Durbin-Watson
1 | .811(a) | .657 | .584 | 2.716 | 2.513
Coefficients
Model | Term | B | Std. Error | Beta | t | Sig. | Tolerance | VIF
1 | (Constant) | 30.736 | 3.608 | | 8.519 | .000 | |
  | X4 | -.401 | .099 | -.647 | -4.042 | .001 | .958 | 1.044
  | X1 | .435 | .112 | .951 | 3.896 | .002 | .411 | 2.434
  | X2 | -.181 | .073 | -.598 | -2.486 | .026 | .423 | 2.367
VIF – variance inflation factor. It is used to assess the impact of multicollinearity (significant correlation among the explanatory variables).
Note: VIF < 10 means the impact of multicollinearity on the fitted model can be ignored.
Conclusions
· DW is close to 2; thus the errors are random.
· VIF for all 3 variables in the model is less than 10; thus there is no significant multicollinearity effect on the model.
· The plot of predicted values vs residuals looks random in nature; it confirms that the errors have constant variance.
Tests of Normality
 | Kolmogorov-Smirnov(a): Statistic | df | Sig. | Shapiro-Wilk: Statistic | df | Sig.
Unstandardized Residual | .168 | 18 | .193 | .949 | 18 | .405
· The Shapiro-Wilk test statistic is not significant, as the corresponding p-value (.405) is greater than 5%. This confirms that the distribution of the errors does not deviate significantly from the normal distribution.
· The 95% confidence interval of the error mean contains zero. This confirms that the mean of the errors does not deviate significantly from zero.
Since all four conditions are satisfied by the errors of the fitted model, we can say the errors are white noise. Thus the fitted model can be accepted.
(To draw conclusions about forecasting, it is better to also check the % error.)
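The mean-zero check can be made explicit with a one-sample t test on the saved residuals (assuming res_1 from the sketch above); a 95% confidence interval containing zero, equivalently a non-significant t, supports the condition:

```
T-TEST
  /TESTVAL=0
  /VARIABLES=res_1.
```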
Developing a nonlinear model
t | Year | y
1 | 1989 | 3
2 | 1990 | 4.2
3 | 1991 | 5
4 | 1992 | 10
5 | 1993 | 14
6 | 1994 | 28
7 | 1995 | 30
8 | 1996 | 45
9 | 1997 | 58
10 | 1998 | 60.1
11 | 1999 | 84.3
12 | 2000 | 87
The relationship is not linear; it looks exponential, so we can assume y = a*e^(b*t).
Taking logs: ln(y) = ln(a) + b*t.
So we have to create ln(y) values; the model then becomes linear in t. The steps are sketched below.
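A sketch of the corresponding SPSS steps, assuming the time index is stored as t (as in the coefficients table below) and the created log variable is named lny (an illustrative name):

```
* Create the log-transformed response.
COMPUTE lny = LN(y).
EXECUTE.

* 1st case: y = a + b*t.
REGRESSION
  /DEPENDENT y
  /METHOD=ENTER t
  /RESIDUALS DURBIN.

* 2nd case: ln(y) = ln(a) + b*t.
REGRESSION
  /DEPENDENT lny
  /METHOD=ENTER t
  /RESIDUALS DURBIN.
```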
· Now we compare both ANOVA tables.
1st case: y = a + b*t

ANOVA for y = a + b*t
Model | Source | Sum of Squares | df | Mean Square | F | Sig.
1 | Regression | 9783.328 | 1 | 9783.328 | 158.653 | .000
  | Residual | 616.649 | 10 | 61.665 | |
  | Total | 10399.977 | 11 | | |
Model Summary
Model | R | R Square | Adjusted R Square | Std. Error of the Estimate | Durbin-Watson
1 | .970(a) | .941 | .935 | 7.85270 | .969
Here R² is very high, but DW is very low, which means the errors are not random.
2nd case: ln(y) = ln(a) + b*t

ANOVA for ln(y) = ln(a) + b*t
Model | Source | Sum of Squares | df | Mean Square | F | Sig.
1 | Regression | 15.382 | 1 | 15.382 | 237.466 | .000
  | Residual | .648 | 10 | .065 | |
  | Total | 16.030 | 11 | | |
Model Summary
Model | R | R Square | Adjusted R Square | Std. Error of the Estimate | Durbin-Watson
1 | .980(a) | .960 | .956 | .25451 | .838
· In this case also we see a low DW, even though R² is high.
· Note: the data are time series data. Most time series data are not independent, whereas in regression we assume the observations of Y are independent.
· In a time series the y's are dependent: there is a dependence structure in {y1, y2, …, yt} in which y(t) depends on y(t-1). This is known as autocorrelation.
Coefficients
Model | Term | B | Std. Error | Beta | t | Sig.
1 | (Constant) | .925 | .157 | | 5.905 | .000
  | t | .328 | .021 | .980 | 15.410 | .000
The fitted log-linear model is ln(y) = 0.925 + 0.328*t.
Since y = a*e^(b*t), i.e. ln(y) = ln(a) + b*t:
ln(a) = 0.925, b = 0.328
a = exp(0.925) = 2.521
y = 2.521*e^(0.328*t)
Do not compare the R² values of these two fits: they are models of two different responses (y and ln y).
Examples
4.3.1 Example 1
Price | Sales (observed) |
70 | 37 |
65 | 70 |
60 | 110 |
55 | 250 |
50 | 288 |
45 | 460 |
40 | 742 |
35 | 1220 |
30 | 1800 |
25 | 3340 |
20 | 5200 |
We plot these data.
According to the plot, we can assume an exponential model (sales = a*e^(b*price)).
This transforms into ln(sales) = ln(a) + b*price.
Then we can fit a linear model (ln(sales) vs price), as sketched below.
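A sketch of the transformation and fit in SPSS, assuming the variables are named sales and price (illustrative names; the output below labels the predictor "Price (x)"):

```
COMPUTE lnsales = LN(sales).
EXECUTE.

REGRESSION
  /DEPENDENT lnsales
  /METHOD=ENTER price
  /RESIDUALS DURBIN.
```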
Outputs:
Model Summary
Model | R | R Square | Adjusted R Square | Std. Error of the Estimate | Durbin-Watson
1 | .997(a) | .994 | .994 | .12481 | 2.040
ANOVA
Model | Source | Sum of Squares | df | Mean Square | F | Sig.
1 | Regression | 25.239 | 1 | 25.239 | 1620 | .000(a)
  | Residual | .140 | 9 | .016 | |
  | Total | 25.379 | 10 | | |
Coefficients
Model | Term | B | Std. Error | Beta | t | Sig.
1 | (Constant) | 10.471 | .114 | | 92.235 | .000
  | Price (x) | -.096 | .002 | -.997 | -40.251 | .000
Based on the results, the fitted model is:
ln(sales) = 10.471 - 0.096*Price
Back-transforming into the exponential model: sales = e^(10.471) * e^(-0.096*Price) = 35277.5*e^(-0.096*Price)
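As a quick check of the back-transformed model (our own arithmetic, not in the original output): at Price = 30, sales ≈ 35277.5 × e^(-0.096×30) = 35277.5 × e^(-2.88) ≈ 1980, reasonably close to the observed value of 1800.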
4.3.2 Example 2
This example plots KMPL against HP; see the scatter plot of KMPL vs HP (figure not reproduced here).
According to this pattern we cannot fit the exponential model, so to obtain a linear model we plot KMPL vs 1/HP and fit KMPL = a + b*(1/HP), as sketched below.
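A sketch of the reciprocal transformation and fit in SPSS, assuming the variables are named kmpl and hp (illustrative names; the output below labels the predictor "1/hp"):

```
COMPUTE invhp = 1 / hp.
EXECUTE.

REGRESSION
  /DEPENDENT kmpl
  /METHOD=ENTER invhp
  /RESIDUALS DURBIN.
```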
The fitted results are then:
Model Summary
Model | R | R Square | Adjusted R Square | Std. Error of the Estimate | Durbin-Watson
1 | .895(a) | .800 | .799 | 2.931 | 1.345
ANOVA
Model | Source | Sum of Squares | df | Mean Square | F | Sig.
1 | Regression | 4986.987 | 1 | 4986.987 | 580.478 | .000(a)
  | Residual | 1245.721 | 145 | 8.591 | |
  | Total | 6232.707 | 146 | | |
Coefficients
Model | Term | B | Std. Error | Beta | t | Sig.
1 | (Constant) | 13.631 | .649 | | 20.994 | .000
  | 1/hp | 2692.467 | 111.753 | .895 | 24.093 | .000
So the fitted model is KMPL = 13.631 + 2692.467*(1/HP).
If we fitted a linear model to the raw data we would obtain
KMPL = 38.73 - 0.048*HP
[R² = 59%, SE of the estimate = 4.175, DW = 1.4],
which is clearly inferior to the reciprocal model above [R² = 80%, SE of the estimate = 2.931].
4.3.3 Example 3
Country | Imports (IMP) | GDP
1 | 20.3 | 391
2 | 68 | 528
3 | 1.5 | 21.4
4 | 57.7 | 1340
5 | 229 | 923
6 | 4.8 | 25.9
7 | 47.9 | 155.5
8 | 164 | 258
9 | 31.8 | 136.2
10 | 303.7 | 1540
11 | 31.4 | 201.1
12 | 0.98 | 12
13 | 30.8 | 122
14 | 3.1 | 9.8
15 | 292.1 | 3550
16 | 0.17 | 3.6
17 | 76.9 | 200
18 | 2 | 12.9
19 | 201.1 | 434
20 | 13.7 | 105.9
21 | 6.7 | 16.9
22 | 0.9 | 0.62
Scatter Plot of IMP vs GDP (figure not reproduced here)
Based on this scatter plot a linear model is not appropriate; therefore we consider a power model for these data.
(This is chosen after the other model types have been tried.)
So,
Y = a*x^b; by taking logs, log(y) = log(a) + b*log(x).
(In econometric studies this is known as the log-log model.)
By taking log values of both x and y we can draw a scatter plot and fit a linear model, as sketched below.
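A sketch of the log-log transformation and fit in SPSS, assuming the variables are named imp and gdp (illustrative names; base-10 logs are used, matching the back-transformation below):

```
COMPUTE logimp = LG10(imp).
COMPUTE loggdp = LG10(gdp).
EXECUTE.

REGRESSION
  /DEPENDENT logimp
  /METHOD=ENTER loggdp
  /RESIDUALS DURBIN.
```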
Model Summary
Model | R | R Square | Adjusted R Square | Std. Error of the Estimate | Durbin-Watson
1 | .915(a) | .836 | .828 | .38503 | 1.647
ANOVA
Model | Source | Sum of Squares | df | Mean Square | F | Sig.
1 | Regression | 15.166 | 1 | 15.166 | 102.302 | .000(a)
  | Residual | 2.965 | 20 | .148 | |
  | Total | 18.131 | 21 | | |
Coefficients
Model | Term | B | Std. Error | Beta | t | Sig.
1 | (Constant) | -.537 | .195 | | -2.751 | .012
  | Log(GDP) | .900 | .089 | .915 | 10.114 | .000
So we have the log-log model:
log(IMP) = -0.537 + 0.9*log(GDP)
Back-transforming gives the model for forecasting:
IMP = 10^(-0.537)*GDP^0.9 = 0.2905*GDP^0.9
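A note on interpretation (standard for log-log models, not stated in the original): the slope b = 0.9 is an elasticity, so a 1% increase in GDP is associated with roughly a 0.9% increase in imports.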