Homework Seven. What do you think?
a. The scatterplot is:

About 90.7% of the variation is explained by the regression on the New Mexico data, and 88.4% of the variation is explained by the regression on the United States data. The slopes of the two trend lines appear to be different. One problem with these trend lines is that, if extended into the future, they will eventually cross the x-axis, indicating a negative fatality rate, which is an impossible result.
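As a rough check on both remarks, the sketch below uses the coefficients and 95% slope intervals reported in part b. It finds where each fitted line reaches zero and whether the two slope intervals overlap. The Year values are in whatever coded units the regression used, which the source does not state, and comparing intervals is only an informal check, not a formal test of equal slopes. The names `FITS`, `zero_crossing`, and `intervals_overlap` are made up for this illustration.

```python
# Coefficients and 95% slope intervals copied from the regression output in part b.
# "Year" is in the coded units used by the regression (not stated in the source),
# so the crossing point is reported in those same units.
FITS = {
    "New Mexico": {
        "intercept": 13.15384615,
        "slope": -0.236163227,
        "slope_ci": (-0.261051495, -0.21127),
    },
    "United States": {
        "intercept": 8.736923077,
        "slope": -0.158386492,
        "slope_ci": (-0.17723937, -0.139533613),
    },
}


def zero_crossing(intercept, slope):
    """Return the Year value at which intercept + slope * Year equals 0."""
    return -intercept / slope


def intervals_overlap(a, b):
    """Return True if closed intervals a = (lo, hi) and b = (lo, hi) share a point."""
    return a[0] <= b[1] and b[0] <= a[1]


crossings = {
    name: zero_crossing(fit["intercept"], fit["slope"]) for name, fit in FITS.items()
}
slopes_overlap = intervals_overlap(
    FITS["New Mexico"]["slope_ci"], FITS["United States"]["slope_ci"]
)

for name, year in crossings.items():
    print(f"{name}: fitted rate reaches 0 at Year = {year:.1f}")
print("95% slope intervals overlap:", slopes_overlap)
```

Both lines reach zero at a Year value of roughly 55 to 56 in the coded units, and the two slope intervals do not overlap, which is consistent with the slopes being different.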
b. The regression statistics for the New Mexico data are:
| Regression Statistics | |
| --- | --- |
| Multiple R | 0.952173153 |
| R Square | 0.906633714 |
| Adjusted R Square | 0.904176706 |
| Standard Error | 0.897559381 |
| Observations | 40 |

| ANOVA | df | SS | MS | F | Significance F |
| --- | --- | --- | --- | --- | --- |
| Regression | 1 | 297.270462 | 297.270462 | 368.9992 | 3.65574E-21 |
| Residual | 38 | 30.61328799 | 0.805612842 | | |
| Total | 39 | 327.88375 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
| --- | --- | --- | --- | --- | --- | --- |
| Intercept | 13.15384615 | 0.28924003 | 45.47726724 | 9.6E-35 | 12.5683103 | 13.73938 |
| Year | -0.236163227 | 0.012294181 | -19.20935085 | 3.66E-21 | -0.261051495 | -0.21127 |

The regression statistics for the United States data are:
| Regression Statistics | |
| --- | --- |
| Multiple R | 0.940149084 |
| R Square | 0.883880299 |
| Adjusted R Square | 0.880824518 |
| Standard Error | 0.67990177 |
| Observations | 40 |

| ANOVA | df | SS | MS | F | Significance F |
| --- | --- | --- | --- | --- | --- |
| Regression | 1 | 133.7098762 | 133.7098762 | 289.2485 | 2.33234E-19 |
| Residual | 38 | 17.56612383 | 0.462266417 | | |
| Total | 39 | 151.276 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
| --- | --- | --- | --- | --- | --- | --- |
| Intercept | 8.736923077 | 0.219099497 | 39.87650913 | 1.28E-32 | 8.293379319 | 9.180466835 |
| Year | -0.158386492 | 0.009312849 | -17.0073078 | 2.33E-19 | -0.17723937 | -0.139533613 |

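
The entries in the two outputs above are tied together by simple identities, so they can be cross-checked by hand. The sketch below recomputes R Square, Adjusted R Square, Standard Error, F, and the slope's t Stat from the reported sums of squares and slope estimates, assuming a simple regression with one predictor (Year) and an intercept. The names `OUTPUTS` and `summarize` are made up for this illustration.

```python
import math

# Values copied from the two regression outputs above.
OUTPUTS = {
    "New Mexico": dict(n=40, ss_reg=297.270462, ss_res=30.61328799,
                       slope=-0.236163227, slope_se=0.012294181),
    "United States": dict(n=40, ss_reg=133.7098762, ss_res=17.56612383,
                          slope=-0.158386492, slope_se=0.009312849),
}


def summarize(n, ss_reg, ss_res, slope, slope_se):
    """Recompute summary statistics for a one-predictor regression with intercept."""
    df_res = n - 2                      # n observations minus slope and intercept
    ss_total = ss_reg + ss_res          # total sum of squares
    ms_res = ss_res / df_res            # residual mean square
    r_square = ss_reg / ss_total
    return {
        "r_square": r_square,
        "adj_r_square": 1 - (1 - r_square) * (n - 1) / df_res,
        "standard_error": math.sqrt(ms_res),
        "F": ss_reg / ms_res,           # regression has 1 df, so MS = SS
        "t_slope": slope / slope_se,
    }


summaries = {name: summarize(**values) for name, values in OUTPUTS.items()}
for name, stats in summaries.items():
    print(name, {key: round(value, 4) for key, value in stats.items()})
```

With a single predictor, the F statistic equals the square of the slope's t Stat, which is why Significance F matches the P-value on the Year row in each output.
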
The residual plot for the New Mexico data is:

The residual plot for the United States data is:

There is some indication in these plots that the regression assumption that residuals should be independent has been violated. There appears to be some time factor at work in the residual plots.
c. New Mexico: Durbin-Watson = 0.719. Runs = 11. Runs p-value = 0.001
United States: Durbin-Watson = 0.270. Runs = 5. Runs p-value < 0.0001
The Durbin-Watson statistic is close to 0 for the United States data and well below 2 for the New Mexico data. For both sets of data the p-value of the runs test is significant, indicating fewer runs than would be expected if the residuals were independent. This causes us to doubt that the assumption of independence of residuals has been met.
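
For reference, here is a minimal sketch of how these two diagnostics can be computed. The residuals themselves are not listed here, so `toy` is made-up data used only to exercise the functions. The runs-test p-values use the two-sided normal approximation and assume an even split of 20 positive and 20 negative residuals. Neither the split nor the test variant is stated in the source, so treat the output as illustrative rather than a reproduction of the reported values.

```python
import math


def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squared residuals."""
    numerator = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


def count_runs(residuals):
    """Count runs of same-sign residuals, ignoring exact zeros."""
    signs = [e > 0 for e in residuals if e != 0]
    if not signs:
        return 0
    return 1 + sum(s != t for s, t in zip(signs, signs[1:]))


def runs_test_p_value(runs, n_pos, n_neg):
    """Two-sided p-value for the runs test using the normal approximation."""
    n = n_pos + n_neg
    expected = 2 * n_pos * n_neg / n + 1
    variance = 2 * n_pos * n_neg * (2 * n_pos * n_neg - n) / (n ** 2 * (n - 1))
    z = (runs - expected) / math.sqrt(variance)
    return math.erfc(abs(z) / math.sqrt(2))


# Made-up residuals: one long positive stretch followed by one long negative stretch.
toy = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]
dw_toy = durbin_watson(toy)
runs_toy = count_runs(toy)

# Reported run counts, with an ASSUMED 20/20 split of residual signs.
p_nm = runs_test_p_value(11, 20, 20)
p_us = runs_test_p_value(5, 20, 20)

print(f"toy data: Durbin-Watson = {dw_toy:.3f}, runs = {runs_toy}")
print(f"New Mexico runs p-value (assumed split): {p_nm:.4f}")
print(f"United States runs p-value (assumed split): {p_us:.2e}")
```

With 20 residuals of each sign, about 21 runs would be expected under independence, so 11 and 5 runs are both far too few. A Durbin-Watson value near 2 indicates no first-order autocorrelation, and values near 0 indicate strong positive autocorrelation.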