It also suggests that there are no unusual data points in the data set. And it illustrates that the variation around the estimated regression line is constant, suggesting that the assumption of equal error variances is reasonable. Here's what the corresponding residuals versus fits plot looks like for the data set's simple linear regression model with arm strength as the response and level of alcohol consumption as the predictor:
Note that, as defined, the residuals appear on the y-axis and the fitted values appear on the x-axis. You should be able to look back at the scatter plot of the data and see how the data points there correspond to the data points in the residuals versus fits plot here. In case you're having trouble doing that, look at the five data points in the original scatter plot that appear in red. Note the predicted response (fitted value) of these men, whose alcohol consumption is around 40. Also, note the pattern in which the five data points deviate from the estimated regression line.
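The pairing of fitted values (x-axis) with residuals (y-axis) can be sketched in a few lines of Python. This is a minimal illustration on made-up numbers, not the alcohol and arm strength data; the helper `fit_line` and the data values are assumptions introduced here.

```python
def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1*x; returns (b0, b1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1

# Hypothetical data for illustration only (not the data set in the text).
x = [10, 20, 30, 40, 50]
y = [28, 25, 21, 20, 16]

b0, b1 = fit_line(x, y)
fitted = [b0 + b1 * xi for xi in x]                 # x-axis of the plot
residuals = [yi - fi for yi, fi in zip(y, fitted)]  # y-axis of the plot

# Each (fitted, residual) pair is one point on the residuals vs. fits plot.
for f, e in zip(fitted, residuals):
    print(f"fitted={f:6.2f}  residual={e:6.2f}")
```

Each observation in the scatter plot maps to exactly one point here: its horizontal position is what the line predicts, and its vertical position is how far the observation sits above or below the line.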
Now look at how and where these five data points appear in the residuals versus fits plot. Do you see the connection? Incidentally, did you notice that the r² value is very high? This is an excellent example of the caution that "a large r² value should not be interpreted as meaning that the estimated regression line fits the data well."

The Answer: Non-constant error variance shows up on a residuals vs. fits plot as a spread of residuals that changes across the fitted values.

An Example: How is plutonium activity related to alpha particle counts?
Plutonium emits subatomic particles called alpha particles. Devices used to detect plutonium record the intensity of alpha particle strikes in counts per second.
The following fitted line plot was obtained on the resulting data (alphapluto). The plot suggests that there is a linear relationship between alpha count rate and plutonium activity.
It also suggests that the error terms vary around the regression line in a non-constant manner: as the plutonium level increases, not only does the mean alpha count rate increase, but the variance increases as well. That is, the fitted line plot suggests that the assumption of equal variances is violated.
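One rough way to see this numerically is to compare the typical size of the residuals at low and high fitted values. The sketch below uses synthetic data whose scatter is built to widen with x; it is not the alphapluto data, and the split into two halves is an informal check chosen here, not a formal test of equal variances.

```python
def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1*x; returns (b0, b1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1

# Synthetic data (an assumption for illustration): the scatter around the
# trend 2*x is +/- 0.3*x, so it widens as x grows.
x = list(range(1, 21))
y = [2 * xi + (0.3 * xi if xi % 2 == 0 else -0.3 * xi) for xi in x]

b0, b1 = fit_line(x, y)
# (fitted, residual) pairs, sorted by fitted value.
pairs = sorted((b0 + b1 * xi, yi - (b0 + b1 * xi)) for xi, yi in zip(x, y))

half = len(pairs) // 2
lower_spread = sum(abs(e) for _, e in pairs[:half]) / half
upper_spread = sum(abs(e) for _, e in pairs[half:]) / (len(pairs) - half)

print(f"mean |residual|, low fitted values:  {lower_spread:.2f}")
print(f"mean |residual|, high fitted values: {upper_spread:.2f}")
```

On a residuals vs. fits plot, the same data would show the fan shape described above: a narrow band of residuals on the left that opens out to the right.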
The Answer: An outlying observation's residual stands apart from the basic random pattern of the rest of the residuals. The random pattern of the residual plot can even disappear if one outlier really deviates from the pattern of the rest of the data.

An Example: Is there a relationship between tobacco use and alcohol use? The British government regularly conducts surveys on household spending.
The fitted line plot of the resulting data (alcoholtobacco) shows an outlier, Northern Ireland. In fact, the outlier is so far removed from the pattern of the rest of the data that it appears to be "pulling the line" in its direction. Note that Northern Ireland's residual stands apart from the basic random pattern of the rest of the residuals.
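The "pulling the line" effect can be illustrated by fitting the same data with and without a single far-off point. The sketch below uses synthetic data, not the alcoholtobacco data; the helper `fit_and_r2` and all values are assumptions made for the illustration.

```python
def fit_and_r2(x, y):
    """Least-squares intercept, slope, and r^2 for y = b0 + b1*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    # For simple linear regression, r^2 is the squared correlation.
    return b0, b1, sxy ** 2 / (sxx * syy)

# Synthetic data (hypothetical): ten points near the line y = x, plus one
# point far below the trend in the lower right of the scatter plot.
x = list(range(1, 11))
y = [xi + (0.5 if xi % 2 == 0 else -0.5) for xi in x]
x_all, y_all = x + [11], y + [1.0]

_, slope_with, r2_with = fit_and_r2(x_all, y_all)
_, slope_without, r2_without = fit_and_r2(x, y)

print(f"with outlier:    slope={slope_with:.3f}  r^2={r2_with:.3f}")
print(f"without outlier: slope={slope_without:.3f}  r^2={r2_without:.3f}")
```

In this constructed case the single low point flattens the fitted slope and lowers r² considerably, even though the other ten points follow a tight linear pattern.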
The residual vs. fits plot thus also suggests that an outlier exists. Incidentally, this is an excellent example of the caution that the "coefficient of determination r² can be greatly affected by just one data point." Removing Northern Ireland's data point from the data set and refitting the regression line illustrates this.

At first glance, the scatterplot appears to show a strong linear relationship. However, when we examine the residual plot, we see a clear U-shaped pattern.
Looking back at the scatterplot, this movement of the data points above, below and then above the regression line is noticeable. The residual plot, particularly when graphed at a finer scale, helps us to focus on this deviation from linearity. The pattern in the residual plot suggests that our linear model may not be appropriate because the model predictions will be too high for values in the middle of the range of the explanatory variable and too low for values at the two ends of that range.
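The U-shape described above can be reproduced by fitting a straight line to data that actually follow a curve. The sketch below uses y = x² as a stand-in for curved data; the data and the helper `fit_line` are assumptions for illustration and are not taken from the example in the text.

```python
def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1*x; returns (b0, b1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1

# Hypothetical curved data: y = x^2 on x = 0, 1, ..., 10.
x = list(range(11))
y = [xi ** 2 for xi in x]

b0, b1 = fit_line(x, y)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Residual = observed - predicted. A negative residual means the line
# predicts too high; a positive residual means it predicts too low.
for xi, e in zip(x, residuals):
    print(f"x={xi:2d}  residual={e:7.2f}")
```

The residuals are positive at both ends of the range and negative in the middle, which is the above, below, above movement that traces out a U on the residual plot.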
A model with a curvilinear form may be more appropriate.

In the residual plot, we see that residuals grow steadily larger in absolute value as we move from left to right. In other words, as we move from left to right, the observed values deviate more and more from the predicted values.
Again, we have chosen a smaller vertical scale for the residual plot to help amplify the pattern to make it easier to see. The pattern in the residual plot suggests that predictions based on the linear regression line will result in greater error as we move from left to right through the range of the explanatory variable.
Note that the residuals are fairly randomly dispersed. However, they seem to be a bit more spread out on the left and right than they are in the middle. As we look at higher ages, there seems to be greater variation in the residuals, which suggests that we may want to be more cautious if we are trying to predict distances for older drivers.

When the sum of the residuals is greater than zero, the data set is nonlinear.
A random pattern of residuals supports a linear model.
A random pattern of residuals supports a nonlinear model.

The correct answer is B. A random pattern of residuals supports a linear model; a non-random pattern supports a nonlinear model. For a least-squares line with an intercept, the sum of the residuals is always zero, whether the data set is linear or nonlinear.
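The claim about the sum of the residuals can be checked directly. The sketch below fits a least-squares line (with an intercept) to one roughly linear and one clearly curved data set and sums the residuals in each case; both data sets and the helper `line_residuals` are made up for this illustration.

```python
def line_residuals(x, y):
    """Residuals from the least-squares line y = b0 + b1*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

x = list(range(11))
# Hypothetical data: one roughly linear set, one clearly curved set.
y_linear = [3 + 2 * xi + (0.4 if xi % 2 else -0.4) for xi in x]
y_curved = [xi ** 2 for xi in x]

sum_linear = sum(line_residuals(x, y_linear))
sum_curved = sum(line_residuals(x, y_curved))

# Both sums are zero up to floating-point rounding.
print(f"sum of residuals, linear data: {sum_linear:.2e}")
print(f"sum of residuals, curved data: {sum_curved:.2e}")
```

Because the sum is zero in both cases, it carries no information about linearity; the pattern of the residuals is what distinguishes the two data sets.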