5 Path Models for Linear Regression

where $a_1$ is a constant intercept term, $b_{11}$ the regression or 'structural' coefficient, and $e_1$ the residual error term or disturbance term, which is uncorrelated with $x_1$. This is indicated by the absence of a double-headed arrow between $x_1$ and $e_1$ or an indirect common cause between them [Cov($x_1$, $e_1$) = 0]. The double-headed arrow from $x_1$ to itself represents the variance of this variable: Var($x_1$) = $s_{11}$; the variance of $e_1$ is Var($e_1$) = $z_{11}$. In this example SBP is the dependent variable and sodium intake is the independent variable.

We can extend the model by adding more independent variables, more dependent variables, or both. The path diagram in Figure 5.2b represents a multiple regression model, such as might be used if we were trying to predict SBP ($y_1$) from sodium intake ($x_1$), exercise ($x_2$), and body mass index [BMI] ($x_3$), allowing once again for the influence of other residual factors ($e_1$) on blood pressure. The double-headed arrows between the three independent variables indicate that correlations are allowed between sodium intake and exercise ($s_{21}$), sodium intake and BMI ($s_{31}$), and BMI and exercise ($s_{32}$). For example, a negative covariance between exercise and sodium intake might arise if the health-conscious exercised more and ingested less sodium; positive covariance between sodium intake and BMI could occur if obese individuals ate more (and therefore ingested more sodium); and a negative covariance between BMI and exercise could exist if overweight people were less inclined to exercise. In this case the regression equation is

$$y_1 = a_1 + b_{11}x_1 + b_{12}x_2 + b_{13}x_3 + e_1. \tag{5.2}$$
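As a rough numerical illustration of the multiple regression model of Figure 5.2b, the sketch below simulates SBP from sodium intake, exercise, and BMI, and then fits both a simple regression (sodium intake only) and the multiple regression. All numerical values (the covariance matrix `S`, the coefficients `b`, the intercept `a1`, and the residual variance `z11`) are hypothetical and chosen only to match the signs of the covariances discussed above; they are not estimates from any data set.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical covariance matrix of sodium (x1), exercise (x2), BMI (x3):
# s21 < 0, s31 > 0, s32 < 0, as in the verbal example.
S = np.array([[ 1.0, -0.3,  0.4],
              [-0.3,  1.0, -0.5],
              [ 0.4, -0.5,  1.0]])
b = np.array([0.5, -0.3, 0.4])   # hypothetical b11, b12, b13
a1 = 120.0                       # hypothetical intercept
z11 = 1.0                        # hypothetical residual variance

X = rng.multivariate_normal(np.zeros(3), S, size=n)
e1 = rng.normal(0.0, np.sqrt(z11), size=n)
y1 = a1 + X @ b + e1             # multiple regression model for SBP

def ols(predictors, y):
    """Least-squares fit; returns [intercept, slopes...]."""
    design = np.column_stack([np.ones(len(y)), predictors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

simple = ols(X[:, [0]], y1)      # SBP on sodium intake only
multiple = ols(X, y1)            # SBP on all three predictors

# Population slope of the simple regression: Cov(x1, y1) / Var(x1)
expected_simple_slope = (S[0] @ b) / S[0, 0]

print("simple slope:  ", simple[1], "(population value", expected_simple_slope, ")")
print("multiple slope:", multiple[1], "(population value", b[0], ")")
```

Under these assumed values the sodium coefficient in the simple regression absorbs part of the effects of exercise and BMI through their covariances with sodium intake, so it differs from the coefficient in the multiple regression.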
Note that the estimated values for $a_1$, $b_{11}$ and $e_1$ will not usually be the same as in equation 5.1 due to the inclusion of additional independent variables in the multiple regression equation 5.2. Similarly, the only difference between Figures 5.2a and 5.2b is that we have multiple independent or predictor variables in Figure 5.2b.

Figure 5.2c represents a multivariate regression model, where we now have two dependent variables (blood pressure, $y_1$, and a measure of coronary artery disease [CAD], $y_2$), as well as the same set of independent variables (case 1). The model postulates that there are direct influences of sodium intake and exercise on blood pressure, and of exercise and BMI on CAD, but no direct influence of sodium intake on CAD, nor of BMI on blood pressure. Because the variable, exercise, causes both blood pressure, $y_1$, and coronary artery disease, $y_2$, it is termed a common cause of these dependent variables. The model may be written as the pair of equations

$$y_1 = a_1 + b_{11}x_1 + b_{12}x_2 + e_1$$

and

$$y_2 = a_2 + b_{22}x_2 + b_{23}x_3 + e_2. \tag{5.3}$$
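Here $a_1$ and $e_1$ are the intercept term and error term, respectively, and $b_{11}$ and $b_{12}$ the regression coefficients for predicting blood pressure, and $a_2$, $e_2$, $b_{22}$, and $b_{23}$ the corresponding coefficients for predicting coronary artery disease. We can rewrite equation 5.3 using matrices (see Chapter 4 on matrix algebra),

$$\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} + \begin{pmatrix} b_{11} & b_{12} & 0 \\ 0 & b_{22} & b_{23} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} + \begin{pmatrix} e_1 \\ e_2 \end{pmatrix}$$

or, using matrix notation,

$$\mathbf{y} = \mathbf{a} + \mathbf{B}\mathbf{x} + \mathbf{e}$$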
where

and

In matrix form, we may write these equations as

i.e.,

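The matrix form of the multivariate regression model of Figure 5.2c (two dependent variables, three independent variables, and a coefficient matrix with two elements fixed at zero) can be sketched numerically as follows. The names `a`, `B`, `S`, and `Z` stand for the intercept vector, the matrix of regression coefficients, the covariance matrix of the independent variables, and the covariance matrix of the residuals; all numerical values are hypothetical and serve only as an illustration, not as the authors' own example.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20000

a = np.array([120.0, 2.0])              # hypothetical intercepts a1, a2
B = np.array([[0.5, -0.3, 0.0],         # y1: sodium, exercise, (no BMI path)
              [0.0, -0.4, 0.6]])        # y2: (no sodium path), exercise, BMI
S = np.array([[ 1.0, -0.3,  0.4],       # hypothetical Cov(x)
              [-0.3,  1.0, -0.5],
              [ 0.4, -0.5,  1.0]])
Z = np.diag([1.0, 0.8])                 # hypothetical residual variances

X = rng.multivariate_normal(np.zeros(3), S, size=n)   # each row is one x'
E = rng.multivariate_normal(np.zeros(2), Z, size=n)   # each row is one e'
Y = a + X @ B.T + E                     # row-wise version of y = a + Bx + e

# Unconstrained least squares for both dependent variables at once.
design = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(design, Y, rcond=None)     # shape (4, 2)
a_hat = coef[0]
B_hat = coef[1:].T                      # shape (2, 3), same layout as B

print(np.round(a_hat, 2))
print(np.round(B_hat, 2))
```

Note that this fit estimates all six coefficients freely, so the two structural zeros of the model appear only as estimates close to zero; fitting the model with those paths fixed at zero would require a constrained estimator.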
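Now that some examples of regression models have been described both in the form of path diagrams and structural equations, we can apply the tracing rules of path analysis to derive the expected variances and covariances under the models. The regression models presented in this chapter all use unstandardized variables. We illustrate the derivation of the expected variance or covariance between some variables by applying the tracing rules for unstandardized variables in Figures 5.2a, 5.2b and 5.2c. As an exercise, the reader may wish to trace some of the other paths.

In the case of Figure 5.2a, to derive the expected covariance between $x_1$ and $y_1$, we need trace only the path:

$$\text{(i)}\quad x_1 \overset{s_{11}}{\longleftrightarrow} x_1 \overset{b_{11}}{\longrightarrow} y_1$$

yielding an expected covariance of ($s_{11}b_{11}$). Two paths contribute to the expected variance of $y_1$,

$$\begin{aligned}
\text{(i)}\quad & y_1 \overset{b_{11}}{\longleftarrow} x_1 \overset{s_{11}}{\longleftrightarrow} x_1 \overset{b_{11}}{\longrightarrow} y_1 \\
\text{(ii)}\quad & y_1 \overset{1}{\longleftarrow} e_1 \overset{z_{11}}{\longleftrightarrow} e_1 \overset{1}{\longrightarrow} y_1
\end{aligned}$$

yielding an expected variance of $y_1$ of ($b_{11}^2 s_{11} + z_{11}$).

In the case of Figure 5.2b, to derive the expected covariance of $x_1$ and $y_1$, we can trace paths:

$$\begin{aligned}
\text{(i)}\quad & x_1 \overset{s_{11}}{\longleftrightarrow} x_1 \overset{b_{11}}{\longrightarrow} y_1 \\
\text{(ii)}\quad & x_1 \overset{s_{21}}{\longleftrightarrow} x_2 \overset{b_{12}}{\longrightarrow} y_1 \\
\text{(iii)}\quad & x_1 \overset{s_{31}}{\longleftrightarrow} x_3 \overset{b_{13}}{\longrightarrow} y_1
\end{aligned}$$

to obtain an expected covariance of ($b_{11}s_{11} + b_{12}s_{21} + b_{13}s_{31}$). To derive the expected variance of $y_1$, we can trace paths:

$$\begin{aligned}
\text{(i)}\quad & y_1 \overset{b_{11}}{\longleftarrow} x_1 \overset{s_{11}}{\longleftrightarrow} x_1 \overset{b_{11}}{\longrightarrow} y_1 \\
\text{(ii)}\quad & y_1 \overset{b_{12}}{\longleftarrow} x_2 \overset{s_{22}}{\longleftrightarrow} x_2 \overset{b_{12}}{\longrightarrow} y_1 \\
\text{(iii)}\quad & y_1 \overset{b_{13}}{\longleftarrow} x_3 \overset{s_{33}}{\longleftrightarrow} x_3 \overset{b_{13}}{\longrightarrow} y_1 \\
\text{(iv)}\quad & y_1 \overset{b_{11}}{\longleftarrow} x_1 \overset{s_{21}}{\longleftrightarrow} x_2 \overset{b_{12}}{\longrightarrow} y_1 \\
\text{(v)}\quad & y_1 \overset{b_{11}}{\longleftarrow} x_1 \overset{s_{31}}{\longleftrightarrow} x_3 \overset{b_{13}}{\longrightarrow} y_1 \\
\text{(vi)}\quad & y_1 \overset{b_{12}}{\longleftarrow} x_2 \overset{s_{32}}{\longleftrightarrow} x_3 \overset{b_{13}}{\longrightarrow} y_1 \\
\text{(vii)-(ix)}\quad & \text{paths (iv)-(vi) traced in the opposite direction} \\
\text{(x)}\quad & y_1 \overset{1}{\longleftarrow} e_1 \overset{z_{11}}{\longleftrightarrow} e_1 \overset{1}{\longrightarrow} y_1
\end{aligned}$$

yielding a total expected variance of ($b_{11}^2 s_{11} + b_{12}^2 s_{22} + b_{13}^2 s_{33} + 2b_{11}b_{12}s_{21} + 2b_{11}b_{13}s_{31} + 2b_{12}b_{13}s_{32} + z_{11}$).

In the case of Figure 5.2c, we may derive the expected covariance of $y_1$ and $y_2$ as the sum of

$$\begin{aligned}
\text{(i)}\quad & y_1 \overset{b_{11}}{\longleftarrow} x_1 \overset{s_{21}}{\longleftrightarrow} x_2 \overset{b_{22}}{\longrightarrow} y_2 \\
\text{(ii)}\quad & y_1 \overset{b_{11}}{\longleftarrow} x_1 \overset{s_{31}}{\longleftrightarrow} x_3 \overset{b_{23}}{\longrightarrow} y_2 \\
\text{(iii)}\quad & y_1 \overset{b_{12}}{\longleftarrow} x_2 \overset{s_{22}}{\longleftrightarrow} x_2 \overset{b_{22}}{\longrightarrow} y_2 \\
\text{(iv)}\quad & y_1 \overset{b_{12}}{\longleftarrow} x_2 \overset{s_{32}}{\longleftrightarrow} x_3 \overset{b_{23}}{\longrightarrow} y_2
\end{aligned}$$

giving [$b_{11}(s_{21}b_{22} + s_{31}b_{23}) + b_{12}(s_{22}b_{22} + s_{32}b_{23})$] for the expected covariance. This expectation, and the preceding ones, can be derived equally (and arguably more easily) by simple matrix algebra. For example, the expected covariance matrix ($\boldsymbol{\Sigma}$) for $y_1$ and $y_2$ under the model of Figure 5.2c is given as

$$\boldsymbol{\Sigma} = \mathbf{B}\mathbf{S}\mathbf{B}' + \mathbf{Z}$$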
in which the elements of