A path diagram usually consists of boxes and circles, which are connected by arrows. Consider the diagram in Figure 5.1 for example.
Single-headed arrows ('paths') are used to define causal relationships in the model, with the variable at the tail of the arrow causing the variable at the head. Omission of a path from one variable to another implies that there is no direct causal influence of the former variable on the latter. In the path diagram in Figure 5.1, D is determined by […] and […], while E is determined by […] and […]. When two variables cause each other, we say that there is a feedback loop, or 'reciprocal causation', between them. Such a feedback loop is shown between variables D and E in our example.
Double-headed arrows are used to represent a covariance between two variables, which might arise through a common cause or their reciprocal causation or both. In many treatments of path analysis, double-headed arrows may be placed only between variables that do not have causal arrows pointing at them. This convention allows us to discriminate between dependent/endogenous variables and independent/ultimate/exogenous variables.
Dependent variables are those variables we are trying to predict (in a regression model) or whose intercorrelations we are trying to explain (in a factor model). Dependent variables may be determined or caused by independent variables, by other dependent variables, or by both. In Figure 5.1, D and E are the dependent variables. Independent variables are the variables that explain the intercorrelations between the dependent variables or, in the case of the simplest regression models, predict the dependent variables. The causes of independent variables are not represented in the model. […] and […] are the independent variables in Figure 5.1.
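The distinction between dependent and independent variables can be read directly off the single-headed arrows. The following is a minimal sketch in Python, assuming a hypothetical diagram with variables X1, X2, Y1 and Y2 (these names and paths are invented for illustration and are not those of Figure 5.1). A variable is treated as dependent if any causal arrow points at it, and a feedback loop is any pair of variables with arrows in both directions.

```python
# Hypothetical path diagram (NOT Figure 5.1): each pair is (cause, effect).
single_headed = [
    ("X1", "Y1"),
    ("X2", "Y1"),
    ("X2", "Y2"),
    ("Y1", "Y2"),  # Y1 -> Y2 and Y2 -> Y1 together form a feedback loop
    ("Y2", "Y1"),
]


def classify(paths):
    """Split variables into dependent (has an incoming path) and independent."""
    variables = {v for path in paths for v in path}
    dependent = {effect for _, effect in paths}
    independent = variables - dependent
    return sorted(dependent), sorted(independent)


def feedback_loops(paths):
    """Return pairs of distinct variables that cause each other."""
    path_set = set(paths)
    loops = {
        tuple(sorted((cause, effect)))
        for cause, effect in path_set
        if cause != effect and (effect, cause) in path_set
    }
    return sorted(loops)


dependent, independent = classify(single_headed)
loops = feedback_loops(single_headed)
print(dependent)    # ['Y1', 'Y2']
print(independent)  # ['X1', 'X2']
print(loops)        # [('Y1', 'Y2')]
```

Under the convention described above, double-headed arrows would then be permitted only between members of `independent`.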
Omission of a double-headed arrow reflects the hypothesis that two independent variables are uncorrelated. In Figure 5.1 the independent variables […] and […] correlate, […] also correlates with […], but […] does not correlate with […]. This illustrates (i) that two variables which correlate with a third do not necessarily correlate with each other, and (ii) that two factors causing the same dependent variable need not correlate. In some treatments of path analysis, a double-headed arrow from an independent variable to itself is used to represent its variance, but this is often omitted if the variable is standardized to unit variance.
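Point (i) can be checked numerically. The sketch below uses made-up correlations among three hypothetical standardized variables (the values 0.5 and 0.9 are assumptions for illustration, not taken from Figure 5.1) and tests whether the resulting correlation matrix is positive definite, using Sylvester's criterion on the leading principal minors. A positive definite matrix is one that a real set of variables could actually produce.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def is_positive_definite_3x3(m):
    """Sylvester's criterion for a symmetric 3x3 matrix:
    all leading principal minors must be strictly positive."""
    minor1 = m[0][0]
    minor2 = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return minor1 > 0 and minor2 > 0 and det3(m) > 0


def correlation_matrix(r12, r23, r13):
    """Correlation matrix for three standardized variables."""
    return [
        [1.0, r12, r13],
        [r12, 1.0, r23],
        [r13, r23, 1.0],
    ]


# Variables 1 and 3 each correlate 0.5 with variable 2, but not with each other.
moderate = correlation_matrix(0.5, 0.5, 0.0)
print(is_positive_definite_3x3(moderate))  # True

# With correlations of 0.9, a zero correlation between 1 and 3 is impossible.
strong = correlation_matrix(0.9, 0.9, 0.0)
print(is_positive_definite_3x3(strong))  # False
```

The second case shows the limit of the statement: the absence of a correlation between two variables is only a tenable hypothesis when their correlations with the third are not too large.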
By convention, lower-case letters (or numeric values, if these can be specified) are used to represent the values of paths or double-headed arrows, in contrast to the use of upper-case for variables. We call the values corresponding to causal paths path coefficients, and those of the double-headed arrows simply correlation coefficients (see Figure 5.1 for examples). In some applications, subscripts identify the origin and destination of a path. The first subscript refers to the variable being caused, and the second subscript tells which variable is doing the causing. In most genetic applications we assume that the variables are scaled as deviations from the means, in which case the constant intercept terms in equations will be zero and can be omitted from the structural equations.
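The effect of scaling variables as deviations from their means can be seen in a simple regression. The sketch below is illustrative only: the data are invented, and the name `b_yx` follows the subscript convention above (first subscript the variable being caused, second the variable doing the causing). It fits a least-squares line to raw scores and to mean-centred scores and compares the intercepts.

```python
from statistics import mean


def ols(x, y):
    """Least-squares intercept and slope for y regressed on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope


def centre(values):
    """Express values as deviations from their mean."""
    m = mean(values)
    return [v - m for v in values]


# Invented raw scores: x is the cause, y the variable being caused.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.1, 4.9, 7.2, 8.8, 11.0]

a_raw, b_yx_raw = ols(x, y)
a_centred, b_yx_centred = ols(centre(x), centre(y))

print(a_raw, b_yx_raw)          # non-zero intercept on the raw scale
print(a_centred, b_yx_centred)  # intercept is zero up to rounding error
```

Centring leaves the path coefficient unchanged and removes the intercept, which is why the constant term can be dropped from the structural equations.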
Each dependent variable usually has a residual, unless the residual is fixed to zero ex hypothesi. The residual variable does not correlate with any other determinants of its dependent variable, and will usually (but not always) be uncorrelated with other independent variables.
In summary, the conventions used in path analysis are: