
1 Multivariate Genetic Factor Model

Using genetic notation, the genetic factor model can be represented as

\begin{displaymath}
P_{ij} = a_i A_j + c_i C_j + e_i E_j + U_{ij}, \qquad
i = 1, \ldots, p \mbox{ (variables)}, \quad
j = 1, \ldots, n \mbox{ (subjects)}
\end{displaymath}

The measured phenotype ($P$) (again, omitting the $j$ subscript) consists of multiple variables, each a function of a subject's underlying additive genetic deviate ($A$), common (between-families) environment ($C$), and non-shared (within-families) environment ($E$). In addition, each variable $P_i$ has a specific component $U_i$ that may itself consist of a genetic and a non-genetic part. In this initial application, we assume that $U_i$ is entirely random environmental in origin, an assumption we relax later. The parameters $a_i$, $c_i$, and $e_i$ are the factor loadings of the $p$ measured variables on the latent factors. A path diagram of this model is shown in the figure below.

Figure: Multivariate genetic factor model for four variables.
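
As a worked consequence of the factor equation for $P_{ij}$ given at the start of this section, the within-person variances and covariances it implies can be written out explicitly. The sketch below assumes (for illustration; these conditions are not stated in the text above) that $A$, $C$, and $E$ are standardized to unit variance, are mutually uncorrelated, and are uncorrelated with the specific components, and that the specific components of different variables are uncorrelated with each other. Under those assumptions, for variables $i$ and $k$:

\begin{displaymath}
\mathrm{Var}(P_i) = a_i^2 + c_i^2 + e_i^2 + \mathrm{Var}(U_i), \qquad
\mathrm{Cov}(P_i, P_k) = a_i a_k + c_i c_k + e_i e_k \quad (i \neq k)
\end{displaymath}

In other words, the common factors account for all of the covariation between variables, while each specific component contributes only to the variance of its own variable.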

In Mx, there are a number of alternative ways to specify the model. One approach is to specify the factor structure for the genetic, shared environmental, and specific environmental factors in one matrix, e.g. B, with twice the number of variables (for both twins) as rows and the total number of factors across both twins as columns. Suppose we assume one genetic, one shared environmental, and one specific environmental common factor per twin $(A_1, C_1, E_1, A_2, C_2, E_2)$ for our four-variate arithmetic computation example, in which T0 - T3 denote administration times 0-3, before and after standard doses of alcohol, for twin 1 (Tw1) and twin 2 (Tw2). The B matrix would then look like

\begin{displaymath}
\begin{array}{lcccccc}
 & A_1 & C_1 & E_1 & A_2 & C_2 & \ldots \\
\vdots & & & & & & \\
\mbox{Tw2-T3} & 0 & 0 & 0 & 4 & 8 & 12
\end{array}
\end{displaymath}
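
The full pattern of parameter numbers is not reproduced here; only the final row (Tw2-T3) survives. That row is consistent with numbering the $A$, $C$, and $E$ loadings 1-4, 5-8, and 9-12 and reusing the same numbers for both twins, but this numbering is an assumption on our part. Under that assumption, a minimal Python sketch of how such a pattern could be generated (the function name `build_b_pattern` is hypothetical and is not part of Mx) is:

```python
# Sketch of a parameter-number pattern for B (0 = fixed at zero).
# ASSUMPTION: loadings are numbered 1-4 (A), 5-8 (C), 9-12 (E). Only the
# last row (Tw2-T3: 0 0 0 4 8 12) is confirmed by the text.

N_VARS = 4      # measurement occasions T0..T3
N_FACTORS = 3   # one A, one C and one E factor per twin


def build_b_pattern(n_vars=N_VARS, n_factors=N_FACTORS):
    """Return a (2*n_vars) x (2*n_factors) list of parameter numbers."""
    rows = []
    for twin in range(2):
        for v in range(n_vars):
            row = [0] * (2 * n_factors)
            for f in range(n_factors):
                # Reusing the same number for both twins is what imposes
                # the equality constraint across twin 1 and twin 2.
                row[twin * n_factors + f] = f * n_vars + v + 1
            rows.append(row)
    return rows


pattern = build_b_pattern()
labels = [f"Tw{t}-T{v}" for t in (1, 2) for v in range(N_VARS)]
for label, row in zip(labels, pattern):
    print(label, row)
```

Each twin's block of loadings occupies its own set of three columns, and the zeros elsewhere indicate that a twin's observed variables do not load on the co-twin's latent factors.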

In this case, with $m=6$ factors and four observed variables for each twin ($p=8$), B would be a $p\times m$ ($8\times 6$) matrix of factor loadings, P the $m \times m$ correlation matrix of factor scores, and E a $p\times p$ diagonal matrix of unique variances. The expected covariance matrix may then be calculated as in equation 10.1:
\begin{displaymath}
\Sigma_{Y,Y} = \mathbf{B} \mathbf{P} \mathbf{B}' + \mathbf{E}. \qquad (61)
\end{displaymath}
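
To make the matrix expression concrete, the following plain-Python sketch evaluates $\mathbf{B}\mathbf{P}\mathbf{B}' + \mathbf{E}$ for MZ and DZ pairs. All loadings and unique variances are hypothetical values chosen for illustration, the helper names are ours, and the factor correlations (1.0 or 0.5 between the twins' $A$ factors, 1.0 between their $C$ factors) follow the usual twin-model conventions. This is not Mx code, only a numerical illustration.

```python
# Sketch: expected twin covariance Sigma = B P B' + E in plain Python.
# All numeric loadings and unique variances below are HYPOTHETICAL.

def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)]
            for row in x]


def transpose(x):
    return [list(col) for col in zip(*x)]


def expected_cov(a, c, e, u, r_a):
    """Return the 2n x 2n expected covariance for one zygosity group.

    a, c, e : loadings of the n variables on A, C and E
    u       : unique variances of the n variables (diagonal of E)
    r_a     : cross-twin correlation of A (1.0 for MZ, 0.5 for DZ)
    """
    n = len(a)
    # B has columns A1 C1 E1 A2 C2 E2; both twins share the same loadings.
    b = ([[a[i], c[i], e[i], 0.0, 0.0, 0.0] for i in range(n)] +
         [[0.0, 0.0, 0.0, a[i], c[i], e[i]] for i in range(n)])
    # P is the fixed factor correlation matrix.
    p = [[1.0 if i == j else 0.0 for j in range(6)] for i in range(6)]
    p[0][3] = p[3][0] = r_a   # A1 with A2
    p[1][4] = p[4][1] = 1.0   # C1 with C2
    sigma = matmul(matmul(b, p), transpose(b))
    for i in range(2 * n):
        sigma[i][i] += u[i % n]   # add the diagonal matrix E
    return sigma


a = [0.6, 0.5, 0.4, 0.3]
c = [0.3, 0.3, 0.2, 0.2]
e = [0.4, 0.3, 0.3, 0.2]
u = [0.20, 0.25, 0.30, 0.35]

sigma_mz = expected_cov(a, c, e, u, r_a=1.0)
sigma_dz = expected_cov(a, c, e, u, r_a=0.5)
print("MZ cross-twin covariance for T0:", round(sigma_mz[0][4], 4))
print("DZ cross-twin covariance for T0:", round(sigma_dz[0][4], 4))
```

Only the fixed matrix P differs between the two zygosity groups; B and E are shared, which is what allows the loadings to be estimated from the MZ and DZ covariance matrices jointly.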

In a multivariate analysis of twin data according to this factor model, $\Sigma$ is a $2p\times 2p$ predicted covariance matrix of observations on twin 1 and twin 2, and B is a $2p\times 2m$ matrix of loadings of these observations on the latent genotypes and the common and non-shared environments of twin 1 and twin 2, where $p$ and $m$ now denote the numbers of variables and factors per twin (here $p=4$ and $m=3$). The loadings on $A_1$ and $A_2$, on $C_1$ and $C_2$, and on $E_1$ and $E_2$ are constrained to be equal for twin 1 and twin 2, similar to the path coefficients of the univariate models discussed in previous chapters. These equality constraints are obtained in Mx by using the same non-zero parameter number in a Specification statement for the free parameters. The unique variances are also equal for both members of a twin pair. These may be estimated on the diagonal of the $2p\times 2p$ E matrix (e.g., Heath et al., 1989c). To fit this model, B and E are estimated from the data and P ($2m\times 2m$) must be fixed a priori (for example, the correlation between $A_1$ for twin 1 and $A_2$ for twin 2 is 1.0 for MZ and 0.5 for DZ twins; the correlation between the $C$ variables of twin 1 and twin 2 is 1.0).

One alternative specification of this model is to include the unique variances in matrix B and fix E to zero. The factor patterns for $A$ and $E$ of twin 1 and twin 2 are identical to that in Section 10.2.3. The main difference lies in the treatment of the unique variances. In the earlier example these were estimated as variances on the diagonal of E, but now they are modeled as the square roots of the variances. They are square roots because the unique variances now arise from the product $\mathbf{B} \mathbf{P} \mathbf{B}'$ in the expected covariance expression, whereas in the previous example they were estimated directly as the unproducted quantity E.

One might expect that this subtle change would have no effect on the model (as indeed it does not in this example), but on occasion these alternative residual specifications may produce different outcomes: a variance estimated directly on the diagonal of E is free to become negative, whereas a squared quantity cannot. Residual variances $< 0.0$ make little sense in genetic analyses because they imply an impossible negative variance component. Consequently, although both representations are possible in Mx, we recommend the specification with the unique variances in B, as it constrains the unique variances to be $\ge 0.0$. Both methods give identical solutions when fitted to the data used in these examples.
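
The equivalence of the two residual specifications can be checked numerically. The plain-Python sketch below uses a deliberately small toy case (two variables per twin and a single common factor per twin, with hypothetical values). It assumes that the alternative specification is implemented by appending one column per observed variable to B, holding the square root of its unique variance, and by extending P with an identity block so that the specifics are uncorrelated. This layout is our assumption about one way to set it up, not a transcription of an Mx script.

```python
# Sketch: two ways of modelling unique variances give the same Sigma.
# All numeric values are HYPOTHETICAL toy values.
import math


def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)]
            for row in x]


def transpose(x):
    return [list(col) for col in zip(*x)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


loadings = [0.7, 0.5]      # loadings on the common factor
unique_var = [0.3, 0.4]    # unique variances, equal across twins
r = 0.5                    # cross-twin factor correlation (e.g. A in DZ)

n = len(loadings)
k = 2 * n                  # observed variables across both twins
b = ([[loadings[i], 0.0] for i in range(n)] +
     [[0.0, loadings[i]] for i in range(n)])
p = [[1.0, r], [r, 1.0]]

# Specification 1: unique variances on the diagonal of E.
sigma1 = matmul(matmul(b, p), transpose(b))
for i in range(k):
    sigma1[i][i] += unique_var[i % n]

# Specification 2: square roots of the unique variances as extra columns
# of B, P extended with an identity block, and E fixed to zero.
b2 = [b[i] + [math.sqrt(unique_var[i % n]) if j == i else 0.0
              for j in range(k)]
      for i in range(k)]
p2 = identity(2 + k)
p2[0][1] = p2[1][0] = r
sigma2 = matmul(matmul(b2, p2), transpose(b2))

max_diff = max(abs(sigma1[i][j] - sigma2[i][j])
               for i in range(k) for j in range(k))
print("largest absolute difference:", max_diff)
```

Because the second specification squares whatever value is estimated, the implied unique variance can never be negative, whereas nothing in the first specification prevents a negative diagonal element of E unless a bound is imposed.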
Jeff Lessem 2002-03-21