
### 1 SAS scripts to compute covariance matrices

This is not the place to describe in detail the workings
of SAS; the thousands of pages in the manuals are quite adequate! All
we aim to do here is to get the data in and get the covariance matrix
and means out. SAS has a useful procedure, PROC CORR, which prints
the required statistics; these can be cut and pasted into a file
for Mx use. However, as is commonly the case with computer tasks,
investing a little extra initial work on automation will save labor
in the long run and make the process less error-prone.

It often happens that data are stored at the individual subject level
rather than at the family level. Typically, each subject has a family
number and an 'id' number to mark their position in the family (first
or second twin). A necessary step in analysing the covariance between
relatives is to 'glue' the data from family members together so that
the family becomes the unit of measurement and covariances between
family members may be computed. In SAS this is a relatively simple
operation although care must be taken to supply labels for the
variables that do not exceed the SAS maximum length of eight
characters. The SAS script in Appendix shows the
case for twin data, and goes beyond the initial requirement by taking
the sex of the twins into account. Five groups are created: MZ
male, DZ male, MZ female, DZ female and opposite-sex DZ. The covariances
are computed and output to .dat files which contain the number of
observations (`Nobservations`), the number of input variables
(`NInput`), labels, and the covariance matrices (`CMatrix`).
These .dat files may be used directly in Mx in a diagram, or in a
script using the `Include` statement.
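
The 'gluing' step described above can be sketched in a few lines. The appendix script itself is in SAS and is not reproduced here; the following is only an illustration in Python, with hypothetical record fields (`family`, `id`, zygosity, sex, one score) and hypothetical group labels (`MZM`, `DZM`, `MZF`, `DZF`, `DZO`). It assumes exactly two twins per complete family and, as one possible convention, puts the male twin first in opposite-sex pairs.

```python
from collections import defaultdict

# Hypothetical individual-level records: (family, id, zygosity, sex, score).
records = [
    (1, 1, "MZ", "M", 10.0),
    (1, 2, "MZ", "M", 12.0),
    (2, 1, "DZ", "F", 9.0),
    (2, 2, "DZ", "M", 11.0),
    (3, 2, "DZ", "F", 8.0),
    (3, 1, "DZ", "F", 7.0),
    (4, 1, "MZ", "F", 5.0),  # co-twin missing, so this family is dropped
]


def group_label(zyg, sex1, sex2):
    """Return one of five labels: MZM, DZM, MZF, DZF or DZO (opposite-sex DZ)."""
    if sex1 != sex2:
        return "DZO"
    return zyg + sex1


def pair_twins(rows):
    """Glue the two records of each family into one family-level row per group."""
    families = defaultdict(dict)
    for fam, twin_id, zyg, sex, score in rows:
        families[fam][twin_id] = (zyg, sex, score)

    groups = defaultdict(list)
    for fam, twins in sorted(families.items()):
        if set(twins) != {1, 2}:
            continue  # skip incomplete pairs
        (zyg, sex1, x1), (_, sex2, x2) = twins[1], twins[2]
        if sex1 != sex2 and sex1 == "F":
            # Assumed convention: male twin first in opposite-sex pairs.
            sex1, x1, sex2, x2 = sex2, x2, sex1, x1
        groups[group_label(zyg, sex1, sex2)].append((fam, x1, x2))
    return dict(groups)


groups = pair_twins(records)
```

Each entry of `groups` then holds one row per family, with twin 1 and twin 2 scores side by side, which is the layout needed before a covariance matrix can be computed for that group.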

Note that the assignment of the twins as 1 or 2 is usually arbitrary for
the same-sex groups, but in the opposite-sex group the male (or female)
twin is always first, and the female (or male) twin second.

Strictly speaking, when there is no inherent order to
the observations the variance-covariance matrix is not the best
summary statistic to use. The intraclass correlation is the most
appropriate summary for observations that do not have any order; it
uses a joint estimate of the variance of twin 1 and twin 2, and
partitions this into within pairs and between pairs components.
However, the intraclass correlation is more difficult to generalize to
the multivariate and multiple classes of relatives situations so we
stay with covariance matrices here. Sometimes data on birth order or
some other characteristic may be used to distinguish more formally
between twin 1 and twin 2 within a pair, thereby giving some
rationality to the ordering and use of covariance matrices. Should
such an approach be taken, it is necessary to split the DZ
opposite sex twin group into two groups according to whether the first
twin is female or male.
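
To make the within-pairs and between-pairs partition concrete, here is a minimal Python sketch of the one-way ANOVA intraclass correlation for pairs. The function name and the example data are hypothetical, and this is only one common estimator, not necessarily the one a given package reports.

```python
from statistics import mean


def intraclass_correlation(pairs):
    """One-way ANOVA intraclass correlation for unordered pairs (x1, x2)."""
    n = len(pairs)
    # Joint mean over all twins, ignoring who is labelled twin 1 or twin 2.
    grand = mean(x for pair in pairs for x in pair)
    # Between-pairs mean square: 2 * squared deviations of pair means, n - 1 df.
    msb = 2 * sum((mean(p) - grand) ** 2 for p in pairs) / (n - 1)
    # Within-pairs mean square: (x1 - x2)^2 / 2 per pair, n df.
    msw = sum((a - b) ** 2 / 2 for a, b in pairs) / n
    # For groups of size two, ICC = (MSB - MSW) / (MSB + MSW).
    return (msb - msw) / (msb + msw)


pairs = [(1.0, 2.0), (3.0, 5.0), (6.0, 4.0), (8.0, 9.0)]
swapped = [(b, a) for a, b in pairs]
icc = intraclass_correlation(pairs)
icc_swapped = intraclass_correlation(swapped)
```

Because only pair means and within-pair differences enter the calculation, `icc` and `icc_swapped` agree: relabelling twin 1 and twin 2 has no effect, which is the property that makes this statistic suitable for unordered observations.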

Appendix shows a SAS macro for creating an Mx .dat file,
which fully describes the data: the variable labels, the sample size,
the means and covariances. Comments, beginning with !, indicate the
date the file was created. The resulting .dat file might look like this:

    !
    ! Mx dat file created by SAS on 03FEB1998
    !
    Data NInputvars=4 NObservations=844
    CMatrix Full
     1.0086 -0.0148 -0.0317 -0.0443
    -0.0148  1.0169 -0.0062  0.0068
    -0.0317 -0.0062  0.9342  0.0596
    -0.0443  0.0068  0.0596  0.9697
    Means
     0.0139 -0.0729  0.0722  0.0159
    Labels T1F1 T1F2 T2F1 T2F2

As will be seen in later chapters, this file is ready for immediate
use for drawing path diagrams in the Mx GUI or in an Mx script with
the `#include` command.
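
For readers without SAS, the same kind of file can be produced by any tool that can compute means and covariances. The sketch below is a Python illustration, not the appendix macro: the function names are hypothetical, the covariance uses the usual n - 1 denominator (an assumption about what is wanted), and the output simply mirrors the keywords and layout of the sample file shown above.

```python
from datetime import date


def covariance_matrix(rows):
    """Return (covariance matrix, means) for rows of equal length, n - 1 denominator."""
    n = len(rows)
    k = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    cov = [
        [sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
         for j in range(k)]
        for i in range(k)
    ]
    return cov, means


def mx_dat_text(rows, labels, stamp=None):
    """Format family-level rows as text laid out like the sample .dat file."""
    if stamp is None:
        # Date stamp in the style of the sample (e.g. 03FEB1998); locale dependent.
        stamp = date.today().strftime("%d%b%Y").upper()
    cov, means = covariance_matrix(rows)
    lines = [
        "!",
        f"! Mx dat file created by Python on {stamp}",
        "!",
        f"Data NInputvars={len(labels)} NObservations={len(rows)}",
        "CMatrix Full",
    ]
    lines += [" ".join(f"{v:.4f}" for v in row) for row in cov]
    lines.append("Means")
    lines.append(" ".join(f"{m:.4f}" for m in means))
    lines.append("Labels " + " ".join(labels))
    return "\n".join(lines) + "\n"


# Hypothetical family-level rows: one score for twin 1 and one for twin 2.
example = mx_dat_text([(1.0, 2.0), (2.0, 4.0), (3.0, 9.0)],
                      ["T1F1", "T2F1"], stamp="03FEB1998")
```

Writing `example` to a file with a `.dat` extension gives one file per group; whether Mx accepts every detail of this layout should be checked against the Mx documentation before relying on it.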

Jeff Lessem
2002-03-21