MZ.RAW in the following way:

    3 2
    3 3
    . .
    . .
    . .
    2 1
    1 2

We can use ``free format'' to read these data. Free format means that there is at least one space or end-of-line character between consecutive data items. These data could be entered using any simple text editor such as the Norton Editor. If a word processor such as WordPerfect or Microsoft Word were used, it would be necessary to save the file as a DOS or ASCII text file. Next, we would prepare an ASCII file containing the PRELIS commands to read these data and compute the means and covariances. We refer to files containing program commands as `scripts'; the PRELIS script in this case might look like this:

    Simple prelis example to compute MZ covariances
    DA NI=2 NO=0
    LA
    Twin1 Twin2
    COntinuous Twin1 Twin2
    RAw FIle=MZ.RAW
    OU SM=MZ.COV MA=CM

The first line is simply a title. PRELIS will treat all lines as part of the title until a line beginning with DA is encountered. The DA line is used to specify basic features of the input (raw) data such as the number of input variables (NI) and the number of observations. Here we have specified the number of observations as zero (NO=0), which asks PRELIS to count the number of cases for us. The next two lines of the script supply labels (LA) for the variables; these are optional but highly recommended when more than a few variables are to be read. Next, we define the variables Twin1 and Twin2 as continuous. By default, PRELIS 2 will treat any variable with fewer than 15 categories as ordinal. Although this is a reasonable statistical approach, it is not what we want for the purposes of this example. The next line in the script (beginning RA) tells PRELIS where to find the data, and the last line signifies the end of the script and requests that the covariances (MA=CM) be saved in the file MZ.COV. This output file is created by PRELIS; it is also in ASCII format and looks like this:

    (6D13.6)
      .106667D+01  .800000D+00  .106667D+01

The first line of the file contains a FORTRAN format for reading the data. The reader is referred to almost any text on FORTRAN, including User's Guides, for a detailed description of formats. The format used here is D format, for double precision. The 3 characters after the D give the power of 10 by which the printed number should be multiplied, so our .106667D+01 is really $0.106667 \times 10^{1} = 1.06667$. This file can be read using the line

    CMatrix File=MZ.COV

within an Mx script; Mx by default expects only the lower triangle of covariance matrices to be supplied.
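
To make the D format and the lower-triangle layout concrete, the following Python sketch parses a file laid out like MZ.COV and rebuilds the full symmetric matrix. It is an illustration only and is not part of PRELIS or Mx: the names parse_d and read_lower_triangle are hypothetical, and the sketch assumes that the numbers are separated by whitespace rather than applying the FORTRAN format given on the first line.

```python
def parse_d(token):
    """Convert a FORTRAN D-format number such as .106667D+01 to a float."""
    # Python's float() understands E exponents, so swap the D for an E.
    return float(token.upper().replace("D", "E"))


def read_lower_triangle(file_text):
    """Parse a covariance file and return the full symmetric matrix.

    Assumes the first line holds the FORTRAN format and the remaining
    lines hold the lower triangle, row by row, separated by whitespace.
    """
    lines = file_text.strip().splitlines()
    values = [parse_d(tok) for tok in " ".join(lines[1:]).split()]

    # A k-by-k lower triangle holds k * (k + 1) / 2 numbers; solve for k.
    k = 0
    while k * (k + 1) // 2 < len(values):
        k += 1
    if k * (k + 1) // 2 != len(values):
        raise ValueError("value count does not fill a lower triangle")

    # Copy each value into both halves of the symmetric matrix.
    matrix = [[0.0] * k for _ in range(k)]
    position = 0
    for i in range(k):
        for j in range(i + 1):
            matrix[i][j] = matrix[j][i] = values[position]
            position += 1
    return matrix


# Contents of the output file shown in the text.
MZ_COV = """(6D13.6)
  .106667D+01  .800000D+00  .106667D+01
"""

if __name__ == "__main__":
    for row in read_lower_triangle(MZ_COV):
        print(row)
```

The three numbers in the file fill a 2 x 2 matrix: the two variances lie on the diagonal and the single covariance appears in both off-diagonal cells.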
Suppose that, instead of just two variables, we had a data file with 20 variables per subject, with two lines for a twin pair. Suppose also that one of the variables identifies the zygosity of the pair, that we wish to select only those pairs where zygosity is 1, and that we want the covariances of only four of the variables. We could read these data into PRELIS using a FORTRAN format statement given explicitly in the PRELIS script. The script might look like this:

    PRELIS script to select MZ's and compute covariances of 4 variables
    DA NI=40 NO=0
    LA
    Zygosity Twin1P1 Twin1P2 Twin2P1 Twin2P2
    RAw FIle=MZ.RAW FO
    (3X,F1.0,2x,F5.0,12X,F5.0/6X,F5.0,12X,F5.0)
    SD Zygosity=1
    OU SM=MZ.COV MA=CM

The SD command selects cases where zygosity is 1, and deletes zygosity from the list of variables to be analyzed. Note that the FORTRAN format implicitly skips all the irrelevant variables, retaining only five (as specified by the F1.0 and F5.0 fields). Although we could have started with a more complete list of variables, read them in with an appropriate FORMAT, and used the PRELIS command SD to delete those we didn't want, it is more efficient to save the program the trouble of reading these data by adjusting our NI and format statement. On the other hand, if the data file is not large or if a powerful computer is available, it may be better to use SD to save user time spent modifying the script.
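
As a rough illustration of what the format statement and the SD command do together, the following Python sketch reads the same column positions from a pair of lines and keeps only the pairs with zygosity 1. It is a sketch under stated assumptions, not a description of how PRELIS works internally: the function names read_pair and select_mz and the example data lines are hypothetical, and every field is assumed to contain a number.

```python
def read_pair(line1, line2):
    """Read one twin pair as (3X,F1.0,2X,F5.0,12X,F5.0/6X,F5.0,12X,F5.0)."""
    zygosity = float(line1[3:4])    # 3X skips columns 1-3, F1.0 reads column 4
    twin1_p1 = float(line1[6:11])   # 2X, then F5.0 reads columns 7-11
    twin1_p2 = float(line1[23:28])  # 12X, then F5.0 reads columns 24-28
    twin2_p1 = float(line2[6:11])   # "/" moves to the next line; 6X, F5.0
    twin2_p2 = float(line2[23:28])  # 12X, then F5.0 reads columns 24-28
    return zygosity, [twin1_p1, twin1_p2, twin2_p1, twin2_p2]


def select_mz(lines):
    """Keep pairs with zygosity 1 and drop the zygosity variable."""
    selected = []
    for first in range(0, len(lines) - 1, 2):
        zygosity, scores = read_pair(lines[first], lines[first + 1])
        if zygosity == 1:
            selected.append(scores)
    return selected


# Hypothetical data, built field by field so the column widths are visible.
FILLER = " 99 99 99 99"  # 12 columns of variables that the format skips

EXAMPLE_LINES = [
    "001" + "1" + "  " + " 12.5" + FILLER + "  7.0",  # pair 1, twin 1
    "001" + "   " + " 11.0" + FILLER + "  6.5",       # pair 1, twin 2
    "002" + "2" + "  " + "  9.0" + FILLER + "  8.0",  # pair 2, twin 1
    "002" + "   " + " 10.0" + FILLER + "  7.5",       # pair 2, twin 2
]

if __name__ == "__main__":
    for scores in select_mz(EXAMPLE_LINES):
        print(scores)
```

Only the first hypothetical pair has zygosity 1, so only its four scores are retained; the second pair is dropped, and the zygosity value itself is not passed on.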