
4 Terminology for Types of Correlation

One of the difficulties encountered by the newcomer to statistics is the wide variety of terms used for correlation coefficients. There are many measures of association between variables; here we confine ourselves to the parametric statistics computed under normal distribution theory. These statistics correspond most naturally to our genetic theory, in which we assume that a large number of independent genetic and environmental factors give rise to variation ("multifactorial inheritance"). Table 2.2 shows the name given to the correlation coefficient calculated under normal distribution theory, according to whether each variable has two categories (dichotomous), several categories (polychotomous), or an infinite number of categories (continuous).

If both variables are dichotomous, the correlation is called a tetrachoric correlation, provided that it is calculated using the bivariate normal integration approach described in Section 2.3 above. If we instead apply the Pearson product moment formula (described in Section 2.2.1 above) to the two dichotomous variables, we have computed a phi coefficient, which will probably underestimate the population correlation in liability. Because the tetrachoric and polychoric correlations are calculated with the same method, some authors refer to the tetrachoric as a polychoric; likewise, some use polyserial in place of biserial. As we shall see, the theory behind all these statistics is essentially the same.
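
The contrast between the phi coefficient and the tetrachoric correlation can be illustrated with a small sketch. It is not the estimation procedure used by PRELIS or Mx; it is a minimal illustration that assumes a 2x2 table of counts with no empty margins, places each threshold from the marginal proportions, and then searches by bisection for the correlation whose bivariate normal upper-quadrant probability matches the observed proportion scoring 1 on both variables. The function names and the cell labels a, b, c, d are hypothetical, introduced only for this example.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def phi_coefficient(a, b, c, d):
    """Pearson product moment correlation of two 0/1 variables.

    Cells (an assumed layout): a = (x=0, y=0), b = (x=0, y=1),
    c = (x=1, y=0), d = (x=1, y=1).
    """
    return (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))


def upper_quadrant_prob(t_x, t_y, rho, steps=2000):
    """P(X > t_x, Y > t_y) for a standard bivariate normal with correlation rho.

    Integrates pdf(x) * P(Y > t_y | X = x) over x with Simpson's rule;
    steps must be even.
    """
    upper = max(t_x, 0.0) + 8.0  # the normal density is negligible beyond here
    h = (upper - t_x) / steps
    scale = sqrt(1.0 - rho * rho)
    total = 0.0
    for i in range(steps + 1):
        x = t_x + i * h
        value = STD_NORMAL.pdf(x) * (1.0 - STD_NORMAL.cdf((t_y - rho * x) / scale))
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * value
    return total * h / 3.0


def tetrachoric(a, b, c, d, iterations=50):
    """Bisection estimate of the correlation in liability from a 2x2 table."""
    n = a + b + c + d
    t_x = STD_NORMAL.inv_cdf((a + b) / n)  # threshold on x: P(X < t_x) = P(x = 0)
    t_y = STD_NORMAL.inv_cdf((a + c) / n)  # threshold on y: P(Y < t_y) = P(y = 0)
    target = d / n                         # observed proportion with x = 1 and y = 1
    lo, hi = -0.999, 0.999
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        # The upper-quadrant probability increases with rho, so bisection applies.
        if upper_quadrant_prob(t_x, t_y, mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    counts = (400, 200, 200, 400)  # made-up counts, for illustration only
    print("phi         =", round(phi_coefficient(*counts), 3))
    print("tetrachoric =", round(tetrachoric(*counts), 3))
```

For these made-up counts both thresholds fall at zero, where the upper-quadrant probability is 1/4 + arcsin(rho)/(2 pi). A proportion of 1/3 in the upper quadrant therefore corresponds to a liability correlation of 0.5, whereas the phi coefficient for the same table is 1/3, which illustrates the underestimation described above.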


Table 2.2: Classification of correlations according to their observed distribution.
Measurement                Two categories   Three or more categories   Continuous
Two categories             Tetrachoric      Polychoric                 Biserial
Three or more categories   Polychoric       Polychoric                 Polyserial
Continuous                 Biserial         Polyserial                 Product moment
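
The naming scheme in Table 2.2 is symmetric in the two variables, so it can be written as a small lookup. The sketch below is only a restatement of the table; the labels "two", "many" and "continuous" and the function name are hypothetical choices made for this example.

```python
# Keys are alphabetically sorted pairs of measurement types, so that the
# order in which the two variables are given does not matter.
TABLE_2_2 = {
    ("two", "two"): "tetrachoric",
    ("many", "two"): "polychoric",
    ("many", "many"): "polychoric",
    ("continuous", "two"): "biserial",
    ("continuous", "many"): "polyserial",
    ("continuous", "continuous"): "product moment",
}


def correlation_name(kind_a, kind_b):
    """Return the Table 2.2 name for a pair of measurement types.

    Each argument is "two" (dichotomous), "many" (three or more
    categories) or "continuous".
    """
    return TABLE_2_2[tuple(sorted((kind_a, kind_b)))]


if __name__ == "__main__":
    print(correlation_name("two", "continuous"))
    print(correlation_name("many", "two"))
```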


Jeff Lessem 2002-03-21