ANalysis Of VAriance = ANOVA

There are actually several different kinds of Analysis of Variance. We shall discuss the one that is the most direct generalization of the Independent t-test. In Independent t-tests, of course, we tested whether 2 sample means were equal or unequal. In ANOVA we will test whether 2 or more sample means are all equal, or whether one or more of them is not equal to the others. One way to show the formal hypotheses is:

H0: Mean1 = Mean2 = Mean3 = Mean4
H1: H0 is false

It may seem that we could solve the problem about equality of means by doing several t-tests. However, doing multiple t-tests changes the value of alpha away from what we intend. For example, if we are comparing 4 Independent sample means, set the value for alpha at .05, and then proceed to do the six pairwise t-tests, our actual, experiment-wise alpha level is:

1 - (1 - .05)^6 = 1 - (.95)^6 = 1 - .735 = .265

So, we set out to have alpha at .05 but have ended up with a very different (& much higher) alpha of .265.

Sir Ronald Fisher figured out how to avoid this difficulty when he invented ANOVA. Fisher says: first do an ANOVA. IF and only IF the ANOVA rejects the null H do we proceed to do t-tests. If ANOVA says to accept the null H, then the computations are over. We are through: do not do t-tests. If ANOVA says Reject H0 and Accept H1, then we know that everything is NOT equal. Something is not equal -- maybe only one among the many. Then we have to find what isn't equal, using t-tests.

General: When we have several related data sets, we can temporarily collapse them into one, and compute an overall sum, an overall (Grand) Mean, an overall sum of squared X's, and an overall variance. The essence of ANOVA is that this overall variance can be broken down into parts. The parts represent the variance of the separate Sample Means around the Grand Mean.

(Drawing: the separate group means scattered around the Grand Mean.)

We will practice with some ANOVA calculations shortly. ANOVA can analyze for differences among 2, 3, 4 or more samples. So, when we have two Independent Samples, we CAN use ANOVA if we want to, instead of using the t-test. For 2 groups, t and ANOVA are equivalent.

For many ANOVA problems, it is convenient to use a double subscript for the individual scores, as in Xij. This is because the data usually form a matrix, with rows and columns, as:

         Col-1   Col-2   Col-3   Col-4
Row-1      67      92     101      66
Row-2      65      93      89      51
Row-3      59      77      94      62
Row-4      63      85      91      46
Row-5      51      89      92      55
Row-6      64      91     105      41
Row-7      68      88      99      51
Row-8      66      85      97      46

When reading the double subscript, we always read the row first, then the column (mnemonic: 'Roman Catholic'), so

X51 would be the score 51
X23 would be the score 89
X84 would be the score 46

Quite often, we use the Columns to represent different experimental treatments (drug amounts, practice time, etc.) and the rows to represent the subjects who had that particular treatment. So our matrix could represent 0 grams, 1 gram, 2 grams, or 4 grams of some experimental drug we administered. The 0-drug group would be the CONTROL Group -- i.e., controlling for handling, injecting, etc. And each of the Groups here had 8 subjects who had that particular drug treatment.
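To make two of the ideas above concrete -- the inflation of experiment-wise alpha, and reading a score by its double subscript -- here is a minimal Python sketch (the variable names are mine, not part of the notes):

```python
# 1) Experiment-wise (family-wise) alpha when several independent
#    t-tests are each run at the per-test alpha level.
from math import comb

alpha = 0.05
n_tests = comb(4, 2)                  # 6 pairwise t-tests among 4 groups
print(1 - (1 - alpha) ** n_tests)     # ~0.265, not the intended .05

# 2) The data matrix; Xij is read row first, then column
#    ("Roman Catholic"), with 1-based subscripts in the notes.
X = [[67, 92, 101, 66],
     [65, 93,  89, 51],
     [59, 77,  94, 62],
     [63, 85,  91, 46],
     [51, 89,  92, 55],
     [64, 91, 105, 41],
     [68, 88,  99, 51],
     [66, 85,  97, 46]]

print(X[5-1][1-1], X[2-1][3-1], X[8-1][4-1])   # X51=51, X23=89, X84=46
```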
Let's compute the Group (Column) sums and means:

           Col-1   Col-2   Col-3   Col-4
(g drug)     0       1       2       4
Row-1       67      92     101      66
Row-2       65      93      89      51
Row-3       59      77      94      62
Row-4       63      85      91      46
Row-5       51      89      92      55
Row-6       64      91     105      41
Row-7       68      88      99      51
Row-8       66      85      97      46
Sums       503     700     768     418
Means    62.88   87.50   96.00   52.25

By-the-way: Sum of Xij² = 189,557

The null H says these means are all the same (within sampling). The alternative H says they are not.

Let's compute the Grand Sum = 503 + 700 + 768 + 418 = 2,389
Grand Mean = Grand Sum / Total n = 2,389 / 32 = 74.66

Note: We can see that the sample means are fluctuating around the Grand Mean. Is this fluctuation just sampling error? Or are the drug-group means statistically different from each other? This is what ANOVA helps us answer.

PROCEDURES:

Compute the Correction Factor, C.
C = (Grand Sum)² / Total n
C = (2,389)(2,389) / 32 = 178,354

C is a quantity that 'centers' the distribution of scores on 0. In the defining equation for variance, the numerator shows Σ(X - X̄)². The '- X̄' bit did the centering on 0. C does the same thing.

Let's compute the sum of X's squared, i.e., square each X, and then sum the squared X's:
ΣXij² = 189,557

Total Sum of Squares = SS
SS = ΣXij² - C = 189,557 - 178,354 = 11,203
Variance = SS / df = 11,203 / 31 = 361.39

This overall variance is what ANOVA will be breaking down into parts -- including the parts ascribable to the different drug conditions. The ANOVA procedure will assess whether the variation of the drug-group means about the Grand Mean is, or is not, statistically significant.

The individual scores within a drug group vary about their own group mean; this is just sampling variation. Then, in addition, the group means vary about the Grand Mean; this is the effect we are testing for. ANOVA assesses whether the variation of the Group Means about the Grand Mean is just random, or whether it is statistically significant. ANOVA accomplishes this by forming a ratio of the variance due to the Group means fluctuating around the Grand Mean, to the average variance (sampling variation) that occurred within the Groups themselves.

The test statistic for ANOVA is called F (honoring Fisher). To use a table of F, you need two things: the df for the numerator of the F-ratio (i.e., # of Groups minus 1), and the df for the denominator. The df for the denominator is essentially the n of the data points that estimate the sampling variation -- the random or unexplained part.

When doing calculations for ANOVA, we calculate one or more special variances, called mean squares. The variance due to the fluctuation of Group means about the Grand Mean is called mean square - between, or msB. (Some authors are more grammatical and call this component mean square - among, or msA, when dealing with more than two groups.) The average variance within groups, around each separate group mean, is called mean square - within, or msW. It is a measure of the random variation present. We then compute an F-ratio:

F = msB / msW

-- and then look it up in the F-table.
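Before moving on, here is a short Python sketch that reproduces these overall quantities from the raw data (again, the variable names are mine):

```python
# Overall quantities for the drug data: group sums and means,
# Grand Mean, Correction Factor C, Total SS, and overall variance.
X = [[67, 92, 101, 66],
     [65, 93,  89, 51],
     [59, 77,  94, 62],
     [63, 85,  91, 46],
     [51, 89,  92, 55],
     [64, 91, 105, 41],
     [68, 88,  99, 51],
     [66, 85,  97, 46]]

cols = list(zip(*X))                               # regroup scores by drug group
sums = [sum(c) for c in cols]                      # [503, 700, 768, 418]
means = [s / len(c) for s, c in zip(sums, cols)]   # [62.88, 87.5, 96.0, 52.25]

n_total = sum(len(c) for c in cols)                # 32
grand_sum = sum(sums)                              # 2,389
grand_mean = grand_sum / n_total                   # 74.66
C = grand_sum ** 2 / n_total                       # ~178,354
sum_sq = sum(x * x for row in X for x in row)      # 189,557
TSS = sum_sq - C                                   # ~11,203
variance = TSS / (n_total - 1)                     # ~361.39
print(sums, round(grand_mean, 2), round(C), round(TSS), round(variance, 2))
```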
CALCULATION

Calculate Sum of Squares Between = SSB
SSB = Σ(Tj²/nj) - C   ** the Tj's are the column sums **
SSB = (503)(503)/8 + (700)(700)/8 + (768)(768)/8 + (418)(418)/8 - 178,354
    = 31,626 + 61,250 + 73,728 + 21,840 - 178,354
SSB = 188,444 - 178,354 = 10,090

There are J = 4 Groups, so dfB = J - 1 = 3
msB = SSB / dfB = 10,090 / 3 = 3,363.3

** Now we calculate msW **
TSS = SSB + SSW
SSW = TSS - SSB = 11,203 - 10,090 = 1,113

Calculate df:
There are 32 data points.
We used up 3 df for the Groups.
We used up 1 df for the Grand Mean.
dfW = 32 - 3 - 1 = 28
msW = SSW / dfW = 1,113 / 28 = 39.75

We are ready to calculate F(3,28):
F(3,28) = msB / msW = 3,363.3 / 39.75 = 84.61

Examine Table A.4 to get the critical F-value. For alpha = .05 with 3 and 28 df, the critical value is 2.95. Our obtained F is far greater. We reject H0 and accept HA. We conclude that the drug Groups are not equivalent.

We would now have to search further, using t-tests. If we had accepted H0, we would end right here -- no t-tests. Since we have a significant result, we now want to know which Groups differ. Since we have done several of these Independent Groups t-tests in class, I did these by computer, to save class time:

1 vs 2   t =  -9.18   SED = 2.682   p < .001
1 vs 3   t = -12.03   SED = 2.754   p < .001
1 vs 4   t =   2.97   SED = 3.572   p < .02
2 vs 3   t =  -3.19   SED = 2.666   p < .01
2 vs 4   t =  10.06   SED = 3.504   p < .001
3 vs 4   t =  12.29   SED = 3.559   p < .001

If we plot the Group totals, we get a good picture of what is happening:

(Plot: the four Group sums -- 503, 700, 768, 418 -- one X per Group, for Groups 1 through 4. Groups 2 and 3 lie well above the Control Group; Group 4 lies below it.)

Drug groups 2 and 3 are increasing the score significantly, but drug group 4's score decreases significantly below that of the Controls. This is not an unusual finding with drug studies.
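To check this arithmetic by machine, here is a minimal Python sketch (it assumes SciPy is available; because it works in exact arithmetic, the F-ratio comes out 84.67 rather than the rounded 84.61 above):

```python
# One-way ANOVA on the drug data: SSB, SSW, mean squares, and F.
from scipy.stats import f_oneway

groups = [[67, 65, 59, 63, 51, 64, 68, 66],     # 0 g (Control)
          [92, 93, 77, 85, 89, 91, 88, 85],     # 1 g
          [101, 89, 94, 91, 92, 105, 99, 97],   # 2 g
          [66, 51, 62, 46, 55, 41, 51, 46]]     # 4 g

n_total = sum(len(g) for g in groups)                 # 32
grand_sum = sum(sum(g) for g in groups)               # 2,389
C = grand_sum ** 2 / n_total                          # Correction Factor
TSS = sum(x * x for g in groups for x in g) - C       # ~11,203
SSB = sum(sum(g) ** 2 / len(g) for g in groups) - C   # ~10,091
SSW = TSS - SSB                                       # ~1,112
msB = SSB / (len(groups) - 1)                         # dfB = 3
msW = SSW / (n_total - len(groups))                   # dfW = 28
print(round(msB / msW, 2))                            # 84.67

# Cross-check against SciPy's built-in one-way ANOVA:
F, p = f_oneway(*groups)
print(round(F, 2), p)                                 # 84.67, p far below .001
```

SciPy's f_oneway carries out exactly this one-way, between-groups ANOVA, so it makes a handy cross-check on the hand calculation.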