Analysis of Variance (ANOVA)

§       Consider a situation in which we want to examine the effects of a drug but we are not sure what the dosage, if any, should be.

§       In this case, we might want to carry out an experiment in which we form 5 groups of subjects: one group (the control group) would receive a placebo and the remaining four groups would receive different dosages of the drug.

§       We would then measure all 5 groups in terms of some dependent variable (e.g., depression).

§       Now we are interested in 5 populations (rather than 2 in the case of the t-test).

§       We want to know if there are significant differences among the groups.

§       In other words, we want to know whether the groups are sampled from populations with different means.

§       The Analysis of Variance (ANOVA) deals with this situation.

§       Suppose that there are k (# of groups) populations of interest.

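§       The method of analysis of variance enables us (under suitable conditions) to test the hypotheses:

$$H_0: \mu_1 = \mu_2 = \dots = \mu_k \qquad H_1: \text{not all of the population means are equal}$$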

§       Hypothesis testing in ANOVA is about whether the means of the samples differ more than you would expect if the null hypothesis were true.

§       This question about means is answered by analyzing variances.

§       Among other reasons, you focus on variances because when you want to know how several means differ, you are asking about the variances among those means.


§       Within ANOVA, as with the t test, you do not know the true population variances.

§       As with the t test, you can estimate the variance of each of the populations from the scores in the samples.

§       Also, as with the t test, you assume in ANOVA that all populations have the same variance.

§       Because of this assumption, you can average the estimates from each sample into a single pooled estimate.

§       Null and Alternative Hypotheses in ANOVA:

§       Let us imagine that we are dealing with three groups - say a control group and two drug groups.

§       Now, we could use three different t-tests to compare the groups, with the following null hypotheses:

$$H_0: \mu_1 = \mu_2 \qquad H_0: \mu_1 = \mu_3 \qquad H_0: \mu_2 = \mu_3$$

Why can't you use multiple t-tests instead of ANOVA?

§       Because we are conducting multiple tests, there is a chance that one of the null hypotheses could be rejected even if they are all correct.

§       In other words, by carrying out multiple tests, we increase the chance of making a type I error (the null hypothesis is true, but we reject it; falsely rejecting the null).


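1) Multiple t-tests are subject to "familywise error": across c tests, the overall alpha is

$$\alpha_{overall} = 1 - (1 - \alpha)^c$$

2) Here alpha (α) is the probability of making a Type I error in each of those t-tests, and c is the number of t-tests.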

EXAMPLE:

§       5 levels (5 groups)

§       Number of t-tests needed to compare each pair of the 5 groups = 10:
1 vs. 2, 1 vs. 3, 1 vs. 4, 1 vs. 5, 2 vs. 3, 2 vs. 4, 2 vs. 5, 3 vs. 4, 3 vs. 5, and 4 vs. 5

§       (# groups)(# groups - 1)/2 = number of t-tests; here (5)(4)/2 = 10

§       If we had 10 t-tests at alpha = .05,

overall alpha = 1 - (1 - .05)^10 = .40, or a 40% chance of a Type I error occurring in at least one of those 10 t-tests!
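The arithmetic in this example can be checked with a short Python sketch. The function name `familywise_alpha` is illustrative only, and the formula treats the tests as independent, which pairwise t-tests on shared groups only approximate.

```python
from math import comb

def familywise_alpha(alpha: float, n_tests: int) -> float:
    """Chance of at least one Type I error across n_tests independent tests."""
    return 1 - (1 - alpha) ** n_tests

n_groups = 5
# Number of pairwise comparisons: (# groups)(# groups - 1) / 2
n_tests = comb(n_groups, 2)
overall = familywise_alpha(0.05, n_tests)

print(n_tests)             # 10
print(round(overall, 2))   # 0.4
```

With only 3 groups the same function gives 3 tests and an overall alpha of about .14, so the problem grows quickly with the number of groups.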

The way you get around this problem is by using an ANOVA.

§       ANOVA: one overall test at alpha = .05 keeps the overall alpha at .05.

§       Although there are procedures for compensating for this problem, in ANOVA we usually start with an overall test that concerns all the means and use (as described above) the following hypotheses:

$$H_0: \mu_1 = \mu_2 = \dots = \mu_k \qquad H_1: \text{not all of the population means are equal}$$

§       The following graph illustrates the null hypothesis in a case in which we have 5 populations.

§       Each of the histograms (i.e., curves) represents a population.

§       The means of the 5 populations are equal.

·       The graph also illustrates one of the assumptions underlying ANOVA, the assumption of HOMOGENEITY OF VARIANCE: the variances of the k populations (in this case k = 5) are equal.

·       The other assumptions are that the populations are normally distributed and that the residual errors (the difference between a particular value in a population and the mean of that population) are independent.

Types of ANOVA:

§       One Way (One Factor) ANOVA:

       One independent variable (IV) with more than two levels (more than 2 groups)

§       Two Way (Two Factor) ANOVA:

     Two independent variables (simultaneous) with each    having two or more levels

§       You can also have three-way ANOVAs, four-way ANOVAs, and so on.

§       The number refers to the number of independent variables (or factors) you are looking at.

The Basic Idea Underlying ANOVA

ANOVA example:

§       Craik and Lockhart (Journal of Verbal Learning and Verbal Behavior, Vol. 11, 1972) proposed a memory model in which memory for verbal material depends on initial processing or encoding.

§       To evaluate this model, Eysenck (Developmental Psychology, Vol. 10, 1974):

§       Randomly assigned 50 subjects to one of 5 groups (10 subjects per group):

§       4 incidental learning groups (the subjects in these groups were not told they would have to recall the items) and one intentional learning group:

  • counting: count number of letters in each of a list of words
  • rhyming: read each word and think of a word that rhymed
  • adjective: give an adjective to modify each word
  • imagery: form a vivid image for each word
  • intentional: memorize each word for later recall

The following data represents the scores obtained. (They have actually been changed a little for explanatory purposes)

Number of words recalled as a function of level of processing

             Counting   Rhyming   Adjective   Imagery   Intentional
Words            9         7         11         12          10
Recalled         8         9         13         11          19
                 6         6          8         16          14
                 8         6          6         11           5
                10         6         14          9          10
                 4        11         11         22          11
                 6         6         13         12          14
                 5         3         13         10          15
                 7         8         10         19          11
                 7         7         11         11          11

Mean           7.0       6.9       11.0       13.3        12.0
Variance      3.33      4.54       6.22      18.23       14.00

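§       The mean of all 50 values in the data set is called the grand mean. Here it is 502/50 = 10.04.

§       Because the groups are all the same size, you can also take the average of the group means to get the grand mean: (7.0 + 6.9 + 11.0 + 13.3 + 12.0)/5 = 10.04.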

 

The following graph shows the data in the above table:


·       First there is considerable dispersion of values within each group.

·       However, there is also some dispersion between groups as measured by the differences among the means of the 5 groups.

·       The following graph shows just the means of each of the 5 groups:

·       If the null hypothesis is true (i.e., µ1 = µ2 = ... = µk, so the 5 samples are drawn from 5 populations with equal means and there are no differences between the 5 groups):

               - Then both the variation within groups and the variation between groups reflect random or chance variation, which we can call experimental error.

               - We therefore have two estimates of experimental error. The basic idea of ANOVA is to form a ratio of these two estimates. If the null hypothesis is true, the two estimates should be similar.

·       If the null hypothesis is false (at least one of the groups differs from the others):

               - Then we will find that the variation between groups is large compared to the variation within groups.

               - We might then suspect that there are real differences between the groups.

               - In other words, we might be willing to reject the null hypothesis and argue that the samples (i.e., groups) are drawn from populations with different means.


The grand mean is: 502/50 = 10.04
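As a quick check, the grand mean can be computed both ways in Python. This is a minimal sketch using the scores from the table; the dictionary name `groups` is just a convenient label. The two routes agree only because every group has the same number of scores.

```python
from statistics import mean

groups = {
    "counting":    [9, 8, 6, 8, 10, 4, 6, 5, 7, 7],
    "rhyming":     [7, 9, 6, 6, 6, 11, 6, 3, 8, 7],
    "adjective":   [11, 13, 8, 6, 14, 11, 13, 13, 10, 11],
    "imagery":     [12, 11, 16, 11, 9, 22, 12, 10, 19, 11],
    "intentional": [10, 19, 14, 5, 10, 11, 14, 15, 11, 11],
}

# Route 1: mean of all 50 scores.
all_scores = [x for scores in groups.values() for x in scores]
grand_mean = mean(all_scores)

# Route 2: mean of the five group means (valid here because n is equal).
group_means = [sum(scores) / len(scores) for scores in groups.values()]
mean_of_means = mean(group_means)

print(round(grand_mean, 2), round(mean_of_means, 2))  # 10.04 10.04
```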

The Logic of ANOVA

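·       If H0 is true (i.e., µ1 = µ2 = ... = µk), then the between-samples and within-samples estimates of the population variance should be about equal, so their ratio, F, should be close to 1.

More specifically, F is the ratio of the between-samples estimate of the population variance to the within-samples estimate of the population variance:

$$F = \frac{MS_{between}}{MS_{within}}$$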

 

 

·       Each sample variance is an estimate of the variance of the population from which the sample was drawn. Thus $s_1^2$ estimates $\sigma_1^2$, $s_2^2$ estimates $\sigma_2^2$, and so on up to $s_k^2$ and $\sigma_k^2$.

·       Because of the assumption of homogeneity of variance, we have:

$$\sigma_1^2 = \sigma_2^2 = \dots = \sigma_k^2 = \sigma^2$$

·       Thus, we have k different estimates of the same population variance.

·       It makes sense to take an average of these estimates as our within-samples estimate (with equal sample sizes):

$$MS_{within} = \frac{s_1^2 + s_2^2 + \dots + s_k^2}{k}$$

·       where k is the number of samples or groups.

·       This is referred to as MSwithin or MSerror (mean squares within, or error): the mean of the squared deviations of the scores around the mean of their own group.

·       Importantly, MSwithin does not depend on whether H0 is true or false.

·       This is because the variance within the populations from which the samples (or groups) are drawn is independent of differences between the populations.

·       In other words, MSwithin reflects only the variability due to individual differences among subjects and measurement error.

·       This is why MSwithin is also called MSerror.
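The averaging step can be sketched in Python using the Eysenck scores from the example above. This is an illustration under the equal-sample-size assumption, not a general routine; the variable names are chosen here for readability.

```python
from statistics import mean, variance

groups = {
    "counting":    [9, 8, 6, 8, 10, 4, 6, 5, 7, 7],
    "rhyming":     [7, 9, 6, 6, 6, 11, 6, 3, 8, 7],
    "adjective":   [11, 13, 8, 6, 14, 11, 13, 13, 10, 11],
    "imagery":     [12, 11, 16, 11, 9, 22, 12, 10, 19, 11],
    "intentional": [10, 19, 14, 5, 10, 11, 14, 15, 11, 11],
}

# Sample variance (n - 1 in the denominator) of each group: k estimates
# of the common population variance.
sample_variances = {name: variance(scores) for name, scores in groups.items()}

# With equal group sizes, MSwithin is the plain average of those estimates.
ms_within = mean(list(sample_variances.values()))

print(round(ms_within, 2))  # 9.27
```

Nothing in this calculation uses the differences between the group means, which is why MSwithin is unaffected by whether H0 is true.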

 

2. Between-Samples Estimate of Variability

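If the null hypothesis is true, the population variance $\sigma^2$ can also be estimated from the sample means.

From the central limit theorem we know that the standard deviation of the sampling distribution of means is:

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$

Squaring both sides and solving for the population variance gives $\sigma^2 = n\,\sigma_{\bar{X}}^2$. Estimating $\sigma_{\bar{X}}^2$ by the variance of the k sample means, $s_{\bar{X}}^2$, gives the between-samples estimate $n\,s_{\bar{X}}^2$, where n is the number of observations in each group.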

 

·       The between-samples estimate of the population variance is referred to as the mean squares between, or MSbetween.

·       The two estimates of population variance differ in one critical respect:

              - MSwithin is independent of whether H0 is true or false.

              - MSbetween is an estimate of the population variance only when H0 is true. When H0 is false, MSbetween also picks up the differences among the population means, so it tends to be larger than MSwithin.

·       Thus, if the two estimates agree (Within and Between estimates of population variability), we have support for retaining the H0

·       If the two estimates disagree we have support for rejecting the H0
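A minimal Python sketch of the between-samples estimate, again using the Eysenck scores and assuming equal group sizes (n = 10 per group). It multiplies the variance of the five sample means by n, following the central limit theorem argument above.

```python
from statistics import variance

groups = {
    "counting":    [9, 8, 6, 8, 10, 4, 6, 5, 7, 7],
    "rhyming":     [7, 9, 6, 6, 6, 11, 6, 3, 8, 7],
    "adjective":   [11, 13, 8, 6, 14, 11, 13, 13, 10, 11],
    "imagery":     [12, 11, 16, 11, 9, 22, 12, 10, 19, 11],
    "intentional": [10, 19, 14, 5, 10, 11, 14, 15, 11, 11],
}

n = 10  # observations per group (equal n assumed)
group_means = [sum(scores) / len(scores) for scores in groups.values()]

# Variance of the k sample means estimates sigma^2 / n when H0 is true,
# so multiplying by n gives an estimate of sigma^2 itself.
ms_between = n * variance(group_means)

print(round(ms_between, 2))  # 86.23
```

Here the between-samples estimate (about 86) is far larger than the within-samples estimate (about 9), which is the pattern expected when H0 is false.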

Properties of an F Distribution:

·       The F distribution is a family of probability distributions (like the t distribution) that depend on the degrees of freedom associated with the F-test.

·       The F-test has degrees of freedom for both the numerator and the denominator of the F-ratio.

Degrees of Freedom for a One-Way ANOVA

·       The degrees of freedom for the numerator (between groups) is:

(k – 1), called dfbetween

·        where k is the number of groups from which we draw samples (the # of groups).

·       The degrees of freedom for the denominator (within groups) is:

(N – k), called dfwithin

·        where N is the total number of observations in all groups or samples and k is the number of groups.

(In many textbooks, n is used to represent the number of observations in each group or sample and N represents the total number of observations.)

 

·       The degrees of freedom for the F-test are often written as:

df = (k - 1, N - k) or (dfbetween, dfwithin)

The following graph shows the F distribution for 4 and 45 degrees of freedom. (Note that these are the df from the Eysenck example.)

·       The curve represents that distribution of F values that would be obtained if the null hypothesis is true and we repeated the experiment (with the same df) many, many times.

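·       Recall that the F ratio is:

$$F = \frac{MS_{between}}{MS_{within}}$$

·       The value of F can never be negative (because a variance cannot be negative).

·       If the null hypothesis is true, the F value should be close to 1 most of the time. However, if the null is not true, we can expect larger F values, so only large values of F (the upper tail of the curve) count as evidence against H0. The F-test is therefore ALWAYS a NON-DIRECTIONAL test of the hypotheses, even though the rejection region lies in one tail.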
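The idea of repeating the experiment many times under a true null hypothesis can be imitated with a small simulation. The sketch below is only an illustration: the population mean (10), standard deviation (3), number of repetitions (2000), and random seed are arbitrary choices made here, and the cutoff 2.58 is the .05 critical value for (4, 45) df quoted later in these notes.

```python
import random

def sample_variance(xs):
    """Variance with n - 1 in the denominator."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def f_ratio(samples):
    """F = MSbetween / MSwithin for equal-sized samples."""
    n = len(samples[0])
    means = [sum(s) / n for s in samples]
    ms_between = n * sample_variance(means)
    ms_within = sum(sample_variance(s) for s in samples) / len(samples)
    return ms_between / ms_within

rng = random.Random(0)          # fixed seed so the run is repeatable
k, n, n_sims = 5, 10, 2000      # 5 groups of 10, as in the example
f_values = []
for _ in range(n_sims):
    # H0 is true by construction: every group comes from the same population.
    samples = [[rng.gauss(10.0, 3.0) for _ in range(n)] for _ in range(k)]
    f_values.append(f_ratio(samples))

mean_f = sum(f_values) / n_sims
share_above_crit = sum(f > 2.58 for f in f_values) / n_sims
print(f"mean F = {mean_f:.2f}, share above 2.58 = {share_above_crit:.3f}")
```

In a run like this the simulated F values are never negative, they cluster around 1, and roughly 5% of them exceed the critical value.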

Why F is always Non-Directional:

·       In testing the hypothesis of no difference between two means, a distinction was made between directional and nondirectional alternative hypotheses.

·       Such a distinction no longer makes sense when the number of means exceeds two.

·       A directional test is possible only in situations where there are only two ways (directions) that the null hypothesis could be false.

·       H0 may be false in any number of ways.

·       Two or more group means may be alike and the remainder differ, all may be different, and so on.

 

The following graph again shows an F curve with 4 and 45 df. The pink area represents that upper 5% of the area under the curve.

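§       The critical value of F associated with 4 and 45 degrees of freedom and a level of significance (alpha) of .05 is 2.58. The abbreviated table below has no row for 45, so it is read at 40, which gives 2.61.

§       A fuller table is also located in the appendix in the back of the book.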

F Table
Critical values for alpha equals .05.
Rows: dfW (df within, denominator). Columns: dfB (df between, numerator).

 dfW      1      2      3      4      5      6      7      8      9     10     12     15     20     24     30     40     60    120      ∞
   1  161.4  199.5  215.7  224.6  230.2  234.0  236.8  238.9  240.5  241.9  243.9  245.9  248.0  249.1  250.1  251.1  252.2  253.3  254.3
   2  18.51  19.00  19.16  19.25  19.30  19.33  19.35  19.37  19.38  19.40  19.41  19.43  19.45  19.45  19.46  19.47  19.48  19.49  19.50
   3  10.13   9.55   9.28   9.12   9.01   8.94   8.89   8.85   8.81   8.79   8.74   8.70   8.66   8.64   8.62   8.59   8.57   8.55   8.53
   4   7.71   6.94   6.59   6.39   6.26   6.16   6.09   6.04   6.00   5.96   5.91   5.86   5.80   5.77   5.75   5.72   5.69   5.66   5.63
   5   6.61   5.79   5.41   5.19   5.05   4.95   4.88   4.82   4.77   4.74   4.68   4.62   4.56   4.53   4.50   4.46   4.43   4.40   4.36
   6   5.99   5.14   4.76   4.53   4.39   4.28   4.21   4.15   4.10   4.06   4.00   3.94   3.87   3.84   3.81   3.77   3.74   3.70   3.67
   7   5.59   4.74   4.35   4.12   3.97   3.87   3.79   3.73   3.68   3.64   3.57   3.51   3.44   3.41   3.38   3.34   3.30   3.27   3.23
   8   5.32   4.46   4.07   3.84   3.69   3.58   3.50   3.44   3.39   3.35   3.28   3.22   3.15   3.12   3.08   3.04   3.01   2.97   2.93
   9   5.12   4.26   3.86   3.63   3.48   3.37   3.29   3.23   3.18   3.14   3.07   3.01   2.94   2.90   2.86   2.83   2.79   2.75   2.71
  10   4.96   4.10   3.71   3.48   3.33   3.22   3.14   3.07   3.02   2.98   2.91   2.85   2.77   2.74   2.70   2.66   2.62   2.58   2.54
  11   4.84   3.98   3.59   3.36   3.20   3.09   3.01   2.95   2.90   2.85   2.79   2.72   2.65   2.61   2.57   2.53   2.49   2.45   2.40
  12   4.75   3.89   3.49   3.26   3.11   3.00   2.91   2.85   2.80   2.75   2.69   2.62   2.54   2.51   2.47   2.43   2.38   2.34   2.30
  13   4.67   3.81   3.41   3.18   3.03   2.92   2.83   2.77   2.71   2.67   2.60   2.53   2.46   2.42   2.38   2.34   2.30   2.25   2.21
  14   4.60   3.74   3.34   3.11   2.96   2.85   2.76   2.70   2.65   2.60   2.53   2.46   2.39   2.35   2.31   2.27   2.22   2.18   2.13
  15   4.54   3.68   3.29   3.06   2.90   2.79   2.71   2.64   2.59   2.54   2.48   2.40   2.33   2.29   2.25   2.20   2.16   2.11   2.07
  16   4.49   3.63   3.24   3.01   2.85   2.74   2.66   2.59   2.54   2.49   2.42   2.35   2.28   2.24   2.19   2.15   2.11   2.06   2.01
  17   4.45   3.59   3.20   2.96   2.81   2.70   2.61   2.55   2.49   2.45   2.38   2.31   2.23   2.19   2.15   2.10   2.06   2.01   1.96
  18   4.41   3.55   3.16   2.93   2.77   2.66   2.58   2.51   2.46   2.41   2.34   2.27   2.19   2.15   2.11   2.06   2.02   1.97   1.92
  19   4.38   3.52   3.13   2.90   2.74   2.63   2.54   2.48   2.42   2.38   2.31   2.23   2.16   2.11   2.07   2.03   1.98   1.93   1.88
  20   4.35   3.49   3.10   2.87   2.71   2.60   2.51   2.45   2.39   2.35   2.28   2.20   2.12   2.08   2.04   1.99   1.95   1.90   1.84
  21   4.32   3.47   3.07   2.84   2.68   2.57   2.49   2.42   2.37   2.32   2.25   2.18   2.10   2.05   2.01   1.96   1.92   1.87   1.81
  22   4.30   3.44   3.05   2.82   2.66   2.55   2.46   2.40   2.34   2.30   2.23   2.15   2.07   2.03   1.98   1.94   1.89   1.84   1.78
  23   4.28   3.42   3.03   2.80   2.64   2.53   2.44   2.37   2.32   2.27   2.20   2.13   2.05   2.01   1.96   1.91   1.86   1.81   1.76
  24   4.26   3.40   3.01   2.78   2.62   2.51   2.42   2.36   2.30   2.25   2.18   2.11   2.03   1.98   1.94   1.89   1.84   1.79   1.73
  25   4.24   3.39   2.99   2.76   2.60   2.49   2.40   2.34   2.28   2.24   2.16   2.09   2.01   1.96   1.92   1.87   1.82   1.77   1.71
  26   4.23   3.37   2.98   2.74   2.59   2.47   2.39   2.32   2.27   2.22   2.15   2.07   1.99   1.95   1.90   1.85   1.80   1.75   1.69
  27   4.21   3.35   2.96   2.73   2.57   2.46   2.37   2.31   2.25   2.20   2.13   2.06   1.97   1.93   1.88   1.84   1.79   1.73   1.67
  28   4.20   3.34   2.95   2.71   2.56   2.45   2.36   2.29   2.24   2.19   2.12   2.04   1.96   1.91   1.87   1.82   1.77   1.71   1.65
  29   4.18   3.33   2.93   2.70   2.55   2.43   2.35   2.28   2.22   2.18   2.10   2.03   1.94   1.90   1.85   1.81   1.75   1.70   1.64
  30   4.17   3.32   2.92   2.69   2.53   2.42   2.33   2.27   2.21   2.16   2.09   2.01   1.93   1.89   1.84   1.79   1.74   1.68   1.62
  40   4.08   3.23   2.84   2.61   2.45   2.34   2.25   2.18   2.12   2.08   2.00   1.92   1.84   1.79   1.74   1.69   1.64   1.58   1.51
  60   4.00   3.15   2.76   2.53   2.37   2.25   2.17   2.10   2.04   1.99   1.92   1.84   1.75   1.70   1.65   1.59   1.53   1.47   1.39
 120   3.92   3.07   2.68   2.45   2.29   2.17   2.09   2.02   1.96   1.91   1.83   1.75   1.66   1.61   1.55   1.50   1.43   1.35   1.25
   ∞   3.84   3.00   2.60   2.37   2.21   2.10   2.01   1.94   1.88   1.83   1.75   1.67   1.57   1.52   1.46   1.39   1.32   1.22   1.00

In the book: Numbers in regular font = p < 0.05; Numbers in bold = p < 0.01

§       Because 45 is not in this table but 40 is, round down. Any time the actual df is not listed, always round down to the next listed value, so that the critical value is estimated to be larger (and significance is thus harder to find), not smaller, than it should be.

§       The critical value of F for (4, 40) df is 2.61.

§       By selecting 40 instead of 60 we are being more conservative: F with df (4, 60) = 2.53. In other words, by using 2.61 instead of 2.53 we are using a higher critical value and are less likely to reject the null. If we do in fact reject the null, we can say that we did so even though we were conservative.

§       The table depicted here skips from 40 to 60, so use 40.

§       The table in the back of the book has values in between 40 and 60 (40, 42, 44, …, 60). It has no value for 45, so round down and use 44 as the df.
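The round-down rule can be written as a small lookup. The sketch below is illustrative only: `F_CRIT_DFB4` holds just four entries copied from the dfB = 4 column of the abbreviated table above, and `critical_f` is a hypothetical helper name.

```python
from bisect import bisect_right

# A few entries from the dfB = 4 column of the abbreviated table
# (alpha = .05), keyed by dfW.
F_CRIT_DFB4 = {30: 2.69, 40: 2.61, 60: 2.53, 120: 2.45}

def critical_f(df_within, table=F_CRIT_DFB4):
    """Look up the critical F, rounding df_within DOWN to a listed row."""
    listed = sorted(table)
    pos = bisect_right(listed, df_within)   # rows at or below df_within
    if pos == 0:
        raise ValueError("df_within is below the smallest listed row")
    return table[listed[pos - 1]]

print(critical_f(45))  # 2.61 (uses the row for 40)
print(critical_f(60))  # 2.53 (exact row exists)
```

Rounding down always picks a row with a critical value at least as large as the exact one, which is what makes the rule conservative.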

 

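Examples of ANOVA Calculations with Equal Sample Sizes

1. State Hypotheses:

$$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5 \qquad H_1: \text{not all of the population means are equal}$$

2. Find Calculated F:

     A) Total Sum of Squares = SStotal (optional)

$$SS_{total} = \sum X^2 - \frac{G^2}{N}$$

where G is the grand total (ΣX), the sum of all X scores across groups, and N is the total number of scores.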

 

 

             Count   X²    Rhyme   X²    Adjective   X²    Imagery   X²    Intentional   X²
Words           9    81       7    49        11     121       12    144        10       100
Recalled        8    64       9    81        13     169       11    121        19       361
                6    36       6    36         8      64       16    256        14       196
                8    64       6    36         6      36       11    121         5        25
               10   100       6    36        14     196        9     81        10       100
                4    16      11   121        11     121       22    484        11       121
                6    36       6    36        13     169       12    144        14       196
                5    25       3     9        13     169       10    100        15       225
                7    49       8    64        10     100       19    361        11       121
                7    49       7    49        11     121       11    121        11       121

Mean, ΣX²     7.0   520     6.9   517      11.0    1266     13.3   1933      12.0      1566
Variance     3.33          4.54            6.22            18.23             14.00

Here ΣX² = 520 + 517 + 1266 + 1933 + 1566 = 5802 and G = 70 + 69 + 110 + 133 + 120 = 502, so

SStotal = 5802 - (502)²/50 = 5802 - 5040.08 = 761.92
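The SStotal calculation can be reproduced in a few lines of Python. This sketch uses the computational formula above and, as a cross-check, the definitional form (squared deviations of every score from the grand mean); the variable names are illustrative.

```python
groups = {
    "counting":    [9, 8, 6, 8, 10, 4, 6, 5, 7, 7],
    "rhyming":     [7, 9, 6, 6, 6, 11, 6, 3, 8, 7],
    "adjective":   [11, 13, 8, 6, 14, 11, 13, 13, 10, 11],
    "imagery":     [12, 11, 16, 11, 9, 22, 12, 10, 19, 11],
    "intentional": [10, 19, 14, 5, 10, 11, 14, 15, 11, 11],
}

scores = [x for g in groups.values() for x in g]
N = len(scores)                       # total number of scores
G = sum(scores)                       # grand total
sum_x2 = sum(x * x for x in scores)   # sum of squared scores

# Computational formula: SStotal = sum(X^2) - G^2 / N
ss_total = sum_x2 - G ** 2 / N

# Definitional form, for comparison.
grand_mean = G / N
ss_total_def = sum((x - grand_mean) ** 2 for x in scores)

print(sum_x2, G, round(ss_total, 2))  # 5802 502 761.92
```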

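1. Within-Samples Estimate of Variability:

      B) SSwithin, also called SSerror

$$SS_{within} = \sum X^2 - \sum \frac{T^2}{n}$$

where T = (ΣX) is the sum of the scores within each treatment group and n is the number of scores in that group.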

 


 

SSwithin = 5802 - [(70)²/10 + (69)²/10 + (110)²/10 + (133)²/10 + (120)²/10] = 5802 - 5385 = 417
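A short Python sketch of the SSwithin calculation, using the group totals T and the computational formula, with the definitional form (squared deviations of each score from its own group mean) as a cross-check. Names such as `sum_t2_over_n` are illustrative.

```python
groups = {
    "counting":    [9, 8, 6, 8, 10, 4, 6, 5, 7, 7],
    "rhyming":     [7, 9, 6, 6, 6, 11, 6, 3, 8, 7],
    "adjective":   [11, 13, 8, 6, 14, 11, 13, 13, 10, 11],
    "imagery":     [12, 11, 16, 11, 9, 22, 12, 10, 19, 11],
    "intentional": [10, 19, 14, 5, 10, 11, 14, 15, 11, 11],
}

sum_x2 = sum(x * x for g in groups.values() for x in g)

# T is the total of each treatment group; n is that group's size.
totals = {name: sum(g) for name, g in groups.items()}
sum_t2_over_n = sum(sum(g) ** 2 / len(g) for g in groups.values())

# Computational formula: SSwithin = sum(X^2) - sum(T^2 / n)
ss_within = sum_x2 - sum_t2_over_n

# Definitional form, for comparison.
ss_within_def = sum(
    (x - sum(g) / len(g)) ** 2 for g in groups.values() for x in g
)

print(list(totals.values()))                 # [70, 69, 110, 133, 120]
print(sum_t2_over_n, round(ss_within, 2))    # 5385.0 417.0
```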

 

 

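C) SSbetween

$$SS_{between} = \sum \frac{T^2}{n} - \frac{G^2}{N}$$

where T = (ΣX) is the sum of the scores within each treatment group, n is the number of scores in that group, and G is the grand total.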

 


 

 

SSbetween = [(70)²/10 + (69)²/10 + (110)²/10 + (133)²/10 + (120)²/10] - (502)²/50 = 5385 - 5040.08 = 344.92
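The SSbetween calculation, sketched in Python with the same data. The last lines also check the additivity that makes SStotal optional: SSbetween + SSwithin should equal SStotal. Variable names are illustrative.

```python
groups = {
    "counting":    [9, 8, 6, 8, 10, 4, 6, 5, 7, 7],
    "rhyming":     [7, 9, 6, 6, 6, 11, 6, 3, 8, 7],
    "adjective":   [11, 13, 8, 6, 14, 11, 13, 13, 10, 11],
    "imagery":     [12, 11, 16, 11, 9, 22, 12, 10, 19, 11],
    "intentional": [10, 19, 14, 5, 10, 11, 14, 15, 11, 11],
}

scores = [x for g in groups.values() for x in g]
N, G = len(scores), sum(scores)
sum_x2 = sum(x * x for x in scores)
sum_t2_over_n = sum(sum(g) ** 2 / len(g) for g in groups.values())

# Computational formula: SSbetween = sum(T^2 / n) - G^2 / N
ss_between = sum_t2_over_n - G ** 2 / N

# Additivity check: SStotal = SSbetween + SSwithin
ss_within = sum_x2 - sum_t2_over_n
ss_total = sum_x2 - G ** 2 / N

print(round(ss_between, 2))                          # 344.92
print(round(ss_between + ss_within, 2), round(ss_total, 2))  # 761.92 761.92
```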

 

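Now we need to find the mean squares (MS).

Mean Square:

·       The mean square is just each SS (between or within) divided by its respective df (between or within):

$$MS_{between} = \frac{SS_{between}}{df_{between}} = \frac{SS_{between}}{k - 1} \qquad MS_{within} = \frac{SS_{within}}{df_{within}} = \frac{SS_{within}}{N - k}$$

D) MSwithin = SSwithin/dfwithin = 417/(N - k) = 417/45 = 9.27

where k is the number of samples or groups. Note that this has the same form as the standard formula for a variance: a sum of squared deviations divided by its degrees of freedom. Here the deviations are those of the X scores about their own sample means.

E) MSbetween = SSbetween/dfbetween = 344.92/(k - 1) = 344.92/4 = 86.23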

3) Finally, F is given by the ratio of MSB over MSW

F = 86.23/9.27 = 9.30

 

4) Find F critical - we will use a 5% level of significance.

·      The degrees of freedom for the numerator (df between groups) is df = k-1 = 5-1 = 4.

·      The degrees of freedom for the denominator (df within groups) is df = N-k = 50-5 = 45.

·      Find F critical using the F table given above or the F table in the back of the book.

·      F.05 = 2.58 or 2.61:

·       If using the table given above, F(4, 45) = 2.61 (must use the row for dfW = 40).

·       If using the table in the back of the book, F(4, 45) = 2.58 (using the closest listed df that does not exceed 45).

·      Decision: Because Fcalc = 9.30 > Fcrit, we reject the null hypothesis that the five samples are drawn from populations with the same mean.

·      More specifically, provided the assumptions underlying ANOVA are satisfied, we have grounds for concluding that at least one population mean differs from the others.
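Steps 2 through 4 can be collected into one small Python function. This is a sketch of the hand calculation above, not a general-purpose routine: `one_way_anova` is a hypothetical helper name, and the critical value 2.58 is simply taken from the text rather than computed. Carrying full precision gives F of about 9.31; the 9.30 above comes from dividing the rounded mean squares (86.23/9.27).

```python
def one_way_anova(groups):
    """Return df, SS, MS and F for a one-way ANOVA (computational formulas)."""
    scores = [x for g in groups for x in g]
    N, k = len(scores), len(groups)
    G = sum(scores)
    sum_t2_over_n = sum(sum(g) ** 2 / len(g) for g in groups)
    ss_between = sum_t2_over_n - G ** 2 / N
    ss_within = sum(x * x for x in scores) - sum_t2_over_n
    df_between, df_within = k - 1, N - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "df": (df_between, df_within),
        "ss": (ss_between, ss_within),
        "ms": (ms_between, ms_within),
        "F": ms_between / ms_within,
    }

data = [
    [9, 8, 6, 8, 10, 4, 6, 5, 7, 7],          # counting
    [7, 9, 6, 6, 6, 11, 6, 3, 8, 7],          # rhyming
    [11, 13, 8, 6, 14, 11, 13, 13, 10, 11],   # adjective
    [12, 11, 16, 11, 9, 22, 12, 10, 19, 11],  # imagery
    [10, 19, 14, 5, 10, 11, 14, 15, 11, 11],  # intentional
]

result = one_way_anova(data)
f_crit = 2.58  # critical F for (4, 45) df at alpha = .05, taken from the text
reject_h0 = result["F"] > f_crit

print(result["df"], round(result["F"], 2), reject_h0)  # (4, 45) 9.31 True
```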

 

 

The ANOVA Source Table

The results from the above example can be summarized in an ANOVA table as follows:

Source              df       SS        MS       F
Between Samples      4     344.92     86.23    9.30
Within Samples      45     417.00      9.27
Total               49     761.92

 

 

Relationship between F and t

Imagine the following research and that the following data were obtained: