COMPARING FREQUENCIES USING CHI SQUARE
With chi square, we are looking at frequency counts, not scores as we did in previous chapters.
· Chi square is the statistic that measures the discrepancy between the observed values and the expected values in a contingency table.
· Used for measuring frequencies; thus your data may be NOMINAL in nature.
· Nonparametric statistic, also used for data that violate the assumption of a normal distribution. It is thus a "distribution free" test: no assumptions are made about population parameters.
Ex. The following requires frequency data:
Toss a coin 100 times.
The observed frequencies are the counts you actually obtained from the 100 tosses:
H/57, T/43
In contrast to observed frequencies, expected or theoretical frequencies are what you think you will obtain when you conduct the experiment.
They are the numbers we would observe on average if the null hypothesis is true.
Ho: The coin is fair.
P(h) = .5
P(h) = P(t)
The proportion of heads = the proportion of tails
H1: The coin is not fair.
P(h) not = .5
P(h) not = P(t)
The theoretical or expected frequencies if the coin is in fact fair:
H/50, T/50
These expected frequencies are based on 100 tosses.
If the observed frequencies are not far from the expected frequencies, we have insufficient evidence of an unfair coin and may need more information.
The test statistic for this problem:
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
O = Observed Frequency
E = Expected Frequency
df = K - 1

1. Take the observed frequency for each category and subtract the expected frequency.
2. Square this difference.
3. Divide by the expected frequency.
4. Add all the numbers up.
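The four steps above can be sketched in Python. This is only an illustration and not part of the original notes: the function name `chi_square_statistic` is made up for the example, and the data are the coin tosses from earlier (57 heads and 43 tails observed, 50 of each expected).

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    total = 0.0
    for o, e in zip(observed, expected):
        difference = o - e            # step 1: observed minus expected
        squared = difference ** 2     # step 2: square the difference
        total += squared / e          # steps 3 and 4: divide by expected, then add up
    return total


# Coin example: 57 heads and 43 tails observed, 50 and 50 expected.
coin_chi_square = chi_square_statistic([57, 43], [50, 50])
print(round(coin_chi_square, 2))  # 1.96
```

Each category contributes 49/50 = 0.98, so the two categories together give 1.96.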
Decision Rule:
· alpha = .05
· df = # of categories - 1 = "K - 1", where K is equal to the number of categories.
· Use the table to find the CRITICAL CHI-SQUARE VALUES.
· Similar to the F test, chi square is treated as a one-tailed, right-tailed test.
· The chi square value is NEVER negative.
For df = 1 and alpha = .05, the critical value is 3.84.
| df | P = 0.05 | P = 0.01 | P = 0.001 |
| 1 | 3.84 | 6.64 | 10.83 |
| 2 | 5.99 | 9.21 | 13.82 |
| 3 | 7.82 | 11.35 | 16.27 |
| 4 | 9.49 | 13.28 | 18.47 |
| 5 | 11.07 | 15.09 | 20.52 |
| 6 | 12.59 | 16.81 | 22.46 |
| 7 | 14.07 | 18.48 | 24.32 |
| 8 | 15.51 | 20.09 | 26.13 |
| 9 | 16.92 | 21.67 | 27.88 |
| 10 | 18.31 | 23.21 | 29.59 |
| 11 | 19.68 | 24.73 | 31.26 |
| 12 | 21.03 | 26.22 | 32.91 |
| 13 | 22.36 | 27.69 | 34.53 |
| 14 | 23.69 | 29.14 | 36.12 |
| 15 | 25.00 | 30.58 | 37.70 |
| 16 | 26.30 | 32.00 | 39.25 |
| 17 | 27.59 | 33.41 | 40.79 |
| 18 | 28.87 | 34.81 | 42.31 |
| 19 | 30.14 | 36.19 | 43.82 |
| 20 | 31.41 | 37.57 | 45.32 |
| 21 | 32.67 | 38.93 | 46.80 |
| 22 | 33.92 | 40.29 | 48.27 |
| 23 | 35.17 | 41.64 | 49.73 |
| 24 | 36.42 | 42.98 | 51.18 |
| 25 | 37.65 | 44.31 | 52.62 |
| 26 | 38.89 | 45.64 | 54.05 |
| 27 | 40.11 | 46.96 | 55.48 |
| 28 | 41.34 | 48.28 | 56.89 |
| 29 | 42.56 | 49.59 | 58.30 |
| 30 | 43.77 | 50.89 | 59.70 |
| 31 | 44.99 | 52.19 | 61.10 |
| 32 | 46.19 | 53.49 | 62.49 |
| 33 | 47.40 | 54.78 | 63.87 |
| 34 | 48.60 | 56.06 | 65.25 |
| 35 | 49.80 | 57.34 | 66.62 |
| 36 | 51.00 | 58.62 | 67.99 |
| 37 | 52.19 | 59.89 | 69.35 |
| 38 | 53.38 | 61.16 | 70.71 |
| 39 | 54.57 | 62.43 | 72.06 |
| 40 | 55.76 | 63.69 | 73.41 |
| 41 | 56.94 | 64.95 | 74.75 |
| 42 | 58.12 | 66.21 | 76.09 |
| 43 | 59.30 | 67.46 | 77.42 |
| 44 | 60.48 | 68.71 | 78.75 |
| 45 | 61.66 | 69.96 | 80.08 |
| 46 | 62.83 | 71.20 | 81.40 |
| 47 | 64.00 | 72.44 | 82.72 |
| 48 | 65.17 | 73.68 | 84.03 |
| 49 | 66.34 | 74.92 | 85.35 |
| 50 | 67.51 | 76.15 | 86.66 |
| 51 | 68.67 | 77.39 | 87.97 |
| 52 | 69.83 | 78.62 | 89.27 |
| 53 | 70.99 | 79.84 | 90.57 |
| 54 | 72.15 | 81.07 | 91.88 |
| 55 | 73.31 | 82.29 | 93.17 |
| 56 | 74.47 | 83.52 | 94.47 |
| 57 | 75.62 | 84.73 | 95.75 |
| 58 | 76.78 | 85.95 | 97.03 |
| 59 | 77.93 | 87.17 | 98.34 |
| 60 | 79.08 | 88.38 | 99.62 |
| 61 | 80.23 | 89.59 | 100.88 |
| 62 | 81.38 | 90.80 | 102.15 |
| 63 | 82.53 | 92.01 | 103.46 |
| 64 | 83.68 | 93.22 | 104.72 |
| 65 | 84.82 | 94.42 | 105.97 |
| 66 | 85.97 | 95.63 | 107.26 |
| 67 | 87.11 | 96.83 | 108.54 |
| 68 | 88.25 | 98.03 | 109.79 |
| 69 | 89.39 | 99.23 | 111.06 |
| 70 | 90.53 | 100.42 | 112.31 |
| 71 | 91.67 | 101.62 | 113.56 |
| 72 | 92.81 | 102.82 | 114.84 |
| 73 | 93.95 | 104.01 | 116.08 |
| 74 | 95.08 | 105.20 | 117.35 |
| 75 | 96.22 | 106.39 | 118.60 |
| 76 | 97.35 | 107.58 | 119.85 |
| 77 | 98.49 | 108.77 | 121.11 |
| 78 | 99.62 | 109.96 | 122.36 |
| 79 | 100.75 | 111.15 | 123.60 |
| 80 | 101.88 | 112.33 | 124.84 |
| 81 | 103.01 | 113.51 | 126.09 |
| 82 | 104.14 | 114.70 | 127.33 |
| 83 | 105.27 | 115.88 | 128.57 |
| 84 | 106.40 | 117.06 | 129.80 |
| 85 | 107.52 | 118.24 | 131.04 |
| 86 | 108.65 | 119.41 | 132.28 |
| 87 | 109.77 | 120.59 | 133.51 |
| 88 | 110.90 | 121.77 | 134.74 |
| 89 | 112.02 | 122.94 | 135.96 |
| 90 | 113.15 | 124.12 | 137.19 |
| 91 | 114.27 | 125.29 | 138.45 |
| 92 | 115.39 | 126.46 | 139.66 |
| 93 | 116.51 | 127.63 | 140.90 |
| 94 | 117.63 | 128.80 | 142.12 |
| 95 | 118.75 | 129.97 | 143.32 |
| 96 | 119.87 | 131.14 | 144.55 |
| 97 | 120.99 | 132.31 | 145.78 |
| 98 | 122.11 | 133.47 | 146.99 |
| 99 | 123.23 | 134.64 | 148.21 |
| 100 | 124.34 | 135.81 | 149.48 |
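The first two rows of the table can be checked without statistical software, using two standard facts: with df = 1, a chi square variable is the square of a standard normal variable, and with df = 2, the right-tail probability beyond x is exp(-x/2). The sketch below uses only the Python standard library; the function names are made up for this illustration, and rows with larger df would need a statistics package, which is not shown here.

```python
from math import log
from statistics import NormalDist


def critical_value_df1(alpha):
    """Critical value for df = 1: square of the two-tailed normal cutoff."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z ** 2


def critical_value_df2(alpha):
    """Critical value for df = 2: solve exp(-x / 2) = alpha for x."""
    return -2 * log(alpha)


for alpha in (0.05, 0.01, 0.001):
    print(alpha, round(critical_value_df1(alpha), 2), round(critical_value_df2(alpha), 2))
```

The printed values should agree with the df = 1 and df = 2 rows of the table to within about 0.01; small differences come from rounding.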
So the decision rule is to reject Ho if the chi square test statistic is greater than 3.84; otherwise do not reject Ho.
Decision: Since 1.96 is less than 3.84, do not reject Ho.
Conclusion: There is insufficient evidence of an unfair coin.
Chi square is called a "Goodness of Fit" test because we want to see if observed frequencies fit the theoretical frequencies.
Ex. Given N = 200 consumers:
| Theoretical frequencies (percentages) | Observed frequencies |
| A = 38% | A = 80 |
| B = 27% | B = 50 |
| C = 35% | C = 70 |
| Total = 100% | Total = 200 |
Question: Do the observed data fit the theoretical data?
IF YOU ARE GIVEN PERCENTAGES, THE PERCENTAGES MUST BE CONVERTED TO FREQUENCIES BY MULTIPLYING EACH PERCENTAGE BY THE TOTAL NUMBER OF CONSUMERS.
EXPECTED FREQUENCY = (EXPECTED PROPORTION)(N)
Expected (theoretical) frequencies:
A = 38% of 200 = 76
B = 27% of 200 = 54
C = 35% of 200 = 70
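The conversion from percentages to expected frequencies can be sketched as follows. The function name `expected_frequencies` is an assumption made for this example; percentages are passed as whole numbers so the arithmetic stays exact.

```python
def expected_frequencies(percentages, n):
    """Convert expected percentages to expected counts for a sample of size n."""
    return [pct * n / 100 for pct in percentages]


# Categories A, B, C with 38%, 27%, 35% of N = 200 consumers.
expected = expected_frequencies([38, 27, 35], 200)
print(expected)  # [76.0, 54.0, 70.0]
```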
Ho: Proportion of A = .38
Ho: Proportion of B = .27
Ho: Proportion of C = .35
H1: Proportion of A does not = .38
H1: Proportion of B does not = .27
H1: Proportion of C does not = .35
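Chi square value:
$$\chi^2 = \frac{(80 - 76)^2}{76} + \frac{(50 - 54)^2}{54} + \frac{(70 - 70)^2}{70} = 0.2105 + 0.2963 + 0 = 0.5068$$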

Decision rule:
· alpha = .05
· df = K - 1 = 3 - 1 = 2 (there were 3 categories)
· Using the chi square table, find the critical value = 5.99
· Reject Ho if the chi square test statistic > 5.99; otherwise do not reject Ho.
Decision: Since 0.5068 < 5.99, do not reject Ho.
Conclusion: There is insufficient evidence of a lack of fit; there is not enough evidence to refute the researcher's proportional claims.
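The whole goodness of fit test for the consumer example can be put together in a few lines. This is a sketch under stated assumptions: `goodness_of_fit` is a hypothetical helper, and the critical value 5.99 is simply copied from the table for df = 2 at alpha = .05 rather than computed.

```python
def goodness_of_fit(observed, expected_percentages, critical_value):
    """Return the chi square statistic and whether Ho is rejected."""
    n = sum(observed)
    expected = [pct * n / 100 for pct in expected_percentages]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, statistic > critical_value


# Consumer example: df = 3 - 1 = 2, so the critical value at alpha = .05 is 5.99.
statistic, reject_ho = goodness_of_fit([80, 50, 70], [38, 27, 35], 5.99)
print(round(statistic, 4), reject_ho)  # 0.5068 False
```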
2 way Chi Square
With variables that are categorical, we need to use chi square to determine if they are related.
Ex. Is gender related to political affiliation?
| Gender | Political |
| Male | D |
| Female | R |
These are categorical data.
You must use chi square for a Contingency Table.
Data Table:
| People | Gender | Political |
| 1 | M | D |
| 2 | F | D |
| 3 | F | D |
| 4 | M | R |
| 5 | M | R |
| 6 | M | D |
| 7 | F | D |
| 8 | F | R |
| 9 | F | D |
| 10 | F | R |
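Turning the raw data table into frequency counts is a simple tally of each (gender, political) pair. A minimal sketch using Python's `collections.Counter`; the variable names are chosen for this illustration only.

```python
from collections import Counter

# (gender, political) for people 1 through 10, copied from the data table.
records = [
    ("M", "D"), ("F", "D"), ("F", "D"), ("M", "R"), ("M", "R"),
    ("M", "D"), ("F", "D"), ("F", "R"), ("F", "D"), ("F", "R"),
]

counts = Counter(records)  # frequency of each (gender, political) pair

for gender in ("M", "F"):
    row = [counts[(gender, party)] for party in ("D", "R")]
    print(gender, row, "row total =", sum(row))
```

The tally gives 2 Democrats and 2 Republicans among the males, and 4 Democrats and 2 Republicans among the females.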
| Gender | Demo | Repub | Total |
| Male | 2 | 2 | 4 |
| Female | 4 | 2 | 6 |
| Total | 6 | 4 | 10 |
4 fold Contingency Table
Expected frequencies in a contingency table are computed using observed frequency data.
Expected frequency = (Column Total)(Row Total) / (Overall Total)
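The expected frequency formula can be applied to every cell at once. The sketch below assumes the observed table is stored as a list of rows; `expected_table` is a hypothetical name, and the data are the gender by political affiliation counts from the contingency table.

```python
def expected_table(observed):
    """Expected counts: (row total)(column total) / (overall total) per cell."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in column_totals] for r in row_totals]


# Rows are Male, Female; columns are Demo, Repub.
expected = expected_table([[2, 2], [4, 2]])
print(expected)  # [[2.4, 1.6], [3.6, 2.4]]
```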
OBSERVED VALUES
| Gender / Political | Demo | Repub | Total |
| Male | 2 | 2 | Row1 = 4 |
| Female | 4 | 2 | Row2 = 6 |
| Total | Column1 = 6 | Column2 = 4 | N = 10 |
EXPECTED VALUES
| Gender / Political | Demo | Repub | Total |
| Male | C1xR1/N = 2.4 | C2xR1/N = 1.6 | Row1 = 4 |
| Female | C1xR2/N = 3.6 | C2xR2/N = 2.4 | Row2 = 6 |
| Total | Column1 = 6 | Column2 = 4 | N = 10 |
In the expected frequency table, the cells represent the counts one would expect if the two categorical variables were totally unrelated.
If the observed frequencies fit the expected frequencies, the chi square value is small and we have no evidence that the variables are related; the data are consistent with the variables being independent of one another.
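Applying the chi square formula, the sum of (O - E)^2 / E, to the four cells of the observed and expected tables gives the test statistic for this example. The arithmetic is written out here as a worked check, with each term rounded to four decimals:

$$\chi^2 = \frac{(2 - 2.4)^2}{2.4} + \frac{(2 - 1.6)^2}{1.6} + \frac{(4 - 3.6)^2}{3.6} + \frac{(2 - 2.4)^2}{2.4} = 0.0667 + 0.1000 + 0.0444 + 0.0667 \approx 0.2778$$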

Decision rule at alpha = .05:
· df = (# rows - 1)(# columns - 1) = (2 - 1)(2 - 1) = 1
· Use the table for the chi square critical value = 3.84
· Reject Ho if the chi square test statistic > 3.84; otherwise do not reject Ho.
Decision: Since 0.2778 < 3.84, do not reject Ho.
Conclusion: There is insufficient evidence that gender and politics are related.
Ex. Are therapy and improvement related?
Ho: Therapy and improvement are not related (independent).
H1: Therapy and improvement are related (dependent).
Observed data:
OBSERVED VALUES
| Type / Improvement | YES | NO | Total |
| Therapy | 75 | 25 | R1 = 100 |
| Placebo | 58 | 42 | R2 = 100 |
| Total | C1 = 133 | C2 = 67 | N = 200 |
Expected data:
EXPECTED VALUES
| Type / Improvement | YES | NO | Total |
| Therapy | C1xR1/N = 66.5 | C2xR1/N = 33.5 | R1 = 100 |
| Placebo | C1xR2/N = 66.5 | C2xR2/N = 33.5 | R2 = 100 |
| Total | C1 = 133 | C2 = 67 | N = 200 |
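As a worked check, the chi square statistic for the therapy example is the sum of (O - E)^2 / E over the four cells of the observed and expected tables, with each term rounded to four decimals:

$$\chi^2 = \frac{(75 - 66.5)^2}{66.5} + \frac{(25 - 33.5)^2}{33.5} + \frac{(58 - 66.5)^2}{66.5} + \frac{(42 - 33.5)^2}{33.5} = 1.0865 + 2.1567 + 1.0865 + 2.1567 \approx 6.49$$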

Decision rule:
· alpha = .05
· df = (# rows - 1)(# columns - 1) = 1
· From the table, critical value = 3.84
· Reject Ho if the chi square statistic is greater than 3.84; otherwise do not reject Ho.
Decision: Reject Ho since 6.49 > 3.84.
Conclusion: There is evidence that therapy is related to improvement.
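Both contingency table examples follow the same recipe: compute expected counts from the row and column totals, sum (O - E)^2 / E over the cells, and compare with the critical value. A minimal sketch, assuming the observed table is given as a list of rows; `chi_square_independence` is a hypothetical helper, and the critical value 3.84 is copied from the table for df = 1 at alpha = .05.

```python
def chi_square_independence(observed):
    """Return (statistic, df) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * column_totals[j] / n  # expected count for this cell
            statistic += (o - e) ** 2 / e
    df = (len(row_totals) - 1) * (len(column_totals) - 1)
    return statistic, df


CRITICAL_VALUE_DF1 = 3.84  # alpha = .05, df = 1, from the table

for name, table in [("gender/politics", [[2, 2], [4, 2]]),
                    ("therapy/improvement", [[75, 25], [58, 42]])]:
    statistic, df = chi_square_independence(table)
    print(name, round(statistic, 4), df, statistic > CRITICAL_VALUE_DF1)
```

For the two tables in these notes, this gives about 0.2778 for gender and politics (do not reject Ho) and about 6.49 for therapy and improvement (reject Ho), matching the decisions above.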