Bayesian Reasoning
posted March 23, 2005
No reading assignment
Representativeness Heuristic
-
Descriptive Theory
-
Probability judgments based on similarity or representativeness
-
If Linda is similar to a feminist bank teller, then Linda is likely to be
a feminist bank teller
-
If A is similar to B, then A is likely to be related to B or to be caused
by B.
Resulting Biases
-
Conjunction fallacy
-
Base rate neglect
-
Insensitivity to sample size
-
Misconceptions of chance
-
Insensitivity to predictability
-
Illusion of validity
-
Misconceptions of regression
Conjunction Fallacy
-
Linda’s description is more ________________ of a feminist bank teller
than of a bank teller.
-
Conjunction rule and probability theory are irrelevant to representativeness
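The conjunction rule that representativeness ignores can be checked by simple counting. The sketch below uses a made-up population; every count in it is an illustrative assumption, not data from the lecture.

```python
# Hypothetical population of 100 women who fit Linda's description.
# All counts are illustrative assumptions, not data from the lecture.
population = 100
bank_tellers = 5            # assumed number who are bank tellers
feminist_bank_tellers = 4   # assumed number of those who are also feminists

p_teller = bank_tellers / population
p_feminist_teller = feminist_bank_tellers / population

# Conjunction rule: p(A and B) can never exceed p(A), because every
# feminist bank teller is also counted among the bank tellers.
assert p_feminist_teller <= p_teller
print(p_teller, p_feminist_teller)
```

However the assumed counts are chosen, the subset can never be larger than the set that contains it.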
Insensitivity to Sample Size
-
Law of small numbers
-
Because a sample is drawn from a population, it should be ____________
of the population
-
______________ does not influence representativeness.
Misperceptions of __________________
-
A fair coin is tossed 6 times. Which sequence is more likely:
-
Both are equally likely (____________)
-
But the first seems more random
-
A __________ string of coin tosses is a sample of a larger population,
so should be representative.
-
The larger population is a random sequence and will look random
-
Thus, the short string should look random
-
– E.g., 50% heads, no detectable pattern, mix of alternations and runs
-
___________ fallacy: expect random sequences to be self-correcting.
-
– If the last 5 tosses were heads, the coin is “due” for a tails.
-
___________ effect: If you see a long streak, you assume the process can’t
be random
-
– If a basketball player makes 5 baskets in a row, it can’t be random chance.
It must be that he is “___________”
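The claims about coin tosses can be checked by enumerating every possible outcome. This is a minimal sketch: the two example sequences are assumptions chosen for illustration (one that looks random, one that does not), not necessarily the ones shown in lecture.

```python
from fractions import Fraction
from itertools import product

# Every possible sequence of 6 tosses of a fair coin.
sequences = list(product("HT", repeat=6))

def sequence_probability(seq):
    """Probability of one specific sequence of fair, independent tosses."""
    return Fraction(1, 2) ** len(seq)

# Illustrative sequences (assumed, not taken from the lecture).
mixed = "HTTHTH"    # looks random
streak = "HHHHHH"   # looks non-random

print(len(sequences))                # number of distinct 6-toss sequences
print(sequence_probability(mixed))   # same value as for the streak
print(sequence_probability(streak))

# Gambler's fallacy check: among sequences that begin with 5 heads,
# how often is the 6th toss tails?
after_five_heads = [s for s in sequences if s[:5] == tuple("HHHHH")]
tails_count = sum(1 for s in after_five_heads if s[5] == "T")
p_tails_next = Fraction(tails_count, len(after_five_heads))
print(p_tails_next)
```

Every specific sequence has the same probability, and the sixth toss is unaffected by the first five, so the coin is never "due".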
Insensitivity to ___________________
-
Example: SAT predicts college GPA, but imperfectly.
-
Chris has a 1500 SAT. What is her college GPA?
-
A 1500 SAT is representative of a very high GPA, so might predict, say,
3.9 GPA.
-
But, SAT is only moderately correlated with GPA, so prediction should be
more _______ (closer to __________).
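How far the prediction should be pulled back depends on the correlation. The sketch below uses the standard linear-regression prediction. The means, standard deviations, and correlation are hypothetical values chosen for illustration; only the 1500 SAT score comes from the example above.

```python
def regressive_prediction(x, mean_x, sd_x, mean_y, sd_y, r):
    """Predict y from x with a correlation of r between them."""
    z_x = (x - mean_x) / sd_x          # how extreme x is, in SD units
    return mean_y + r * z_x * sd_y     # shrink that extremity by r

# Hypothetical population values (assumptions, not from the lecture).
SAT_MEAN, SAT_SD = 1000, 200
GPA_MEAN, GPA_SD = 3.0, 0.4

# Treating SAT as a perfect predictor (r = 1.0), as representativeness does.
naive = regressive_prediction(1500, SAT_MEAN, SAT_SD, GPA_MEAN, GPA_SD, r=1.0)

# Treating SAT as a moderate predictor (assumed r = 0.4).
moderate = regressive_prediction(1500, SAT_MEAN, SAT_SD, GPA_MEAN, GPA_SD, r=0.4)

print(round(naive, 2), round(moderate, 2))
```

With these assumed numbers, the moderate-correlation prediction lands well below the one that matches the extremity of the SAT score. With r = 0, the prediction would be the average GPA regardless of SAT.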
Misconceptions of Regression
-
Failure to understand ___________________________
-
Think that any one observation should be representative of the population
and therefore similar to the next observation.
-
In actual fact, an extreme observation is likely to be followed by a less
extreme observation.
______________ Neglect
-
A cab company was involved in a hit and run accident at night. Two
cab companies, the Green and the Blue, operate in the city. You are
given the following data:
-
85% of the cabs in the city are Green, 15% are Blue
-
A witness identified the cab as Blue. The court tested the reliability
of the witness under the same circumstances that existed on the night of
the accident and concluded that the witness correctly identified each one
of the two colors 80% of the time and failed 20% of the time (e.g., p(say
blue/really blue) = 0.80).
-
What is the probability that the cab involved in the accident was Blue
rather than Green?
Normative Solution
|         | Blue | Green | Total |
| "blue"  |      |       |       |
| "green" |      |       |       |
| Total   |      |       | 100   |
p(Blue/"blue") = _________________________
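The table can also be filled in programmatically. This sketch imagines 100 cabs, applies the witness's accuracy to each color, and reads the answer off the "blue" row. The variable names are illustrative.

```python
from fractions import Fraction

total_cabs = 100
blue_cabs = 15                   # 15% of cabs are Blue
green_cabs = total_cabs - blue_cabs
accuracy = Fraction(80, 100)     # witness is right 80% of the time

# Row: witness says "blue"
says_blue_is_blue = blue_cabs * accuracy           # correct identifications
says_blue_is_green = green_cabs * (1 - accuracy)   # Green cabs mistaken for Blue
says_blue_total = says_blue_is_blue + says_blue_is_green

# Row: witness says "green"
says_green_is_blue = blue_cabs * (1 - accuracy)
says_green_is_green = green_cabs * accuracy

# p(Blue/"blue"): of all the times the witness says "blue",
# how often is the cab really Blue?
p_blue_given_says_blue = says_blue_is_blue / says_blue_total
print(p_blue_given_says_blue, float(p_blue_given_says_blue))
```

Exact fractions are used so the cell counts come out as whole numbers rather than floating-point approximations.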
Base Rate Neglect
-
Most people give estimates higher than ________%.
-
Accident caused by blue cab is representative of the witness testimony.
-
Neglects the low base rate of blue cabs.
Mammography Problem
-
The probability of breast cancer is 1% for a woman at age 40. If
a woman has breast cancer, the probability is 80% that she will have a
positive mammogram. If a woman does not have breast cancer, the probability
is 9.6% that she will also have a positive mammogram.
-
A 40-year-old woman has a positive mammogram. What is the probability
that she has breast cancer?
Bayes’ Theorem
$$p(H/E) = \frac{p(H)\,p(E/H)}{p(H)\,p(E/H) + p(\sim H)\,p(E/\sim H)}$$
-
Prior probability
-
Hit rate (aka sensitivity)
-
False alarm rate (1 – specificity)
-
Posterior probability
Mammography Problem
-
Prior probability: p(ca) = ______
-
True positive rate: p(+mam/ca) = _______
-
False positive rate: p(+mam/~ca) = 0.096
Bayes’ Theorem
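$$p(\text{ca}/\text{+mam}) = \frac{p(\text{ca})\,p(\text{+mam}/\text{ca})}{p(\text{ca})\,p(\text{+mam}/\text{ca}) + p(\sim\text{ca})\,p(\text{+mam}/\sim\text{ca})}$$
$$= \frac{(0.01)(0.80)}{(0.01)(0.80) + (0.99)(0.096)} \approx 0.078 = 7.8\%$$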
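The same calculation as a short Python sketch. The function name `bayes_posterior` and its parameter names are illustrative choices, not notation from the lecture.

```python
def bayes_posterior(prior, hit_rate, false_alarm_rate):
    """Return p(H/E) from p(H), p(E/H), and p(E/~H)."""
    numerator = prior * hit_rate
    denominator = numerator + (1 - prior) * false_alarm_rate
    return numerator / denominator

p_cancer_given_positive = bayes_posterior(
    prior=0.01,              # p(ca)
    hit_rate=0.80,           # p(+mam/ca)
    false_alarm_rate=0.096,  # p(+mam/~ca)
)
print(f"{p_cancer_given_positive:.4f}")  # 0.0776
```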
Bayes’ Theorem
$$p(H/E) = \frac{p(H)\,p(E/H)}{p(H)\,p(E/H) + p(\sim H)\,p(E/\sim H)}$$
-
Prior probability
-
Hit rate (aka sensitivity)
-
False alarm rate (1 – specificity)
How does each one affect the posterior probability?
-
What if p(H) is much higher (or lower)?
-
What if p(H) = 1.0?
-
What if p(E/~H) = 0.0?
-
Impossible to have gotten this evidence if hypothesis were false
-
– E.g., women without breast cancer never get a positive mammogram – no
false positive.
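One way to explore these questions is to vary one input at a time and watch the posterior. In this sketch the hit rate and false alarm rate come from the mammography problem, while the list of alternative priors is an arbitrary assumption chosen for illustration.

```python
def bayes_posterior(prior, hit_rate, false_alarm_rate):
    """Return p(H/E) from p(H), p(E/H), and p(E/~H)."""
    numerator = prior * hit_rate
    return numerator / (numerator + (1 - prior) * false_alarm_rate)

HIT_RATE = 0.80
FALSE_ALARM_RATE = 0.096

# Vary the prior while holding the test characteristics fixed.
priors = [0.001, 0.01, 0.10, 0.50]   # assumed values for illustration
posteriors = [bayes_posterior(p, HIT_RATE, FALSE_ALARM_RATE) for p in priors]
for p, post in zip(priors, posteriors):
    print(f"p(H) = {p:<5}  ->  p(H/E) = {post:.3f}")

# Boundary cases raised above.
certain_prior = bayes_posterior(1.0, HIT_RATE, FALSE_ALARM_RATE)
no_false_alarms = bayes_posterior(0.01, HIT_RATE, 0.0)
print(certain_prior, no_false_alarms)
```

The posterior rises with the prior. It also reaches 1.0 when the hypothesis was already certain, or when the evidence could not have occurred if the hypothesis were false.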
Frequency Version
-
10 of 1000 women at age 40 have breast cancer. 8 of these 10 women
with breast cancer will get a positive mammogram. Of the 990 women
without breast cancer, 95 will also get a positive mammogram.
-
Of women who get positive mammograms, how many have breast cancer?
Easier Solution
|             | cancer | no cancer | Total |
| + mammogram |        |           |       |
| - mammogram |        |           |       |
| Total       |        |           |       |
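With frequencies, the calculation reduces to counting within the "+ mammogram" row. A minimal sketch using the counts from the frequency version (variable names are illustrative):

```python
# Counts taken from the frequency version of the problem.
women = 1000
with_cancer = 10
without_cancer = women - with_cancer          # 990

positive_with_cancer = 8
positive_without_cancer = 95

# Everyone in the "+ mammogram" row.
total_positive = positive_with_cancer + positive_without_cancer

# Of women with a positive mammogram, the share who have cancer.
share_with_cancer = positive_with_cancer / total_positive
print(f"{positive_with_cancer} of {total_positive} = {share_with_cancer:.3f}")
```

No multiplication by base rates is needed here, because the counts already carry the base rate with them.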
Why is the frequency version easier?
-
Gigerenzer argues that the cognitive system is set up to expect certain inputs
-
Frequencies are more ______________________ valid
-
A critic could argue that the frequency version is easier because __________________________________
-
In a way, this is Gigerenzer’s point: the cognitive system has the ___________________
concept, but is not good at doing the math, given the resources available.
-
Therefore, we develop heuristics that deal with frequencies, and store information
that way.