Reasoning about Uncertainty
6.8 When Conditioning Is Appropriate
Why is it that in the system
In all of these cases, roughly speaking, there is a naive space and a sophisticated space. For example, in both
The naive space is typically smaller and easier to work with than the sophisticated space. Indeed, it is not always obvious what the sophisticated space should be. For example, in the second-ace puzzle, the story does not say what protocol Alice is using, so it does not determine a unique sophisticated space. On the other hand, as these examples show, working in the naive space can often give incorrect answers. Thus, it is important to understand when it is "safe" to condition in the naive space.
Consider the systems
Fix a system
For the remainder of this section, assume that the observations oi are subsets of W and that they are accurate, in that if agent 1 observes U at the point (r, m), then σ(r) ∈ U. Thus, at every round of every run of
This is exactly the setup in the Listener-Teller example (where the sets U have the form W −{w} for some w ∈ W ). It also applies to Example 3.1.2, the second-ace puzzle, the Monty Hall puzzle, and the three-prisoners puzzle from Example 3.3.1. For example, in the Monty Hall puzzle, if W = {w1, w2, w3}, where wi is the world where the car is behind door i, then when Monty opens door 3, the agent essentially observes {w1, w2} (i.e., the car is behind either door 1 or door 2).
Let (
In Section 3.1, I discussed three conditions for conditioning to be appropriate. The assumptions I have made guarantee that the first two hold: agent 1 does not forget and what agent 1 learns/observes is true. That leaves the third condition, that the agent learns nothing from what she observes beyond the fact that it is true. To make this precise, it is helpful to have some notation. Given a local state ℓ = 〈U1,…, Um〉 and U ⊆ W, let
Intuitively, (6.1) says that in local state ℓ, observing U (which results in local state ℓ · U, that is, ℓ with U appended) has the same effect as discovering that U is true, at least as far as the probabilities of subsets of W are concerned.
The following theorem makes precise that (6.1) is exactly what is needed for conditioning to be appropriate:
Theorem 6.8.1
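Suppose that r1(m + 1) = r1(m) · U for a run r of the system. The following conditions are equivalent:
(a) if μ^W_{r,m,1}(U) > 0, then μ^W_{r,m+1,1} = μ^W_{r,m,1}(· | U);
(b) if μ([r1(m) · U]) > 0, then μ([V] | [r1(m) · U]) = μ([V] | [r1(m)] ∩ [U]) for all V ⊆ W;
(c) for all w1, w2 ∈ U, if μ([r1(m)] ∩ [wi]) > 0 for i = 1, 2, then μ([r1(m) · U] | [r1(m)] ∩ [w1]) = μ([r1(m) · U] | [r1(m)] ∩ [w2]);
(d) for all w ∈ U such that μ([r1(m)] ∩ [w]) > 0, μ([r1(m) · U] | [r1(m)] ∩ [w]) = μ([r1(m) · U] | [r1(m)] ∩ [U]);
(e) the event [w] is independent of the event [r1(m) · U], given [r1(m)] ∩ [U].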
Proof: See Exercise 6.13.
Part (a) of Theorem 6.8.1 says that conditioning in the naive space agrees with conditioning in the sophisticated space. Part (b) is just (6.1). Part (c) makes precise the statement that the probability of learning/observing U is the same at all worlds compatible with U that the agent considers possible. The condition that the worlds be compatible with U is enforced by requiring that w1, w2 ∈ U; the condition that the agent consider these worlds possible is enforced by requiring that μ(
Part (c) of Theorem 6.8.1 gives a relatively straightforward way of checking whether conditioning is appropriate. Notice that in
Theorem 6.8.1 explains why naive conditioning does not work in the second-ace puzzle and the Monty Hall puzzle. In the second-ace puzzle, if Alice tosses a coin to decide what to say if she has both aces, then she is not equally likely to say "I have the ace of spades" at all the worlds that Bob considers possible at time 1 where she in fact has the ace of spades. She is twice as likely to say it if she has the ace of spades and one of the twos as she is if she has both aces. Similarly, if Monty chooses which door to open with equal likelihood when the car is behind door 1, then he is not equally likely to open door 2 in all cases where the car is not behind door 2. He is twice as likely to open door 2 if the car is behind door 3 as he is if the car is behind door 1.
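The check described in part (c) can be carried out mechanically for both puzzles. The following Python sketch is only an illustration under stated assumptions, not part of the formal development: it models a single round of observation, assumes a four-card deck (two aces and two twos) from which Alice holds two cards, and assumes that the agent initially picked door 1. The names `posteriors` and `satisfies_part_c` are hypothetical helpers introduced here.

```python
from fractions import Fraction
from itertools import combinations


def posteriors(prior, likelihood, observed):
    """Return (naive, sophisticated) posteriors on the worlds in `observed`.

    prior[w]      -- probability of world w before the observation
    likelihood[w] -- probability that the protocol yields the observation at w
    observed      -- the set U of worlds compatible with the observation
    """
    # Naive space: condition the prior on U, ignoring the protocol.
    mass = sum(prior[w] for w in observed)
    naive = {w: prior[w] / mass for w in observed}
    # Sophisticated space: Bayes' rule with the protocol's likelihoods.
    joint = {w: prior[w] * likelihood[w] for w in observed}
    total = sum(joint.values())
    sophisticated = {w: joint[w] / total for w in observed}
    return naive, sophisticated


def satisfies_part_c(prior, likelihood, observed):
    """Is the observation equally likely at all worlds in U with positive prior?"""
    return len({likelihood[w] for w in observed if prior[w] > 0}) <= 1


# Monty Hall: world i means the car is behind door i. Assumption: the agent
# picked door 1, and Monty opens door 2, so the agent observes U = {1, 3}.
mh_prior = {i: Fraction(1, 3) for i in (1, 2, 3)}
mh_likelihood = {1: Fraction(1, 2), 2: Fraction(0), 3: Fraction(1)}
mh_naive, mh_soph = posteriors(mh_prior, mh_likelihood, {1, 3})

# Second-ace puzzle at time 1: Bob knows only that Alice holds an ace.
# Assumption: the deck is AS, AH, 2S, 2H and Alice holds two cards.
hands = [h for h in combinations(["AS", "AH", "2S", "2H"], 2)
         if "AS" in h or "AH" in h]
both = ("AS", "AH")
ace_prior = {h: Fraction(1, len(hands)) for h in hands}
# Probability that Alice says "I have the ace of spades" at each hand:
# a coin toss with both aces, certain with only the ace of spades, else 0.
says_spades = {h: Fraction(1, 2) if h == both else Fraction(int("AS" in h))
               for h in hands}
spade_hands = {h for h in hands if "AS" in h}
ace_naive, ace_soph = posteriors(ace_prior, says_spades, spade_hands)

print(mh_naive[1], mh_soph[1])          # naive vs. protocol-aware P(car behind 1)
print(ace_naive[both], ace_soph[both])  # naive vs. protocol-aware P(both aces)
print(satisfies_part_c(mh_prior, mh_likelihood, {1, 3}))
print(satisfies_part_c(ace_prior, says_spades, spade_hands))
```

Under these assumptions the naive answers are 1/2 and 1/3, while the protocol-aware answers are 1/3 and 1/5, and the part (c) check fails in both cases. If Monty instead always opened door 2 whenever the car is not behind it (a hypothetical protocol), the likelihoods at worlds 1 and 3 would both be 1, the check would succeed, and naive conditioning would give the same answer as the sophisticated space.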
The question of when conditioning is appropriate goes far beyond these puzzles. It turns out to be highly relevant in the statistical areas of selectively reported data and missing data. For example, consider a questionnaire where some people answer only some questions. Suppose that, of 1,000 questionnaires returned, question 6 is answered "yes" in 300, "no" in 600, and omitted in the remaining 100. Assuming people answer truthfully (clearly not always an appropriate assumption!), is it reasonable to assume that in the general population, 1/3 would answer "yes" to question 6 and 2/3 would answer "no"? This is reasonable if the data is "missing at random," so that people who would have said "yes" are just as likely to skip the question as people who would have said "no." However, consider a question such as "Have you ever shoplifted?" Are shoplifters really just as likely to answer that question as nonshoplifters?
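The arithmetic behind the questionnaire example can be sketched as follows. This is a minimal illustration, not a recommended statistical procedure: the skip probabilities are hypothetical values chosen only because they are consistent with 100 omitted answers, and `yes_share` is an illustrative name. The sketch compares the estimate under "missing at random" with one non-random scenario, and with the bounds that hold when nothing is assumed about the omitted answers.

```python
from fractions import Fraction


def yes_share(yes, no, skip_if_yes, skip_if_no):
    """Estimated share of "yes" among everyone who returned the questionnaire.

    skip_if_yes / skip_if_no are the (assumed) probabilities that a person
    who would have answered "yes" / "no" omits the question.
    """
    # Each recorded answer stands for 1 / (1 - skip probability) respondents.
    est_yes = yes / (1 - skip_if_yes)
    est_no = no / (1 - skip_if_no)
    return est_yes / (est_yes + est_no)


yes, no, omitted = 300, 600, 100

# Missing at random: both groups skip with the same probability (here 1/10).
mar = yes_share(yes, no, Fraction(1, 10), Fraction(1, 10))

# Hypothetical non-random case: would-be "yes" respondents skip far more often.
skewed = yes_share(yes, no, Fraction(1, 5), Fraction(1, 25))

# With no assumption at all, the omitted answers could all be "no" or all "yes".
total = yes + no + omitted
lower, upper = Fraction(yes, total), Fraction(yes + omitted, total)

print(mar, skewed, lower, upper)  # prints: 1/3 3/8 3/10 2/5
```

With equal skip rates the estimate is the 1/3 obtained by conditioning on the answered questionnaires. With the hypothetical unequal rates it rises to 3/8, and without any assumption the data only pin the share down to somewhere between 3/10 and 2/5.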
This issue becomes particularly significant when interpreting census data. Some people are invariably missed in gathering census data. Are these people "missing at random"? Almost certainly not. For example, homeless people and people without telephones are far more likely to be underrepresented, and this underrepresentation may skew the data in significant ways.