Reasoning about Uncertainty
5.1 Expectation for Probability Measures
For definiteness, suppose that 1,000 lottery tickets are sold, numbered 1 through 1,000, and both prizes are guaranteed to be awarded. A world can be characterized by three numbers (a, b, c), each between 1 and 1,000, where a and b are the ticket numbers that are awarded first and second prize, respectively, and c is Alice's ticket number. Suppose that at most one prize is awarded per ticket, so that a ≠ b. The amount of money that Alice wins in the lottery can be viewed as a random variable on this set of possible worlds. Intuitively, the amount that Alice can expect to win is the amount she does win in each world (i.e., the value of the random variable in each world) weighted by the probability of that world. Note that this amount may not match any of the amounts that Alice actually could win. In the case of the lottery, if all tickets are equally likely to win, then the expected amount that Alice can win, according to this intuition, is $1: 999 out of 1,000 times she gets nothing, and 1 out of 1,000 times she gets $1,000. However, she never actually wins $1. It can be shown that, if she plays the lottery repeatedly, then her average winnings per play tend to $1. So, in this sense, her expected winnings say something about what she can expect to get in the long run.
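The lottery computation can be checked by direct enumeration of the worlds. The sketch below is illustrative, not from the text: it assumes Alice holds ticket 1, counts only the $1,000 first prize (the text does not give the second-prize amount), and takes all worlds to be equally likely.

```python
from fractions import Fraction
from itertools import permutations

n = 1000      # tickets, numbered 1 through n
alice = 1     # Alice's ticket number c (an illustrative choice)

# Worlds: ordered pairs (a, b) of distinct tickets winning first and
# second prize; all n*(n-1) such worlds are equally likely.
num_worlds = n * (n - 1)
prob = Fraction(1, num_worlds)

# Gamble X: Alice's winnings. Only the $1,000 first prize is modeled here,
# since the second-prize amount is not specified in the text.
def winnings(a, b):
    return 1000 if a == alice else 0

expected = sum(prob * winnings(a, b)
               for a, b in permutations(range(1, n + 1), 2))
print(expected)   # 1, i.e., expected winnings of $1
```

Alice holds the first-prize ticket in 999 of the 999,000 worlds, and 999 × $1,000 / 999,000 = $1, matching the value computed in the text.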
The intuition that Alice's expected winnings are just the amount she wins in each world weighted by the probability of that world can be easily formalized, using the notion of the expected value of a gamble. (Recall that a gamble is a real-valued random variable.) If W is finite and every set (and, in particular, every singleton set) is measurable, then the expected value of the gamble X (or the expectation of X) with respect to a probability measure μ, denoted Eμ(X), is just

Eμ(X) = Σ_{w ∈ W} μ(w) X(w).  (5.1)
Thus, the expected value of a gamble is essentially the "average" value of the variable. More precisely, as I said earlier, it is its value in each world weighted by the probability of the world.
If singletons are not necessarily measurable, the standard assumption is that X is measurable with respect to the algebra ℱ of measurable sets; that is, for each x ∈ ℝ, the set X = x (i.e., {w ∈ W : X(w) = x}) is in ℱ. In this case, the expected value of X is defined as

Eμ(X) = Σ_{x ∈ V(X)} x μ(X = x),  (5.2)

where V(X) denotes the range of X, that is, {X(w) : w ∈ W}.
It is easy to check that (5.1) and (5.2) are equivalent if all singletons are measurable and W is finite (Exercise 5.1). However, (5.2) is more general. It makes sense even if W is not finite, as long as the range of X is finite.
There are a number of other expressions equivalent to (5.2). I focus on one here. Suppose that the range of X is {x1, …, xn}, with x1 < … < xn. Then

Eμ(X) = x1 + (x2 − x1)μ(X ≥ x2) + ⋯ + (xn − xn−1)μ(X ≥ xn)  (5.3)
(Exercise 5.2). A variant of (5.3), which essentially starts at the top and works down, is considered in Exercise 5.3.
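Since (5.1), (5.2), and (5.3) are equivalent for probability, it is easy to spot-check them against one another. The following sketch does so on a small example of my own (the worlds, measure, and gamble are illustrative assumptions, not from the text):

```python
from fractions import Fraction

# A small finite set of worlds with a probability on singletons (illustrative).
mu = {'w1': Fraction(1, 2), 'w2': Fraction(1, 4), 'w3': Fraction(1, 4)}
X = {'w1': 3, 'w2': 1, 'w3': 3}   # a gamble: real-valued function on worlds

# (5.1): sum over worlds, value weighted by the world's probability.
e1 = sum(mu[w] * X[w] for w in mu)

# (5.2): sum over the range of X, weighting each x by mu(X = x).
values = sorted(set(X.values()))
e2 = sum(x * sum(mu[w] for w in mu if X[w] == x) for x in values)

# (5.3): start at the bottom and work up, adding each increment
# (x_i - x_{i-1}) weighted by mu(X >= x_i).
def mu_geq(x):
    return sum(mu[w] for w in mu if X[w] >= x)

e3 = values[0] + sum((values[i] - values[i - 1]) * mu_geq(values[i])
                     for i in range(1, len(values)))

print(e1, e2, e3)   # all three expressions agree: 5/2 5/2 5/2
```

Here X takes value 3 with probability 3/4 and value 1 with probability 1/4, so all three expressions yield 5/2.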
What is the point of considering a definition of expectation like (5.3), given that it is equivalent to (5.2)? As long as only probability is considered, there is perhaps not much point. But analogues of these expressions for other representations of uncertainty are not, in general, equivalent. I return to this point in Section 5.2.2.
I conclude this section by listing some standard properties of expectation that will be useful in comparing expectation for probability with expectation for other forms of uncertainty. If X and Y are gambles on W and a and b are real numbers, define the gamble aX + bY on W in the obvious way: (aX + bY)(w) = aX(w) + bY(w). Say that X ≤ Y if X(w) ≤ Y(w) for all w ∈ W. Let b̃ denote the constant gamble such that b̃(w) = b for all w ∈ W, and let XU denote the indicator gamble for U, so that XU(w) = 1 if w ∈ U and XU(w) = 0 otherwise.
Proposition 5.1.1
The function Eμ has the following properties for all measurable gambles X and Y:

- Eμ is additive: Eμ(X + Y) = Eμ(X) + Eμ(Y).
- Eμ is affinely homogeneous: Eμ(aX + b̃) = aEμ(X) + b for all a, b ∈ ℝ.
- Eμ is monotone: if X ≤ Y, then Eμ(X) ≤ Eμ(Y).
Proof See Exercise 5.4.
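The three properties can also be checked numerically. This sketch verifies additivity, affine homogeneity, and monotonicity on one example (the worlds, measure, and gambles are illustrative assumptions of mine):

```python
from fractions import Fraction

mu = {'w1': Fraction(1, 6), 'w2': Fraction(1, 3), 'w3': Fraction(1, 2)}

def E(X):
    # Expectation as in (5.1): value in each world weighted by its probability.
    return sum(mu[w] * X[w] for w in mu)

X = {'w1': 2, 'w2': 0, 'w3': 5}
Y = {'w1': 4, 'w2': 1, 'w3': 5}
a, b = 3, -2

XplusY = {w: X[w] + Y[w] for w in mu}
aX_plus_b = {w: a * X[w] + b for w in mu}   # aX + b~, with b~ the constant gamble

additive = E(XplusY) == E(X) + E(Y)
affine = E(aX_plus_b) == a * E(X) + b
monotone = (all(X[w] <= Y[w] for w in mu)   # here X <= Y pointwise
            and E(X) <= E(Y))
print(additive, affine, monotone)   # True True True
```

A single example does not prove the proposition, of course (that is Exercise 5.4), but it shows concretely what each property asserts.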
The next result shows that the properties in Proposition 5.1.1 essentially characterize probabilistic expectation. (Proposition 5.1.1 is not the only possible characterization of Eμ. An alternate characterization is given in Exercise 5.5.)
Proposition 5.1.2
Suppose that E maps gambles that are measurable with respect to ℱ to ℝ and that E is additive, affinely homogeneous, and monotone. Then there is a (necessarily unique) probability measure μ on ℱ such that E = Eμ.
Proof The proof is quite straightforward; I go through the details here just to show where all the assumptions are used. If U ∈ ℱ, define μ(U) = E(XU). Since X∅ = 0̃ and XW = 1̃, affine homogeneity (taking a = 0) gives μ(∅) = E(0̃) = 0 and μ(W) = E(1̃) = 1. Since 0̃ ≤ XU, monotonicity gives μ(U) ≥ 0. Finally, if U and V are disjoint sets in ℱ, then XU∪V = XU + XV, so additivity gives μ(U ∪ V) = E(XU) + E(XV) = μ(U) + μ(V).
Thus, μ is indeed a probability measure.
To see that E = Eμ, note that it is immediate from (5.2) that μ(U) = Eμ(XU) for U ∈ ℱ. Moreover, each measurable gamble X is the sum, over the values x in its range, of x times the indicator gamble of the event X = x, so additivity and affine homogeneity give E(X) = Σ_{x} x E(XX=x) = Σ_{x} x μ(X = x) = Eμ(X). For uniqueness, suppose that μ′ is a probability measure on ℱ such that Eμ′ = E.
Clearly, if μ(U) ≠ μ′(U), then Eμ(XU) ≠ Eμ′(XU). Thus, μ is the unique probability measure on ℱ such that E = Eμ.
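The construction at the heart of the proof — recovering μ from E by evaluating E on indicator gambles — can be illustrated concretely. In the sketch below (the measure is an illustrative assumption), E is built from a known μ, and then a measure is recovered from E alone via μ(U) = E(XU):

```python
from fractions import Fraction
from itertools import combinations

W = ['w1', 'w2', 'w3']
mu = {'w1': Fraction(1, 2), 'w2': Fraction(1, 3), 'w3': Fraction(1, 6)}

def E(X):
    # An additive, affinely homogeneous, monotone expectation operator.
    return sum(mu[w] * X[w] for w in W)

def indicator(U):
    # The gamble XU: 1 on U, 0 off U.
    return {w: 1 if w in U else 0 for w in W}

# Recover a probability measure from E, as in the proof: mu(U) = E(XU).
recovered = {}
for k in range(len(W) + 1):
    for U in combinations(W, k):
        recovered[frozenset(U)] = E(indicator(U))

print(recovered[frozenset()], recovered[frozenset(W)])   # 0 and 1
print(recovered[frozenset({'w1', 'w2'})])                # 1/2 + 1/3 = 5/6
```

The recovered set function assigns 0 to ∅, 1 to W, and is additive on disjoint unions, so it is a probability measure agreeing with μ, just as the proof requires.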
If μ is countably additive and W is infinite, then Eμ has a continuity property that is much in the spirit of (2.1):

if X1, X2, … increases to X, then limi→∞ Eμ(Xi) = Eμ(X)  (5.4)
(Exercise 5.6). (X1, X2, … is increasing to X if, for all w ∈ W, X1(w) ≤ X2(w) ≤ … and limi→∞ Xi(w) = X(w).) This property, together with the others in Proposition 5.1.2, characterizes expectation based on a countably additive probability measure (Exercise 5.6). Moreover, because Eμ(−X) = −Eμ(X), and X1, X2, … decreases to X iff −X1, −X2, … increases to −X, it is immediate that the following continuity property is equivalent to (5.4):

if X1, X2, … decreases to X, then limi→∞ Eμ(Xi) = Eμ(X).  (5.5)