Thursday, 26 September 2013

The Mathematics of Dutch Book Arguments

Dutch Book arguments purport to establish norms that govern credences (that is, numerically precise degrees of belief).  For instance, the original Dutch Book argument due to Ramsey and de Finetti aims to establish Probabilism, the norm that says that an agent's credences ought to obey the axioms of mathematical probability.  And David Lewis' diachronic Dutch Book argument aims to establish Conditionalization, the norm that says that an agent ought to plan to update in the light of new evidence by conditioning on it.  As we will see in this post, there is also a Dutch Book argument for the Principal Principle, the norm that says that an agent ought to defer to the chances when she sets her credences.  We'll look at each of these arguments below.

Each argument consists of three premises.  The second is always a mathematical theorem (sometimes known as the conjunction of the Dutch Book Theorem and the Converse Dutch Book Theorem).  My aim in this post is to present a particularly powerful way of thinking about the mathematics of these theorems.  It is due to de Finetti.  It is appealing for a number of reasons:  it is geometrical, so we can illustrate the theorems visually; it is uniform across the three different Dutch Book arguments we will consider here; and it establishes both the Dutch Book Theorem and the Converse Dutch Book Theorem on the basis of the same piece of mathematics.

I won't assume much mathematics in this post.  A passing acquaintance with vectors in Euclidean space might help, but it certainly isn't a prerequisite.

The form of a Dutch Book argument


The three premises of a Dutch Book argument for a particular norm $N$ are as follows:

(1) An account of the sorts of decisions a given set of credences will (or should) lead an agent to make.

(2) A mathematical theorem showing two things:  (i) relative to (1), credences that violate norm $N$ will lead an agent to make decisions with property $C$; (ii) relative to (1), credences that satisfy norm $N$ will not lead an agent to make decisions with property $C$.

(3) A norm of practical rationality that says that, if an agent can avoid making decisions with property $C$, she is irrational if she does make such a decision.

In this post, I'll present Dutch Book arguments of this form for Probabilism, Conditionalization, and the Principal Principle.  But I'll be focussing on premise (2) in each case.  There's plenty to say about premises (1) and (3), of course.  But that's for another time.

The Dutch Book argument for Probabilism


The first premise in each Dutch Book argument is the same.  It has two parts:  the first tells us, for any proposition in which the agent has a credence, the fair price she ought to pay for a bet on that proposition; the second tells us the price she ought to pay for a book of bets on a number of different propositions given the price she's prepared to pay for each individual bet.  Thus, we have

(1a)  If an agent has credence $p$ in proposition $X$, she ought to pay $pS$ for a bet that pays out $S$ if $X$ is true and $0$ if $X$ is false.  (In such a bet, $S$ is called the stake.)

(1b) If an agent ought to pay $x$ for Bet 1 and $y$ for Bet 2, she ought to pay $x+y$ for a book consisting of Bet 1 and Bet 2.  (This is sometimes called the Package Principle.)

Putting these together, we get the following:  Suppose $\mathcal{F} = \{X_1, \ldots, X_n\}$ is a set of propositions.  And suppose we represent our agent's credences in these $n$ propositions by a vector \[ c = (c_1, \ldots, c_n) \] where $c_i$ is her credence in $X_i$.  And suppose we consider a book of bets $S$ in which the stake on $X_i$ is $S_i$.  Then we can represent this book by the vector \[ S = (S_1, \ldots, S_n) \] Then the price that the agent ought to pay for this book of bets is \[ \sum^n_{i=1} S_ic_i :=  (S_1, \ldots, S_n) \cdot (c_1, \ldots, c_n) = S\cdot c \] where $S\cdot c$ is the dot product of $c$ and $S$ considered as vectors.

Happily, there is also a nice way to represent the payoff of a book of bets $S$ at a given possible world $w$.  Represent that possible world $w$ by the following vector: \[ w = (w_1, \ldots, w_n) \] where $w_i = 1$ if $X_i$ is true at $w$ and $w_i = 0$ if $X_i$ is false at $w$.  Then the payoff of $S$ at $w$ is \[\sum^n_{i=1} S_iw_i := (S_1, \ldots, S_n) \cdot (w_1, \ldots, w_n) = S\cdot w \]  As we will see, these vector representations will prove very useful below.
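
The price and payoff formulas are easy to check by hand or by machine.  Here is a minimal Python sketch; the two propositions, the credences, and the stakes are made up purely for illustration, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction as F

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical example: X_1 = "rain", X_2 = "no rain".
c = (F(3, 5), F(3, 5))                         # credence vector (c_1, c_2)
S = (F(10), F(5))                              # stakes (S_1, S_2)
worlds = {"rain": (1, 0), "no rain": (0, 1)}   # truth-value vectors w

price = dot(S, c)                                           # S . c
payoffs = {name: dot(S, w) for name, w in worlds.items()}   # S . w at each world

print("price:", price)
for name, payoff in payoffs.items():
    print("payoff if", name + ":", payoff)
```

With these numbers the agent pays $9$ for the book and receives $10$ if it rains and $5$ if it does not.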

In this section, we're looking at the Dutch Book argument for Probabilism.

Probabilism  It ought to be that a set of credences $c$ obeys the axioms of mathematical probability.

Let us turn to premise (3) of this argument.  It says that it is irrational for an agent to have credences that lead her to make decisions that will lose her money in every world that she considers possible.  Now, a book of bets loses an agent money if \[\mbox{Payoff} < \mbox{Price}\] But recall from above:  the payoff of a book of bets $S$ at a world $w$ is $S \cdot w$; and the price of that book is $S \cdot c$.  Thus, the agent is irrational if there is a book $S$ such that \[S \cdot w < S \cdot c\] for all worlds $w$.  Equivalently, $S \cdot (w-c) < 0$ for all $w$.

So the Dutch Book Theorem (that is, premise (2)) can be stated as follows:

Theorem 1
(i) If $c$ violates Probabilism, then there is a book $S$ such that $S \cdot w < S \cdot c$ for all worlds $w$ (equivalently, $S \cdot (w-c) < 0$ for all $w$).
(ii) If $c$ satisfies Probabilism, then there is no book $S$ such that $S \cdot w \leq S \cdot c$ (equivalently, $S\cdot (w-c) \leq 0$) for all worlds $w$ and $S \cdot w < S \cdot c$ (equivalently, $S \cdot (w-c) < 0$) for some world $w$.
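
As a quick illustration of part (i), with numbers chosen purely for this example:  let $\mathcal{F} = \{X, \neg X\}$, so that the two worlds are $w_1 = (1, 0)$ and $w_2 = (0, 1)$, and suppose $c = (0.6, 0.6)$, which violates Probabilism because the two credences sum to more than 1.  Take the book $S = (1, 1)$.  Then \[ S \cdot c = 1.2 \qquad \mbox{and} \qquad S \cdot w_1 = S \cdot w_2 = 1 \] so $S \cdot (w - c) = -0.2 < 0$ at both worlds:  the agent pays $1.2$ for a book that returns $1$ however things turn out.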

We now turn to the proof of this theorem.  It is based on two pieces of mathematics:  the first involves some basic geometrical facts about the dot product; the second involves a neat geometric characterization of the credences that satisfy Probabilism.

First, a well-known fact about the dot product.  If $u$ and $v$ are vectors in $\mathbb{R}^n$, we have \[ u \cdot v = ||u||\, ||v|| \cos \theta\] where $\theta$ is the angle between $u$ and $v$, with $0 \leq \theta \leq \pi$.  Since $||u||\, ||v|| \geq 0$, we have \[u\cdot v < 0 \Leftrightarrow \cos \theta < 0\]  And, by basic trigonometry, we have \[u \cdot v < 0 \Leftrightarrow \frac{\pi}{2} < \theta \leq \pi\]  We call such an angle oblique (it is more commonly called obtuse).  Thus:
  • To prove Theorem 1(i), it suffices to show that, if $c$ violates Probabilism, we can find a vector $S$ such that the angle between $S$ and $w-c$ is oblique for all worlds $w$.
  • To prove Theorem 1(ii), it suffices to show that, if $c$ satisfies Probabilism, there is no vector $S$ such that the angle between $S$ and $w-c$ is oblique or right for all $w$ and oblique for some $w$.
To do this, we need a geometric characterization of the credences that satisfy Probabilism.  Fortunately, we have that in the following lemma due to de Finetti:

Lemma 1 $c$ satisfies Probabilism iff $c \in \{w : w \mbox{ is a possible world}\}^+$.

where, if $\mathcal{X}$ is a set of vectors in $\mathbb{R}^n$, $\mathcal{X}^+$ is the convex hull of $\mathcal{X}$:  that is, $\mathcal{X}^+$ is the smallest convex set that includes $\mathcal{X}$; if $\mathcal{X}$ is finite, then $\mathcal{X}^+$ is the set of convex combinations of elements of $\mathcal{X}$, that is, linear combinations whose coefficients are non-negative and sum to 1.

Thus, Lemma 1 says that the vectors that represent the probabilistic sets of credences are precisely those that belong to the convex hull of the vectors that represent the possible worlds.
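
A small illustration may help here; the two-proposition set is an assumption made only for this example.  Take $\mathcal{F} = \{X, \neg X\}$.  There are two world vectors, $w_1 = (1, 0)$ and $w_2 = (0, 1)$, and their convex hull is \[ \{\lambda (1, 0) + (1 - \lambda)(0, 1) : 0 \leq \lambda \leq 1\} = \{(\lambda, 1 - \lambda) : 0 \leq \lambda \leq 1\} \] So $c = (c_1, c_2)$ lies in the hull iff $c_1, c_2 \geq 0$ and $c_1 + c_2 = 1$, which is exactly what the probability axioms demand of credences in $X$ and $\neg X$.  For instance, $(0.3, 0.7)$ is in the hull, while $(0.6, 0.6)$ is not.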

How does this help? Let's take the case in which $c$ violates Probabilism.  That is, $c$ lies outside the convex hull of the vectors representing the different possible worlds.  Then it is easy to see from Figure 1 below that there is a vector $c^*$ that lies inside that convex hull such that, for every world $w$, the angle $\theta$ between the vector $c-c^*$ and the vector $w-c$ is oblique.  Thus, if we let $S = c - c^*$, we have Theorem 1(i).

Figure 1: The oval represents the convex hull of the set of vectors that represent the different possible worlds.  If $c$ violates Probabilism, then it lies outside this.  But, by a Separating Hyperplane Theorem, there is a point $c^*$ in the convex hull such that the angle between $c-c^*$ and $x-c$ is oblique for any $x$ inside the convex hull.  Thus, in particular, it is oblique when $x$ is a vector representing a possible world, as required.
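
The construction $S = c - c^*$ can be tried out numerically.  The Python sketch below is only an illustration under simplifying assumptions:  it takes $\mathcal{F} = \{X, \neg X\}$, so the convex hull is just the line segment between the two world vectors, and it takes $c^*$ to be the point of that segment closest to $c$ (one natural choice of $c^*$, not the only one).  The helper names are hypothetical.

```python
from fractions import Fraction as F

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def project_onto_segment(p, a, b):
    """Closest point to p on the segment from a to b (the convex hull of {a, b})."""
    d = sub(b, a)
    t = dot(sub(p, a), d) / dot(d, d)
    t = max(F(0), min(F(1), t))          # clamp so the point stays on the segment
    return tuple(ai + t * di for ai, di in zip(a, d))

worlds = [(F(1), F(0)), (F(0), F(1))]    # X true, X false
c = (F(3, 5), F(3, 5))                   # non-probabilistic: credences sum to 6/5

c_star = project_onto_segment(c, *worlds)       # nearest probabilistic credences
S = sub(c, c_star)                              # candidate Dutch book S = c - c*
losses = [dot(S, sub(w, c)) for w in worlds]    # S . (w - c) at each world

print("c* =", c_star)
print("S  =", S)
print("S . (w - c) =", losses)
```

Here $c^* = (1/2, 1/2)$ and $S = (1/10, 1/10)$, and $S \cdot (w - c) = -1/50$ at both worlds, so the book $S$ loses the agent money however things turn out.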

Now let's take the case in which $c$ satisfies Probabilism.  That is, $c$ lies inside the convex hull of the vectors representing the different possible worlds.  Then it is easy to see from Figure 2 below that, if $S$ is a vector, then while there may be some worlds $w$ such that the angle $\theta$ between $S$ and $w-c$ is oblique, there must also be some worlds $w'$ such that the angle $\theta'$ between $S$ and $w'-c$ is acute.  Alternatively, it is possible that the angles $\theta$ between $S$ and $w-c$ are right angles for all worlds $w$.

Figure 2: Again, the oval represents the convex hull of the possible worlds.  If $c$ satisfies Probabilism, then it lies inside.
This completes the geometrical proof of Theorem 1, which combines the Dutch Book Theorem and the Converse Dutch Book Theorem.

The Dutch Book Argument for the Principal Principle


The Principal Principle says, roughly, that an agent ought to defer to the chances when she sets her credences.  One natural formulation of this (explicitly proposed by Jenann Ismael and entailed by a slightly stronger formulation proposed by David Lewis) is this:

Principal Principle  It ought to be the case that $c$ is in $\{ch : ch \mbox{ is a possible chance function}\}^+$.

That is, the Principal Principle says that one's credence function ought to be a convex combination (that is, a mixture) of the possible chance functions.

Now, adapting the proof of Theorem 1 above, replacing the possible worlds $w$ by possible chance functions $ch$ (represented as vectors in the natural way), we easily prove the following:

Theorem 2
(i) If $c$ violates the Principal Principle, then there is a book $S$ such that $S \cdot ch < S \cdot c$ for all possible chance functions $ch$.
(ii) If $c$ satisfies the Principal Principle, then there is no book $S$ such that $S \cdot ch \leq S \cdot c$ for all possible chance functions $ch$ and $S \cdot ch < S \cdot c$ for some possible chance function $ch$.

But what does this tell us?  Well, as before, $S \cdot c$ is the price our agent would pay for the book $S$.  But this time, the other side of the inequality is $S\cdot ch$.  And this, it turns out, is the objective expected payout of $S$, rather than the actual payout of $S$.  Thus, violating the Principal Principle does not necessarily make an agent vulnerable to a true Dutch Book.  But it does lead them to pay a price for a book of bets that is higher than the objective expected value of that book, according to all of the possible chance functions.  And this, we might think, is irrational.  For one thing, such an agent will, with objective chance 1, lose money in the long run.  Thus, in the Dutch Book argument for the Principal Principle, premise (1) is as before, premise (2) is Theorem 2, but premise (3) becomes the following:  It is irrational for an agent to have credences that lead her to pay more than the objective expected value for a book of bets.
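
To make the comparison between price and objective expected payout concrete, here is a small Python sketch.  Everything in it is assumed for illustration:  two hypothetical chance functions over $X$ and $\neg X$, a credence function that is more confident in $X$ than either of them, and a book obtained as in the proof of Theorem 1 by subtracting the nearest mixture of chance functions from $c$.

```python
from fractions import Fraction as F

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Hypothetical possible chance functions over (X, not-X).
chances = [(F(1, 5), F(4, 5)), (F(3, 5), F(2, 5))]

c = (F(4, 5), F(1, 5))     # outside the convex hull of the chance functions
S = (F(1, 5), F(-1, 5))    # S = c - c*, where c* = (3/5, 2/5) is the nearest mixture

price = dot(S, c)                                    # S . c
expected_payouts = [dot(S, ch) for ch in chances]    # S . ch for each chance function

print("price:", price)
print("expected payouts:", expected_payouts)
```

The price is $3/25$, while the expected payout is $-3/25$ according to the first chance function and $1/25$ according to the second, so the agent overpays by the lights of every possible chance function.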

The Dutch Book Argument for Conditionalization


Conditionalization is the following norm:

Conditionalization  Suppose our agent has credence $c$ at $t$; and suppose she knows that, by $t'$, she will have received evidence from the partition $E_1, \ldots, E_m$.  And suppose she plans to update as follows:  If $E_i$, then $c_i$.  Then it ought to be that $c_i(-) = c(-|E_i)$ for $i = 1, \ldots, m$.

In fact, the Dutch Book argument for Conditionalization that we will present is primarily a Dutch Book argument for van Fraassen's Reflection Principle, which is equivalent to Conditionalization.  The Reflection Principle says the following:

Reflection Principle  Suppose our agent has credence $c$ at $t$; and suppose she knows that, by $t'$, she will have received evidence from the partition $E_1, \ldots, E_m$.  And suppose she plans to update as follows:  If $E_i$, then $c_i$.  Then it ought to be that:
(i) $c_i(E_i) = 1$ for $i = 1, \ldots, m$;
(ii) $c$ is in $\{c_i : i = 1, \ldots, m\}^+$.

 That is, Reflection says that an agent's current credences ought to be a mixture of her planned future credences.  Since Reflection and Conditionalization are equivalent, it suffices to establish Reflection.

Here is the theorem that provides the second premise of the Dutch Book argument for Reflection:

Theorem 3
(i) Suppose $c, c_1, \ldots, c_m$ violate Reflection.  Then there are books $S, S_1, \ldots, S_m$ such that (a) for all $i = 1, \ldots, m$, \[ S \cdot (w - c) + S_i \cdot (w - c_i) \leq 0 \] for all worlds $w$ in $E_i$; and (b) for some $i = 1, \ldots, m$, \[ S \cdot (w - c) + S_i \cdot (w - c_i) < 0 \] for some world $w$ in $E_i$.
(ii) Suppose $c, c_1, \ldots, c_m$ satisfy Reflection.  Then there are no books $S, S_1, \ldots, S_m$ such that (a) for all $i = 1, \ldots, m$, \[S \cdot (w- c) + S_i\cdot (w-c_i) \leq 0\] for all worlds $w$ in $E_i$; and (b) there is $i = 1, \ldots, m$ such that \[S \cdot (w- c) + S_i\cdot (w-c_i) < 0 \] for some $w$ in $E_i$.

What does this say?  It says that, if you plan to update in some way other than conditioning on your evidence, and thereby violate Reflection, there is a book $S$ that you will accept at $t$ as well as, for each $E_i$, a book $S_i$ that you will accept at $t'$ if you learn $E_i$ such that, together, they will guarantee you a loss.  And this will not happen if you plan to update by conditioning.

How do we prove this?  Theorem 3(i) is the easier to prove.  Suppose $c, c_1, \ldots, c_m$ violate Reflection.  First, suppose that this is because $c_i(E_i) < 1$ for some $i$.  Then let $S = 0$ and $S_j = 0$ for all $j \neq i$.  And let $S_i$ be the book consisting only of a bet on $E_i$ with stake $-1$.  Then \[ S \cdot (w-c) + S_i \cdot (w-c_i) = (-1)(1 - c_i(E_i)) < 0\] for all worlds $w$ in $E_i$.  And \[ S \cdot (w-c) + S_j \cdot (w-c_j) = 0\] for all worlds $w$ in $E_j$ with $j \neq i$.
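
For a concrete instance of this first case, with a number assumed only for illustration:  suppose $c_1(E_1) = 0.8$.  With $S = 0$, $S_j = 0$ for $j \neq 1$, and $S_1$ a bet on $E_1$ with stake $-1$, the agent in effect sells, for $0.8$, a bet that costs her $1$ if $E_1$ is true.  She makes this trade only if she learns $E_1$, so at any world $w$ in $E_1$ her net gain is \[ S_1 \cdot (w - c_1) = (-1)(1 - 0.8) = -0.2 \] while at worlds outside $E_1$ no bets are made and her net gain is $0$.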

Second, suppose that $c_i(E_i) = 1$ for all $i = 1, \ldots, m$.  But suppose $c$ is not inside the convex hull of the $c_i$.  So $c, c_1, \ldots, c_m$ violate Reflection.  Then, adapting the proof of Theorem 1 by replacing the worlds $w$ with the planned posterior credences $c_i$, we get that there is a book $S$ such that \[ S \cdot (c_i - c) < 0\] for all $i = 1, \ldots, m$.  So if we let $S_i = -S$ for all $i = 1, \ldots, m$, we get \[ 0 > S \cdot (c_i - c) = S \cdot (w-c) + (-S)\cdot (w-c_i) = S \cdot (w-c) + S_i \cdot (w-c_i) \] for all worlds $w$.  This completes the proof of Theorem 3(i).
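
This second case can also be checked on a toy example.  The Python sketch below is an illustration under assumed numbers:  three atoms, a two-cell partition, a prior $c$, and an updating plan whose posterior on the first cell is not the conditional credence.  The book $S$ was found by hand so that $S \cdot (c_i - c) < 0$ for both cells.

```python
from fractions import Fraction as F

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

# Hypothetical setup: atoms X1, X2, X3; partition E1 = {X1, X2}, E2 = {X3}.
worlds = {1: [(1, 0, 0), (0, 1, 0)], 2: [(0, 0, 1)]}   # worlds grouped by cell
c = (F(1, 4), F(1, 4), F(1, 2))                        # credences at t
plan = {1: (F(4, 5), F(1, 5), F(0)),   # c_1: differs from c(-|E1) = (1/2, 1/2, 0)
        2: (F(0), F(0), F(1))}         # c_2 = c(-|E2)

S = (F(-2), F(2), F(-1))                           # book accepted at t
S_post = {i: tuple(-s for s in S) for i in plan}   # S_i = -S, accepted at t' on learning E_i

net = {}
for i, cell in worlds.items():
    for w in cell:
        # Net gain at w: S . (w - c) + S_i . (w - c_i)
        net[w] = dot(S, sub(w, c)) + dot(S_post[i], sub(w, plan[i]))

for w, gain in net.items():
    print(w, gain)
```

The net gain is $-7/10$ at both worlds in $E_1$ and $-1/2$ at the world in $E_2$, which are exactly the values of $S \cdot (c_1 - c)$ and $S \cdot (c_2 - c)$.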

Now we turn to Theorem 3(ii).  Suppose $c, c_1, \ldots, c_m$ satisfy Reflection. Suppose, for a contradiction, that we have (a) for all $i = 1, \ldots, m$, \[S \cdot(w-c) + S_i \cdot(w-c_i) \leq 0 \] for all $w$ in $E_i$; and (b) for some $i = 1, \ldots, m$, \[S \cdot(w-c) + S_i \cdot(w-c_i) < 0 \] for some $w$ in $E_i$.  Our plan is to use this to construct $S'$ such that (a) for all $i = 1, \ldots, m$, \[S' \cdot(w-c) \leq 0\] for all $w$ in $E_i$; and (b) for some $i = 1, \ldots, m$, \[S' \cdot(w-c) < 0\] for some $w$ in $E_i$.  And we know that this is impossible from Theorem 1(ii).

We construct $S'$ as follows: First, suppose that $X_1, \ldots, X_k$ are the atoms of the algebra $\mathcal{F} = \{X_1, \ldots,  X_k, \ldots, X_n\}$.  Then notice that for each book of bets \[S = (S_1, \ldots, S_n)\] on the propositions $X_1, \ldots, X_n$, there is a book \[S^A = (S^A_1, \ldots, S^A_k, 0, \ldots, 0)\] on the atoms $X_1, \ldots, X_k$ of $\mathcal{F}$ such that $S^A$ is equivalent to $S$:  that is, the payout of $S$ is the same as the payout of $S^A$ at every world; and the price that a probabilistic agent should pay for $S^A$ is exactly the price she should pay for $S$.  Thus, if we have $S \cdot(w-c) + S_i \cdot(w-c_i) \leq 0$, then we have $S^A \cdot(w-c) + S^A_i \cdot(w-c_i) \leq 0$, and so on.  Thus, in what follows, we can assume without loss of generality that $S$ is a book of bets only on the atoms of $\mathcal{F}$.  Then we define $S'$ as follows, where for any atom $X_j$, we write $E_{i_j}$ for the cell of the partition in which $X_j$ lies:
\[
S'(X_j) := S(X_j) + S_{i_j}(X_j) - \sum_{X_k \in E_{i_j}} c(X_k | E_{i_j}) S_{i_j}(X_k)
\]
Then we can show that,
\[
S'\cdot(w - c) = S\cdot(w - c) + S_i\cdot (w - c_i)
\]
for all $E_i$ and $w \in E_i$.  Suppose $w$ is a world; suppose $X_j$ is the atom that is true at that world; suppose, as above, that $X_j$ lies in cell $E_{i_j}$.  Then we have
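\begin{eqnarray*}
S'\cdot(w - c) & = & S(X_j) + S_{i_j}(X_j) - \sum_{X_k \in E_{i_j}} c(X_k | E_{i_j}) S_{i_j}(X_k) \\
& & \ \ \ - \sum_{X_l} c(X_l) [S(X_l) + S_{i_l}(X_l) - \sum_{X_k \in E_{i_l}} c(X_k | E_{i_l}) S_{i_l}(X_k)]\\
& = & S \cdot w + S_{i_j}\cdot w - S_{i_j} \cdot c_{i_j} - S \cdot c \\
 & & \ \ \ - \sum_{X_l} c(X_l) [S_{i_l}(X_l) - \sum_{X_k \in E_{i_l}} c(X_k | E_{i_l}) S_{i_l}(X_k)]\\
& = & S \cdot (w - c) + S_{i_j}\cdot (w - c_{i_j}) \\
 & & \ \ \ - \sum_{X_l} c(X_l) [S_{i_l}(X_l) - \sum_{X_k \in E_{i_l}} c(X_k | E_{i_l}) S_{i_l}(X_k)]\\
& = & S \cdot (w - c) + S_{i_j}\cdot (w - c_{i_j}) \leq 0
\end{eqnarray*}
Here the second step uses $c_{i_j}(-) = c(-|E_{i_j})$, which follows from Reflection, and the remaining sum vanishes because, within each cell $E_i$, $\sum_{X_l \in E_i} c(X_l) \sum_{X_k \in E_i} c(X_k | E_i) S_i(X_k) = \sum_{X_k \in E_i} c(X_k) S_i(X_k)$.  The final inequality holds by assumption (a), and by assumption (b) it is strict for at least one world.  This completes the proof of Theorem 3(ii) and thus Theorem 3.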
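
The key identity $S'\cdot(w - c) = S\cdot(w - c) + S_i\cdot (w - c_i)$ can be sanity-checked numerically.  The following Python sketch is only a spot check on one made-up example, not a proof:  it assumes three atoms, a two-cell partition, posteriors obtained by conditioning, and arbitrarily chosen books on the atoms.

```python
from fractions import Fraction as F

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def conditional(c, cell):
    """c(-|E) on the atoms, for a cell E given as a tuple of atom indices."""
    total = sum(c[j] for j in cell)
    return tuple(c[j] / total if j in cell else F(0) for j in range(len(c)))

# Hypothetical setup: atoms X1, X2, X3 (indices 0, 1, 2); E1 = {X1, X2}, E2 = {X3}.
cells = [(0, 1), (2,)]
cell_of = {j: i for i, cell in enumerate(cells) for j in cell}
c = (F(1, 4), F(1, 4), F(1, 2))
posts = [conditional(c, cell) for cell in cells]   # c_i = c(-|E_i), so Reflection holds

# Arbitrary books on the atoms: S accepted at t, S_post[i] accepted at t' on learning E_i.
S = (F(3), F(-1), F(2))
S_post = [(F(1), F(4), F(-2)), (F(-5), F(2), F(7))]

# S'(X_j) = S(X_j) + S_{i_j}(X_j) - sum over X_k in E_{i_j} of c(X_k | E_{i_j}) S_{i_j}(X_k)
S_prime = tuple(
    S[j] + S_post[cell_of[j]][j] - dot(S_post[cell_of[j]], posts[cell_of[j]])
    for j in range(len(c))
)

pairs = []
for j in range(len(c)):
    w = tuple(1 if k == j else 0 for k in range(len(c)))   # world where atom X_j is true
    i = cell_of[j]
    lhs = dot(S_prime, sub(w, c))
    rhs = dot(S, sub(w, c)) + dot(S_post[i], sub(w, posts[i]))
    pairs.append((lhs, rhs))

print(S_prime)
print(pairs)
```

In this example the two sides agree at every world.  They are not all non-positive, which is as it should be:  since the plan satisfies Reflection, Theorem 3(ii) says that no choice of books yields a sure loss.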
