Joyce's argument for Probabilism

In January, the Department of Philosophy at the University of Bristol launched an ERC-funded four-year research project on Epistemic Utility Theory: Foundations and Applications.  The main researchers will be: Richard Pettigrew, Jason Konek, Ben Levinstein, Pavel Janda (PhD student), and Chris Burr (PhD student).  The website is here.

I thought it would be good to write a few blog posts explaining what I take epistemic utility theory to be, and describing the work that has been done in the area so far.  So, over the next few weeks, that's exactly what I'll do here at M-Phi.  I'll try for one post per week.

The guiding idea behind epistemic utility theory is this:  Over the last decade or so, epistemologists have become increasingly interested in epistemic value.  That is, they have been interested in identifying the features of a doxastic or credal state that make it good qua cognitive state (rather than good qua guide to action).  For instance, we might say that having true beliefs is more valuable than having false beliefs, or that having higher credences in true propositions is better; or we might say that a belief or a credence has greater value the greater its evidential support.  Epistemic utility theory begins by asking a further question:  How can we quantify and measure epistemic value?  Having answered that question, it asks another:  What epistemic norms can be justified by appealing to this measure of epistemic value?

Joyce's framework


The original argument in this area is due to Jim Joyce in his paper 'A Non-Pragmatic Vindication of Probabilism' (1998) Philosophy of Science 65(4):575-603.  In this blog post, I'll describe the framework in which Joyce's argument takes place; I'll state the norm he wishes to justify; and I'll present his argument for it in the way I find most plausible.

Represent an agent's cognitive state at a given time by her credence function at that time:  this is the function $c$ that takes each proposition about which she has an opinion and returns the real number that measures her credence in that proposition.  By convention, we represent minimal credence by 0 and maximal credence by 1.  Thus, $c$ is defined on the set $\mathcal{F}$ of propositions about which the agent has an opinion; and it takes values in $[0, 1]$.  If $X$ is in $\mathcal{F}$, then $c(X)$ is our agent's degree of belief or credence in $X$.  Throughout, we assume that $\mathcal{F}$ is finite.  With this framework in hand, we can state the norm of Probabilism:

Probabilism At any time in an agent's credal life, it ought to be the case that her credence function $c$ at that time is a probability function over $\mathcal{F}$ (or, if $\mathcal{F}$ is not an algebra, $c$ can be extended to a probability function over the smallest algebra that contains $\mathcal{F}$).
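For a finite $\mathcal{F}$, this norm can even be checked mechanically.  Here is a minimal sketch in Python; the code and the toy example are my own illustration, not anything in Joyce's paper, and it assumes scipy is available.  It relies on a standard fact from this literature: $c$ satisfies Probabilism just in case $c$ is a convex combination of the omniscient credence functions (which we will meet properly below), each of which assigns 1 to every proposition true at some world and 0 to every proposition false there.  Whether such a convex combination exists is a small linear-programming feasibility problem:

```python
import numpy as np
from scipy.optimize import linprog

def satisfies_probabilism(c, V):
    """c: a credence function, as a vector of credences over F.
    V: one row per possible world, giving the 0/1 truth values of the
    propositions in F at that world (the omniscient credence functions).
    c satisfies Probabilism iff c = p @ V for some probability vector p,
    i.e. iff c lies in the convex hull of the rows of V."""
    n = V.shape[0]
    # Feasibility problem: find p >= 0 with V^T p = c and sum(p) = 1.
    A_eq = np.vstack([V.T, np.ones((1, n))])
    b_eq = np.append(c, 1.0)
    res = linprog(np.zeros(n), A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0.0, None)] * n)
    return res.success

# Toy case: F = {A, B} where A entails B, so there are three worlds.
V = np.array([[1, 1], [0, 1], [0, 0]])
print(satisfies_probabilism(np.array([0.3, 0.7]), V))  # True
print(satisfies_probabilism(np.array([0.8, 0.5]), V))  # False
```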

Joyce's argument


How do we establish this norm?  Jim Joyce offers the following argument:  It is often said that the aim of full belief is truth.  One way to make this precise is to say that the ideal doxastic state is that in which one believes every true proposition about which one has an opinion, and one disbelieves every false proposition about which one has an opinion.  That is, the ideal doxastic state is the omniscient doxastic state (relative to the set of propositions about which one has an opinion).  We might then measure how good an agent's doxastic state is by its proximity to this omniscient state.

Joyce's argument, as I will present it, is based on an analogous claim about credences.  We say that the ideal credal state is that in which our agent assigns credence 1 to each true proposition in $\mathcal{F}$ and credence 0 to each false proposition in $\mathcal{F}$. By analogy with the doxastic case, we might call this the omniscient credal state (relative to the set of propositions about which she has an opinion). Let $\mathcal{W}$ be the set of possible worlds relative to $\mathcal{F}$:  that is, the set of consistent assignments of truth values to the propositions in $\mathcal{F}$.  Now, let $w$ be a world in $\mathcal{W}$.  Then let $v_w$ be the omniscient credal state at $w$: that is, $v_w(X) = 1$ if $X$ is true at $w$, and $v_w(X) = 0$ if $X$ is false at $w$.
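In a toy sketch (again my own illustration), the worlds and the omniscient credence functions might be generated like this; the entailment constraint is just an example, and in general consistency depends on the logical relations among the propositions in $\mathcal{F}$:

```python
from itertools import product

F = ["A", "B"]  # placeholder proposition labels

def consistent(w):
    # Example constraint: if A entails B, no world makes A true and B false.
    return not (w["A"] and not w["B"])

# W: the consistent assignments of truth values to the propositions in F.
worlds = []
for vals in product([True, False], repeat=len(F)):
    w = dict(zip(F, vals))
    if consistent(w):
        worlds.append(w)

def omniscient(w):
    """The omniscient credence function v_w at world w."""
    return {X: (1.0 if w[X] else 0.0) for X in F}
```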

We then measure how good an agent's credal state is by its proximity to the omniscient state.  Following Joyce, we call this the accuracy of the credal state.  To do this, we need a measure of distance between credence functions.  Many different measures will do the job, but here I will focus on the most popular, namely, Squared Euclidean Distance.  Suppose $c$ and $c'$ are two credence functions.  Then define the Squared Euclidean Distance between them as follows:
\[
Q(c, c') := \sum_{X \in \mathcal{F}} (c(X) - c'(X))^2
\]
Thus, given a possible world $w$ in $\mathcal{W}$, the cognitive badness or disvalue of the credence function $c$ at $w$ is given by its inaccuracy; that is, the distance between $c$ and $v_w$, namely, $Q(c, v_w)$.  We call this the Brier score of $c$ at $w$, and we write it $B(c, w)$.  So the cognitive value of $c$ at $w$ is the negative of the Brier score of $c$ at $w$; that is, it is $-B(c, w)$.  Thus, $B$ is a measure of inaccuracy; $-B$ is a measure of accuracy.
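In the running sketch, the Brier score is a one-liner once worlds carry truth values (as before, the representation is mine, not Joyce's):

```python
def brier(c, w):
    """Brier score of credence function c at world w: the squared Euclidean
    distance between c and the omniscient credence function v_w.
    c maps propositions to credences; w maps propositions to truth values."""
    return sum((c[X] - (1.0 if w[X] else 0.0)) ** 2 for X in c)

# Example: c(A) = 0.8, c(B) = 0.5 at the world where A and B are both true.
print(brier({"A": 0.8, "B": 0.5}, {"A": True, "B": True}))  # 0.29 (up to floating point)
```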

With this measure of cognitive value in hand, Joyce argues for Probabilism by appealing to a standard norm of traditional decision theory:

Dominance Suppose $\mathcal{O}$ is a set of options, $\mathcal{W}$ is a set of possible worlds, and $U$ is a measure of the value of the options in $\mathcal{O}$ at the worlds in $\mathcal{W}$.  Suppose $o$ and $o'$ are in $\mathcal{O}$.  Then we say that
  • $o$ strongly $U$-dominates $o'$ if $U(o', w) < U(o, w)$ for all worlds $w$ in $\mathcal{W}$;
  • $o$ weakly $U$-dominates $o'$ if $U(o', w) \leq U(o, w)$ for all worlds $w$ in $\mathcal{W}$ and $U(o', w) < U(o, w)$ for at least one world $w$ in $\mathcal{W}$.
Now suppose $o$ and $o'$ are in $\mathcal{O}$, and
  1. $o$ strongly $U$-dominates $o'$;
  2. There is no $o''$ in $\mathcal{O}$ that weakly $U$-dominates $o$.
Then $o'$ is irrational.
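These two relations translate directly into code.  In the minimal sketch below (again my own illustration), an option is anything the utility function $U$ accepts, so credence functions qualify:

```python
def strongly_dominates(o1, o2, worlds, U):
    """o1 strongly U-dominates o2: strictly better at every world."""
    return all(U(o1, w) > U(o2, w) for w in worlds)

def weakly_dominates(o1, o2, worlds, U):
    """o1 weakly U-dominates o2: at least as good at every world,
    strictly better at at least one."""
    return (all(U(o1, w) >= U(o2, w) for w in worlds)
            and any(U(o1, w) > U(o2, w) for w in worlds))
```

With $U = -B$, the negative of the Brier score defined above, these are exactly the relations at work in the theorem below.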

Of course, in standard decision theory, the options are practical actions between which we wish to choose.  For instance, they might be the various environmental policies that a government could pursue; or they might be the medical treatments that a doctor may recommend.  But there is no reason why Dominance, or any other decision-theoretic norm, should apply only to options of this sort.  Such norms can equally be used to establish the irrationality of accepting a particular scientific theory or, as we will see, the irrationality of particular credal states.  When they are put to use in the latter way, the options are the possible credal states an agent might adopt; the worlds are, as above, the consistent assignments of truth values to the propositions in $\mathcal{F}$; and the measure of value is $-B$, the negative of the Brier score.  Granted that, which credal states does Dominance rule out?  As the following theorem shows, it is precisely those that violate Probabilism.

Theorem 1
  1. If $c$ is not a probability function, then there is a credence function $c^*$ that strongly Brier dominates $c$; that is, $c^*$ strongly $U$-dominates $c$ when $U = -B$.
  2. If $c$ is a probability function, then there is no credence function $c^*$ that weakly Brier dominates $c$.
This, then, is Joyce's argument for Probabilism:
  1. The cognitive value of a credence function is given by its proximity to the ideal credence function:  the ideal credence function at world $w$ is $v_w$; and distance is measured by the Squared Euclidean Distance.  Thus, the cognitive value of a credence function at a world is given by the negative of its Brier score at that world.  (In fact, as we will see next week, Joyce weakens this premise and thus strengthens the argument.)
  2. Dominance
  3. Theorem 1
  4. Therefore, Probabilism
Thus, according to Joyce, what is wrong with an agent who violates Probabilism is that there is a credence function that is more accurate than hers regardless of how the world turns out.

Joyce's argument in action


Let's finish off by seeing the argument in action.  Suppose our agent has an opinion about only two propositions $A$ and $B$.  And suppose that $A$ entails $B$.  For such an agent, the only demand that Probabilism makes is

No Drop If $A$ entails $B$, an agent ought to have a credence function $c$ such that $c(A) \leq c(B)$.

Now, if $\mathcal{F} = \{A, B\}$, then $\mathcal{W} = \{w_1, w_2, w_3\}$, where $A$ and $B$ are both true at $w_1$, $A$ is false and $B$ is true at $w_2$, and $A$ and $B$ are both false at $w_3$.  Also, we can represent a credence function over these propositions as a point $(c(A), c(B))$ on the Euclidean plane.  So we can plot our agent's credence function $c$ in this way, along with the omniscient credence functions at the three possible worlds: $v_{w_1} = (1, 1)$, $v_{w_2} = (0, 1)$, and $v_{w_3} = (0, 0)$.  We do so on the diagram below.  On this diagram, the blue shaded area includes all and only the credence functions that satisfy Probabilism, namely those with $c(A) \leq c(B)$.  As we can see, if a credence function lies outside that area, there is a credence function inside it that is closer to each omniscient credence function; but this never happens if the credence function lies inside the area to begin with.  This is the content of Theorem 1 in this situation.
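The diagram makes the geometry vivid; to check the numbers, here is a self-contained sketch of this two-proposition case.  The particular figures (a credence function violating No Drop, and a dominating alternative on the line $c(A) = c(B)$) are my own illustration:

```python
# Worlds for F = {A, B} with A entailing B, as pairs (truth of A, truth of B).
worlds = [(1, 1), (0, 1), (0, 0)]  # w1, w2, w3

def brier(c, w):
    """Squared Euclidean distance from c = (c(A), c(B)) to the
    omniscient credence function at world w."""
    return sum((ci - wi) ** 2 for ci, wi in zip(c, w))

c      = (0.8, 0.5)    # violates No Drop: c(A) > c(B)
c_star = (0.65, 0.65)  # the nearest point in the blue region c(A) <= c(B)

for w in worlds:
    print(w, round(brier(c, w), 3), round(brier(c_star, w), 3))
# (1, 1): 0.29 vs 0.245
# (0, 1): 0.89 vs 0.545
# (0, 0): 0.89 vs 0.845
# c_star has a strictly lower Brier score, and so is strictly more
# accurate, at every world, just as Theorem 1 predicts.
```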
