Thursday, 24 July 2014

Mathematicians' intuitions - a survey

I'm passing this on from Mark Zelcer (CUNY):

A group of researchers in philosophy, psychology and mathematics are requesting the assistance of the mathematical community by participating in a survey about mathematicians' philosophical intuitions. The survey is here: It would really help them if many mathematicians participated. Thanks!

Tuesday, 15 July 2014

Abstract Structure

Draft of a paper, "Abstract Structure", cleverly called that because it aims to explicate the notion of "abstract structure", bringing together some things I mentioned a few times previously.

Friday, 11 July 2014

Interview at 3am magazine

Here is the shameless self-promotion moment of the day: the interview with me at 3am magazine is online. I mostly talk about the contents of my book Formal Languages in Logic, and so cover a number of topics that may be of interest to M-Phi readers: the history of mathematical and logical notation, 'math infatuation', history of logic in general, and some more. Comments are welcome!

Thursday, 10 July 2014

Methodology in the Philosophy of Logic and Language

This M-Phi post is an idea Catarina and I hatched, after a post Catarina did a couple of weeks back at NewAPPS, "Searle on formal methods in philosophy of language", commenting on a recent interview of John Searle, where Searle comments that
"what has happened in the subject I started out with, the philosophy of language, is that, roughly speaking, formal modeling has replaced insight".
I commented a bit underneath Catarina's post, as this is one thing that interests me. I'm writing a more worked-out discussion. But because I tend to reject the terminology of "formal modelling" (note, British English spelling!), I have to formulate Searle's objection a bit differently. Going ahead a bit, his view is that:
the abstract study of languages as free-standing entities has replaced study of the psychology of actual speakers and hearers.
This is an interesting claim, impinging on the methodology of the philosophy of logic and language. I think the clue to seeing what the central issues are can be found in David Lewis's 1975 article, "Languages and Language" and in his earlier "General Semantics", 1970.

1. Searle

To begin, I explain problems (maybe idiosyncratic ones) I have with both of these words "formal" and "modelling".

1.a "formal"
By "formal", I normally mean simply "uninterpreted". So, for example, the uninterpreted first-order language $L_A$ of arithmetic is a formal language, and indeed a mathematical object. Mathematically speaking, it is a set $\mathcal{E}$ of expressions (finite strings from a vocabulary), with several distinguished operations (concatenation and substitution) and subsets (the set of terms, formulas, etc). But it has no interpretation at all. It is therefore formal. On the other hand, the interpreted language $(L_A, \mathbb{N})$ of arithmetic is not a "formal" language. It is an interpreted language, some of whose strings have referents and truth values! Suppose that $v$ is a valuation (a function from the variables of $L_A$ to the domain of $\mathbb{N}$), that $t$ is a term of this language and $\phi$ is a formula of this language. Then $t$ has a denotation $t^{\mathbb{N},v}$ and $\phi$ has a truth value $\mid \mid \phi \mid \mid_{\mathbb{N},v}$.
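To make the contrast concrete, here is a minimal Python sketch of an interpreted fragment of the language of arithmetic. The tuple encoding of terms and formulas and the function names `denote` and `truth` are assumptions made for illustration, not anything from the post. Only quantifier-free formulas are handled, since quantification over all of $\mathbb{N}$ cannot be checked by brute force. The tuples on their own play the role of uninterpreted expressions; it is only `denote` and `truth`, which fix the standard interpretation, that give them referents and truth values.

```python
# Uninterpreted syntax (an illustrative encoding, not the post's):
#   terms:    ("var", name), ("zero",), ("succ", t), ("plus", t1, t2), ("times", t1, t2)
#   formulas: ("eq", t1, t2), ("not", phi), ("and", phi, psi)

def denote(term, v):
    """Denotation of a term in the standard model, under valuation v (dict: variable -> natural number)."""
    tag = term[0]
    if tag == "var":
        return v[term[1]]
    if tag == "zero":
        return 0
    if tag == "succ":
        return denote(term[1], v) + 1
    if tag == "plus":
        return denote(term[1], v) + denote(term[2], v)
    if tag == "times":
        return denote(term[1], v) * denote(term[2], v)
    raise ValueError(f"not a term: {term!r}")

def truth(phi, v):
    """Truth value of a quantifier-free formula in the standard model, under valuation v."""
    tag = phi[0]
    if tag == "eq":
        return denote(phi[1], v) == denote(phi[2], v)
    if tag == "not":
        return not truth(phi[1], v)
    if tag == "and":
        return truth(phi[1], v) and truth(phi[2], v)
    raise ValueError(f"not a (quantifier-free) formula: {phi!r}")

v = {"x": 2, "y": 3}
t = ("plus", ("succ", ("var", "x")), ("var", "y"))       # S(x) + y
phi = ("eq", t, ("times", ("var", "x"), ("var", "y")))   # S(x) + y = x * y
```

Under this valuation `denote(t, v)` is 6 and `truth(phi, v)` is `True`; change `v` and the same uninterpreted tuples receive different values.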

This distinction corresponds to what Catarina calls "de-semantification" in her article "The Different Ways in which Logic is (said to be) Formal" (History and Philosophy of Logic, 2011). My use of "formal" is always "uninterpreted". So, $L_A$ is a formal language, while $(L_A, \mathbb{N})$ is not a "formal" language, but is rather an interpreted language, whose intended interpretation is $\mathbb{N}$. (The intended interpretation of an interpreted language is built into the language by definition. There is no philosophical problem of what it means to talk about the intended interpretation of an interpreted language. It is no more conceptually complicated than talking about the distinguished order $<$ in a structure $(X,<)$.)

1.b "modelling"
But my main problem is with this Americanism, "modelling", which I seem to notice all over the place. It seems to me that there is no "modelling" involved here, unless it is being used to involve a translation relation. For modelling itself, in physics, one might, for example, model The Earth as an oblate spheroid $\mathcal{S}$ embedded in $\mathbb{R}^3$. That is modelling. Or one might model a Starbucks coffee cup as a truncated cone embedded in $\mathbb{R}^3$. Etc. But, in the philosophy of logic and language, I don't think we are "modelling": languages are languages, are languages, are languages ... That is, languages are not "models" in the sense used by physicists and others -- for if they are "models", what are they models of?

A model $\mathcal{A} = (A, \dots)$ is a mathematical structure, with a domain $A$ and some bunch of defined functions and relations on the domain. One can probably make this precise for the case of an oblate spheroid or a truncated cone; this is part of modelling in science. But in the philosophy of logic and language, when describing or defining a language, we are not modelling.

But: I need to add that Catarina has rightly reminded me that some authors do often talk about logic and language in terms of "modelling" (now I should say "modeling" I suppose), and think of logic as being some sort of "model" of the "practice" of, e.g., the "working mathematician". A view like this has been expressed by John Burgess, Stewart Shapiro and Roy Cook. I am sceptical. What is a "practice"? It seems to be some kind of supra-human "normative pattern", concerning how "suitably qualified experts would reason", in certain "idealized circumstances". Personally, I find these notions obscure and unhelpful; and it all seems motivated by a crypto-naturalistic desire to remain in contact with "practice"; whereas, when I look, the "practice" is all over the place. When I work on a mathematics problem, the room ends up full of paper, and most of the squiggles are, in fact, wrong.

So, I don't think a putative logic is somehow to be thought of as "modelling" (or perhaps to be tested by comparing it with) some kind of "practice". For example, consider the inference,
$\forall x \phi \vdash \phi^x_t$
Is this meant to "model" a "practice"? If so, it must be something like this:
The practice wherein certain humans $h_1, \dots$ tend to "consider" a string $\forall x \phi$ and then "emit" a string $\phi^x_t$
And I don't believe there is such a "practice". This may all be a reflection of my instinctive rationalism and methodological individualism. If there are such "practices", then these are surely produced by our inner cognition. Otherwise, I have no idea what the scientifically plausible  mechanism behind a "practice" is.

Noam Chomsky of course long ago distinguished performance and competence (and before him, Ferdinand de Saussure distinguished parole and langue), and has always insisted that generative grammars somehow correspond to competence. If what is meant by "practice" is competence, in something like the Chomskyan sense, then perhaps that is the way to proceed in this direction. But in the end, I suspect that brings one back to the question of what it means to "speak/cognize a language", which is discussed below.

1.c Über-language 
On the other hand, when Searle mentions modelling, it is likely that he has the following notion in mind:
A defined language $L$ models (part of) English.
In other words, the idea is that English is basic and $L$ is a "tool" used to "model" English. But is English basic? I am sceptical of this, because there is a good argument whose conclusion denies the existence of English. Rather, there is an uncountable infinity of languages; many tens of millions of them, $L_1, L_2, \dots, L_{1,000,000}, \dots$, are mutually similar, albeit heterogeneous, idiolects, spoken by speakers, who succeed to a high degree in mutual communication. None of these $L_1, L_2, \dots, L_{1,000,000}, \dots$ spoken by individual speakers is English. If one of these is English, then which one? The idiolect spoken by The Queen? Maybe the idiolect spoken by President Barack Obama? Michelle Obama? Maybe the idiolect spoken by the deceased Christopher Hitchens? Etc. The conclusion is that, strictly speaking, there is no such thing as English.

It seems the opposite is true: there is a heterogeneous speech community $C$ of speakers, whose members speak overlapping and similar idiolects, and these are to a high degree mutually interpretable. But there is no single "über-language" they all speak. By the same reasoning, one may deny altogether the existence of so-called "natural" languages. (Cf., methodological individualism in social sciences; also Chomsky's distinction between I-languages and E-languages.) There are no "natural" languages. There are languages; and there are speakers; and speakers speak a vast heterogeneous array of varying and overlapping languages, called idiolects.

1.d Methodology
Next Searle moves on to his central methodological point:
Any account of the philosophy of language ought to stick as closely as possible to the psychology of actual human speakers and hearers. And that doesn’t happen now. What happens now is that many philosophers aim to build a formal model where they can map a puzzling element of language onto the formal model, and people think that gives you an insight. … 
The point of disagreement here is again with the phrase "formal model", as the languages we study aren't formal models! The entities involved when we work in these areas are sometimes pairs of languages $L_1$ and  $L_2$ and the connection is not that $L_1$ is a "model" of $L_2$ but rather that "$L_1$ has certain translational relations with $L_2$". And translation is not "modelling". A translation is a function from the strings of $L_1$ to the strings of $L_2$ preserving certain properties. Searle illustrates his line of thinking by saying:
And this goes back to Russell’s Theory of Descriptions. … I think this was a fatal move to think that you’ve got to get these intuitive ideas mapped on to a calculus like, in this case, the predicate calculus, which has its own requirements. It is a disastrously inadequate conception of language.
But this seems to me an inadequate description of Russell's 1905 essay. Russell was studying the semantic properties of the string "the" in a certain language, English. (The talk of a "calculus" loads the deck in Searle's favour.) Russell does indeed translate between languages. For example, the string
(1) The king of France is bald
is translated to the string
(2) $\exists x(\text{king-of-Fr.}(x) \wedge \text{Bald}(x) \wedge \forall y(\text{king-of-Fr.}(y) \to y = x)).$
But this latter string (2) is not a "model", either of the first string (1), or of some underlying "psychological mechanism".
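A small sketch may help show that (2) is simply a sentence with truth conditions relative to a structure. The function name `russell_2` and the finite toy structures are assumptions made for illustration; nothing here is claimed about how Russell or Searle would put it.

```python
def russell_2(domain, king_of_fr, bald):
    """Truth value of (2) in a finite structure:
    some x is a king of France, is bald, and every king of France is identical to x."""
    return any(
        x in king_of_fr
        and x in bald
        and all(y == x for y in domain if y in king_of_fr)  # uniqueness clause
        for x in domain
    )

domain = {"a", "b", "c"}

no_king = russell_2(domain, king_of_fr=set(), bald={"a"})
one_bald_king = russell_2(domain, king_of_fr={"a"}, bald={"a"})
two_kings = russell_2(domain, king_of_fr={"a", "b"}, bald={"a", "b"})
one_hairy_king = russell_2(domain, king_of_fr={"a"}, bald={"b"})
```

Only the structure with exactly one king of France, who is bald, makes (2) true; the existence, uniqueness and baldness clauses can each fail separately.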
… That’s my main objection to contemporary philosophy: they’ve lost sight of the questions. It sounds ridiculous to say this because this was the objection that all the old fogeys made to us when I was a kid in Oxford and we were investigating language. But that is why I’m really out of sympathy. And I’m going to write a book on the philosophy of language in which I will say how I think it ought to be done, and how we really should try to stay very close to the psychological reality of what it is to actually talk about things.
Having got this far, we reach a quite serious problem. There is, currently, no scientific understanding of "the psychological reality of what it is to actually talk about things". A cognitive system $C$ may speak a language $L$. How this happens, though, is anyone's guess. No one knows how it can be that
Prof. Gowers uses the string "number" to refer to the abstract object $\mathbb{N}$.
Prof. Dutilh Novaes uses the string "Aristotle" to refer to Aristotle.
SK uses the string "casa" to refer to his home.
Mr. Salmond uses the string "the referendum" to refer to the future referendum on Scottish independence.
The problem here is that there is no causal connection between Prof. Gowers and $\mathbb{N}$! Similarly, a (currently) future referendum (18 Sept 2014) cannot causally influence Mr. Salmond's present (10 July 2014) mental states. So, it is quite a serious puzzle.

2. Lewis

Methodologically, on such issues -- that is, in the philosophy of logic and language -- the outlook I adhere to is the same as Lewis's, whose view echoes that of Russell, Carnap, Tarski, Montague and Kripke. Lewis draws a crucial distinction:
(A) Languages (a language is an "abstract semantic system whereby symbols are associated with aspects of the world").
(B) Language as a social-psychological phenomenon.
With Lewis, I think it's important not to confuse these. In an M-Phi post last year (March 2013), I quoted Lewis's summary from his "General Semantics" (1970):
My proposals will also not conform to the expectations of those who, in analyzing meaning, turn immediately to the psychology and sociology of language users: to intentions, sense-experience, and mental ideas, or to social rules, conventions, and regularities. I distinguish two topics: first, the description of possible languages or grammars as abstract semantic systems whereby symbols are associated with aspects of the world; and second, the description of the psychological and sociological facts whereby a particular one of these abstract semantic systems is the one used by a person or population. Only confusion comes of mixing these two topics.
I will just call them (A) and (B). See also Lewis's "Languages and Language" (1975) for this distinction. Most work in what is called "formal semantics" is (A)-work. One defines a language $L$ and proves some results about it; or one defines two languages $L_1, L_2$ and proves results about how they're related. But this is (A)-work, not (B)-work.

3. (Syntactic-)Semantic Theory and Conservativeness

For example, suppose I decided I am interested in the following language $\mathcal{L}$: this language $\mathcal{L}$ has strings $s_1, s_2$, and a meaning function $\mu_{\mathcal{L}}$ such that,
$\mu_{\mathcal{L}}(s_1) = \text{the proposition that Oxford is north of Cambridge}$
$\mu_{\mathcal{L}}(s_2) = \text{the proposition that Oxford is north of Birmingham}$
Then this is in a deep sense logically independent of (B)-things. And one can, in fact, prove this!

First, let $L_O$ be an "empirical language", containing no terms for syntactical entities or semantic properties and relations. $L_O$ may contain terms and predicates for rocks, atoms, people, mental states, verbal behaviour, etc. But no terms for syntactical entities or semantic relations.

Second, we extend this observation language $L_O$ by adding:
  • the unary predicate "$x$ is a string in $\mathcal{L}$" (here "$\mathcal{L}$" is not treated as a variable), 
  • the constants "$s_1$", "$s_2$", 
  • the unary function symbol "$\mu_{\mathcal{L}}(-)$", 
  • the constants "the proposition that Oxford is north of Cambridge" and "the proposition that Oxford is north of Birmingham". 
Third, consider the following six axioms of semantic theory $ST$ for $\mathcal{L}$:
(i) $s_1$ is a string in $\mathcal{L}$.
(ii) $s_2$ is a string in $\mathcal{L}$.
(iii) $s_1 \neq s_2$.
(iv) the only strings in $\mathcal{L}$ are $s_1$ and $s_2$.
(v) $\mu_{\mathcal{L}}(s_2) = \text{the proposition that Oxford is north of Birmingham}$
(vi) $\mu_{\mathcal{L}}(s_1) = \text{the proposition that Oxford is north of Cambridge}$
Then, assuming $O$ (a theory formulated in $L_O$) is not too weak ($O$ must prove that there are at least two objects), for almost any choice of $O$ whatsoever,
$O+ST$ is a conservative extension of $O$.
To prove this, I consider any interpretation $\mathcal{I}$ for $L_O$, and I expand it to a model $\mathcal{I}^+ \models ST$. There are some minor technicalities, which I skirt over.
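The expansion step can be sketched for finite interpretations. This is only a toy, under loud assumptions: interpretations are Python dicts, the key names (`is_string`, `s1`, `mu`, and so on) are hypothetical, the new symbols are assumed not to clash with any $L_O$ symbol, and the technicalities skirted above are not addressed. The point it illustrates is that the $L_O$ part of the interpretation is left untouched, so no $L_O$ sentence changes its truth value.

```python
def expand_to_model_of_ST(domain, base):
    """Expand an interpretation `base` of L_O over `domain` to one satisfying ST.
    Requires at least two objects, mirroring the assumption on O."""
    elems = sorted(domain, key=repr)
    if len(elems) < 2:
        raise ValueError("need at least two objects to interpret s1 and s2 distinctly")
    a, b = elems[0], elems[1]
    expansion = dict(base)  # every L_O symbol keeps its original interpretation
    expansion.update({
        "is_string": {a, b},      # "x is a string in L"
        "s1": a,
        "s2": b,
        "prop_cambridge": a,      # denotation chosen arbitrarily (an assumption)
        "prop_birmingham": b,     # denotation chosen arbitrarily (an assumption)
        # mu must be total on the domain; its values off s1, s2 do not matter
        "mu": {x: (a if x == a else b) for x in domain},
    })
    return expansion

def satisfies_ST(interp):
    """Check axioms (i)-(vi) of ST in the expanded interpretation."""
    s1, s2 = interp["s1"], interp["s2"]
    return all([
        s1 in interp["is_string"],                   # (i)
        s2 in interp["is_string"],                   # (ii)
        s1 != s2,                                    # (iii)
        interp["is_string"] == {s1, s2},             # (iv)
        interp["mu"][s2] == interp["prop_birmingham"],  # (v)
        interp["mu"][s1] == interp["prop_cambridge"],   # (vi)
    ])

domain = {"rock", "atom", "person"}
base = {"heavier_than": {("rock", "atom")}}   # a made-up L_O predicate
expanded = expand_to_model_of_ST(domain, base)
```

Here `satisfies_ST(expanded)` holds, while `expanded["heavier_than"]` is exactly what it was in `base`.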

Consequently, the semantic theory $ST$ is neutral with respect to any observation claim: the semantic description of a language $\mathcal{L}$ is consistent with (almost) any observation claim. That is, the semantic description of a language $\mathcal{L}$ cannot be empirically tested, because it has no observable consequences.

(There are some further caveats. If the strings actually are physical objects, already referred to in $L_O$, then this result may not quite hold in the form stated. Cf., the guitar language.)

4. The Wittgensteinian View

Lewis's view can be contrasted with a Wittgensteinian view, which aims to identify (A) and (B) very closely. But, since this is a form of reductionism, there must be "bridge laws" connecting the (A)-things and the (B)-things. But what are they? They play a crucial methodological role. I come back to this below.

Catarina formulates the view like this:
I am largely in agreement with Searle both on what the ultimate goals of philosophy of language should be, and on the failure of much (though not all!) of the work currently done with formal methods to achieve this goal. Firstly, I agree that “any account of the philosophy of language ought to stick as closely as possible to the psychology of actual human speakers and hearers”. Language should not be seen as a freestanding entity, as a collection of structures to be investigated with no connection to the most basic fact about human languages, namely that they are used by humans, and an absolutely crucial component of human life. (I take this to be a general Wittgensteinian point, but one which can be endorsed even if one does not feel inclined to buy the whole Wittgenstein package.)
In short, I think this is a deep (but very constructive!) disagreement about ontology: what a language is.

On the Lewisian view, a language is, roughly, "a bunch of syntax and meaning functions"; and, in that sense, it is indeed a "free-standing entity".

(Analogously, the Lie group $SU(3)$ is a free-standing entity and can be studied independently of its connection to quantum particles called gluons (gluons are the "colour gauge field" of an $SU(3)$-gauge theory, which explains how quarks interact together). So, e.g., one can study Latin despite there being no speakers of the language; one can study infinitary languages, despite their having no speakers. One can study strings (e.g., proofs) of length $>2^{1000}$ despite their having no physical tokens. The contingent existence of one, or fewer, or more, speakers of a language $L$ has no bearing at all on the properties of $L$. Similarly, the contingent existence or non-existence of a set of physical objects of cardinality $2^{1000}$ has no bearing on the properties of $2^{1000}$. It makes no difference to the ontological status of numbers.)

Catarina continues by noting the usual way that workers in the (A)-field generally keep (A)-issues separate from (B)-issues:
I also agree that much of what is done under the banner of ‘formal semantics’ does not satisfy the requirement of sticking as closely as possible to the psychology of actual human speakers and hearers. In my four years working at the Institute for Logic, Language and Computation (ILLC) in Amsterdam, I’ve attended (and even chaired!) countless talks where speakers presented a sophisticated formal machinery to account for a particular feature of a given language, but the machinery was not intended in any way to be a description of the psychological phenomena underlying the relevant linguistic phenomena.
I agree - this is because when such a language $L$ is described, it is being considered as a free-standing entity, and so is not intended to be a "description". Catarina continues then:
It became one of my standard questions at such talks: “Do you intend your formal model to correspond to actual cognitive processes in language users?” More often than not, the answer was simply “No”, often accompanied by a puzzled look that basically meant “Why would I even want that?”. My general response to this kind of research is very much along the lines of what Searle says.
I think that the person working in the (A)-field sees that (A)-work and (B)-work are separate, and may not have any good idea about how they might even be related. Finally, Catarina turns to a positive note:
However, there is much work currently being done, broadly within the formal semantics tradition, that does not display this lack of connection with the ‘psychological reality’ of language users. Some of the people I could mention here are (full disclosure: these are all colleagues or former colleagues!) Petra Hendriks, Jakub Szymanik, Katrin Schulz, and surely many others. (Further pointers in comments are welcome.) In particular, many of these researchers combine formal methods with empirical methods, for example conducting experiments of different kinds to test the predictions of their theories. 
In this body of research, formalisms are used to formulate theories in a precise way, leading to the design of new experiments and the interpretation of results. Formal models are thus producing new insights into the nature of language use (pace Searle), which are then put to test empirically. 
The methodological issue comes alive precisely at this point.
How are (A)-issues related to (B)-issues? 
The logical point I argued for above was that a semantic theory $ST$ for a fixed well-defined language $L$ makes no empirical predictions, since the theory $ST$ is consistent with any empirical statement $\phi$. I.e., if $\phi$ is consistent, then $ST + \phi$ is consistent.

5. Cognizing a Language

On the other hand, there is a different empirical claim:
(C) a speaker $S$ speaks/cognizes $L$. 
This is not a claim about $L$ per se. It is a cognizing claim about how the speaker $S$ and $L$ are related. This is something I gave some talks about before, and also wrote about a few times before here (e.g., "Cognizing a Language"), and also wrote about in a paper, "There's Glory for You!" (actually a dialogue, based on a different Lewis - Lewis Carroll) that appeared earlier this year. A cognizing claim like (C) might yield a prediction. Such a claim uses the predicate "$x$ speaks/cognizes $y$", which links together the agent and the language. But without this, there are no predictions.

The methodological point is then this: any such prediction from (C) can only be obtained by bridge laws, invoking this predicate linking the agent and language. But these bridge laws have not been stated at all. Such a bridge law might take the generic form:
Psycho-Semantic Bridge Law
If $S$ speaks $L$ and $L$ has property P, then $S$ will display (verbal) behaviour B.
Typically, such psycho-semantic laws are left implicit. But, in the end, to understand how the (A)-issues are connected to the (B)-issues, such putative laws need to be made explicit. Methodologically, then, I say that all of the interest lies in the bridge laws.

6. Summary

So, that's it. I summarize the three main points:
1. Against Searle and with Lewis: languages are free-standing entities, with their own properties, and these properties aren't dependent on whether there are, or aren't, speakers of the language.
2. The semantic description of a language $L$ is empirically neutral (indeed, the properties of a language are in some sense modally intrinsic).
3. To connect together the properties of a language $L$ and the psychological states or verbal behaviour of an agent $S$ who "speaks/cognizes" $L$, one must introduce bridge laws. Usually they are assumed implicitly, but from the point of view of methodology, they need to be stated clearly.

7. Update: Addendum

I hadn't totally forgotten -- I sort of semi-forgot. But Catarina wrote about these topics before in several M-Phi posts, so I should include them too:
Logic and the External Target Phenomena (2 May 2011)
van Benthem and System Imprisonment (5 Sept 2011)
Book draft: Formal Languages in Logic (19 Sept 2011) 
(Probably some more, that I actually did forget...) And these raise many questions related to the methodological one here.

Tuesday, 24 June 2014

Sean Carroll: "Physicists should stop saying silly things about philosophy"

Readers probably saw this already, but I mention it anyhow. Physicist Sean Carroll has a 23 June 2014 post, "Physicists should stop saying silly things about philosophy", on his blog gently criticizing some recent anti-philosophy remarks by some well-known physicists, and trying to emphasize some of the ways physicists and philosophers of physics might interact constructively on foundational/conceptual issues. Interesting comments underneath too.

Saturday, 21 June 2014

Trends in Logic XIV, rough schedule

We now have a rough version of the conference schedule, including all the speakers and their titles. Here.


Friday, 20 June 2014

Preferential logics, supraclassicality, and human reasoning

(Cross-posted at NewAPPS)

Some time ago, I wrote a blog post defending the idea that a particular family of non-monotonic logics, called preferential logics, offered the resources to explain a number of empirical findings about human reasoning, as experimentally established. (To be clear: I am here adopting a purely descriptive perspective and leaving thorny normative questions aside. Naturally, formal models of rationality also typically include normative claims about human cognition.)  

In particular, I claimed that preferential logics could explain what is known as the modus ponens-modus tollens asymmetry, i.e. the fact that in experiments, participants will readily reason following the modus ponens principle, but tend to ‘fail’ quite miserably with modus tollens reasoning – even though these are equivalent according to classical as well as many non-classical logics. I also defended (e.g. at a number of talks, including one at the Munich Center for Mathematical Philosophy which is immortalized in video here and here) that preferential logics could be applied to another well-known, robust psychological phenomenon, namely what is known as belief bias. Belief bias is the tendency that human reasoners seem to have to let the believability of a conclusion guide both their evaluation and production of arguments, rather than the validity of the argument as such.

Well, I am now officially taking most of it back (and mostly thanks to working on these issues with my student Herman Veluwenkamp).

Already at the Q&A of my talk at the MCMP, it became obvious that preferential logics would not work, at least not in a straightforward way, to explain the modus ponens-modus tollens asymmetry (in other words: Hannes Leitgeb tore this claim to pieces at Q&A, which luckily for me is not included in the video!). As it turns out, it is not even obvious how to conceptualize modus ponens and modus tollens in preferential logics, but in any case a big red flag is the fact that preferential logics are supraclassical, i.e. they validate all inferences validated by classical logic, and a few more (i.e. there are arguments that are valid according to preferential logics but not according to classical logic, but not the other way round). And so, since classical logic sanctions modus tollens, then preferential logics will sanction at least something that looks very much like modus tollens. (But contraposition still fails.)
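The classical half of this point can be checked mechanically. The following is a minimal truth-table sketch (the helper names `implies` and `classically_valid` are my own, introduced only for illustration); it says nothing about preferential logics themselves, only that classical logic sanctions modus ponens and modus tollens alike.

```python
from itertools import product

def implies(p, q):
    """Material conditional."""
    return (not p) or q

def classically_valid(premises, conclusion, n_vars=2):
    """Valid iff the conclusion is true in every row where all premises are true."""
    return all(
        conclusion(*row)
        for row in product([True, False], repeat=n_vars)
        if all(premise(*row) for premise in premises)
    )

# Modus ponens: p -> q, p, therefore q
modus_ponens = classically_valid(
    [lambda p, q: implies(p, q), lambda p, q: p],
    lambda p, q: q,
)

# Modus tollens: p -> q, not q, therefore not p
modus_tollens = classically_valid(
    [lambda p, q: implies(p, q), lambda p, q: not q],
    lambda p, q: not p,
)

# Affirming the consequent: p -> q, q, therefore p (a classical fallacy)
affirming_consequent = classically_valid(
    [lambda p, q: implies(p, q), lambda p, q: q],
    lambda p, q: p,
)
```

Both `modus_ponens` and `modus_tollens` come out valid, and `affirming_consequent` does not, so any supraclassical system inherits both of the first two.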

In fact, I later discovered that this is only the tip of the iceberg: the supraclassicality of preferential logics (and other non-monotonic systems) becomes a real obstacle when it comes to explaining a very large and significant portion of experimental results on human reasoning. In effect, we can distinguish two main tendencies in these results:
  • Overgeneration: participants endorse or produce arguments that are not valid according to classical logic.
  • Undergeneration: participants fail to endorse or produce arguments that are valid according to classical logic.

For example, participants tend to endorse arguments that are not valid according to classical logic, but which have a highly believable conclusion (overgeneration). But they also tend to reject arguments that are valid according to classical logic, but which have a highly unbelievable conclusion (undergeneration). (Another example of undergeneration would be the tendency to ‘fail’ modus tollens-like arguments.) And yet, overgeneration and undergeneration related to (un)believability of the conclusion are arguably two phenomena stemming from the same source, so to speak: our tendency towards what I call ‘doxastic conservativeness’, or less pedantically, our aversion to changing our minds and revising our beliefs.

Now, if we want to explain both undergeneration and overgeneration within one and the same formal system, we seem to have a real problem with the logics available in the market. Logics that are strictly subclassical, i.e. which do not sanction some classically valid arguments but also do not sanction anything classically invalid (such as intuitionistic or relevant logics), will be unable to account for overgeneration. Logics that are strictly supraclassical, i.e. which sanction everything that classical logic sanctions and some more (such as preferential logics), will be unable to account for undergeneration. (To be fair, preferential logics do work quite well to account for overgeneration.)
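The shape of the problem can be put in set-theoretic terms. In the sketch below, arguments are just labels, and the two sets are hypothetical placeholders, not experimental data; the function names are likewise my own. A strictly subclassical logic sanctions a proper subset of the classically valid arguments, a strictly supraclassical one a proper superset, so neither can coincide with an endorsement pattern that both overgenerates and undergenerates.

```python
def overgeneration(endorsed, classical):
    """Arguments endorsed although not classically valid."""
    return endorsed - classical

def undergeneration(endorsed, classical):
    """Classically valid arguments that are not endorsed."""
    return classical - endorsed

def could_match(endorsed, classical, kind):
    """Could a logic of this kind sanction exactly the endorsed arguments?"""
    if kind == "subclassical":      # proper subset of the classically valid arguments
        return endorsed < classical
    if kind == "supraclassical":    # proper superset of the classically valid arguments
        return endorsed > classical
    raise ValueError(f"unknown kind: {kind!r}")

# Hypothetical labels, for illustration only
classical = {
    "modus ponens",
    "modus tollens",
    "valid syllogism, unbelievable conclusion",
}
endorsed = {
    "modus ponens",
    "invalid syllogism, believable conclusion",
}

over = overgeneration(endorsed, classical)
under = undergeneration(endorsed, classical)
```

With both `over` and `under` non-empty, `could_match` returns `False` for either kind of logic.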

So it seems that something quite radically different would be required, a system which both undergenerates and overgenerates with respect to classical logic. At this point, my best bet (and here, thanks again to my student Herman) are some specific versions of belief revision theory, more specifically what is known as non-prioritized belief revision. The idea is that incoming new information does not automatically get added to one’s belief set; it may be rejected if it conflicts too much with prior beliefs (whereas the original AGM belief revision theory includes the postulate of Success, i.e. new information is always accepted). This is a powerful insight, and in my opinion precisely what goes on in the cases of belief bias-induced undergeneration: participants in fact do not really take the false premises as if they were true, which then leads them to reject the counterintuitive conclusions that do follow deductively from the premises offered. (See also this paper of mine which discusses the cognitive challenges with accepting premises ‘at face value’ for the purposes of reasoning.)
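A deliberately crude sketch of the contrast, under strong assumptions: beliefs are sets of literals (strings, with "~" for negation), there is no logical closure, and the "protected core" of entrenched beliefs is a device introduced here for illustration. It is not AGM revision or any specific published non-prioritized operator; it only shows the difference between always accepting the input and sometimes rejecting it.

```python
def negate(literal):
    """Negation on string literals: "p" <-> "~p"."""
    return literal[1:] if literal.startswith("~") else "~" + literal

def revise_with_success(beliefs, new):
    """Toy prioritized revision: the input always wins (the Success postulate)."""
    return (set(beliefs) - {negate(new)}) | {new}

def revise_non_prioritized(beliefs, core, new):
    """Toy non-prioritized revision: input contradicting the protected core is rejected."""
    if negate(new) in core:
        return set(beliefs)   # input rejected, beliefs unchanged
    return revise_with_success(beliefs, new)

beliefs = {"p", "~q"}
core = {"~q"}   # hypothetical: "~q" is too entrenched to give up

accepted_anyway = revise_with_success(beliefs, "q")
rejected = revise_non_prioritized(beliefs, core, "q")
revised = revise_non_prioritized(beliefs, core, "~p")
```

With Success, the unbelievable input "q" displaces "~q"; without it, the reasoner keeps "~q" and so never reasons from "q" at all, which is the pattern suggested above for belief-bias undergeneration.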

In other words, what needs to be conceptualized when discussing human reasoning is not only how reasoners infer conclusions from prior belief, but also how reasoners accept new beliefs and revise (or not!) their prior beliefs. Now, the issue seems to be that logics, as they are typically understood (and not only classical logic), do not have the resources to conceptualize this crucial aspect of reasoning processes – a point already made almost 30 years ago by Gilbert Harman in Change in View. And thus (much as it pains me to say so, being a logically-trained person and all), it does look like we are better off adopting alternative general frameworks to analyze human reasoning and cognition, namely frameworks that are able to problematize what happens when new information arrives. (Belief revision is a possible candidate, as is Bayesian probabilistic theory.)

Tuesday, 17 June 2014

Diff(M) vs Sym(|M|) in General Relativity

In General Relativity, whole "physical universes" are represented by spacetime models, which have the following form,
$\mathcal{M} = (M, g, T, \phi^{(i)})$
Here $M$ is some differentiable manifold, $g$ and $T$ are $(0,2)$ symmetric tensors, and the $\phi^{(i)}$ are various scalar, spinor, tensor, etc., fields representing matter, electrons, photons, and so on. The laws of physics require that the "metric" tensor $g$ and the "energy-momentum" tensor $T$ be related by a differential equation called "Einstein's field equations". The details are not important here though. (For the metric tensor, some authors write $g_{ab}$, many older works write $g_{\mu \nu}$ and some just $g$. Nothing hinges on this; just clarity.)

Suppose we consider a fixed spacetime model $\mathcal{M} = (M, g, T, \phi^{(i)})$. This is to represent some whole physical universe, or world, let us call it $w$. Let $|M|$ be the set of points in $M$. (We can call it the "carrier set".)

It is known that one may apply certain mathematical operations/transformations to the model $\mathcal{M}$ and also it is part of our understanding of General Relativity that the result is an "equivalent representation" of the same physical universe. This is all intimately related to what has come to be called "the Hole argument".

The mathematical operations are certain bijections $\pi : |M| \to |M|$ of the set of points in $M$ to itself. If $\mathcal{M}$ is our starting model, then the result is denoted $\pi_{\ast}\mathcal{M}$.

[To define $\pi_{\ast}\mathcal{M}$, the whole model is "pushed forward" under $\pi$; we really just take the obvious image of every tensorial field $g, T, \dots$ under the map $\pi$. In geometry there are "pushforwards" and "pullbacks", and one has to be careful about contravariant and covariant geometric fields; but when we are dealing with mappings that are bijections, it doesn't matter.]

Which of these maps $\pi$ are allowed? That is,
for which maps $\pi: |M| \to |M|$, do $\mathcal{M}$ and $\pi_{\ast}\mathcal{M}$ represent the same $w$?
It is sometimes claimed that the relevant group of transformations, for General Relativity, is $\mathsf{Diff}(M)$. This is the set of bijections of $|M|$ to itself which leave the differential structure of $M$ invariant. I.e., the automorphisms of $M$. Since $M$ is a differentiable manifold, they are diffeomorphisms. Let me call this,
Weak Leibniz Equivalence:
if $\pi \in \mathsf{Diff}(M)$, then $\mathcal{M}$ and $\pi_{\ast}\mathcal{M}$ represent the same world.
But I say that the relevant group of transformations is much bigger, and is $\mathsf{Sym}(|M|)$, the symmetric group on $|M|$. That is, the relevant group is the group of all bijections of $|M|$ to itself:
Leibniz Equivalence:
if $\pi \in \mathsf{Sym}(|M|)$, then $\mathcal{M}$ and $\pi_{\ast}\mathcal{M}$ represent the same world.
This is the main point made in the Leibniz equivalence paper linked here. I sometimes give this as a talk, usually with some physicists, philosophers of physics and mathematicians there. At the moment, about 50% of the audience tell me I'm wrong and 50% tell me I'm right.

There's a much more general formulation, which is very simple (and is essentially the content given in R.M. Wald's classic textbook on GR, p. 438), and which implies the above, and it's this:
Leibniz Equivalence:
If $\mathcal{M}_1$ and $\mathcal{M}_2$ are isomorphic spacetime models, then they represent the same physical world.
The mistake that people keep making, I say, is that they claim that the points of the manifold must be permuted smoothly. This, I claim, is not so. The points in $|M|$ can be permuted any way one likes, so long as one applies the operation to everything - topology and differential structure included!
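The difference between the two groups can be illustrated with a toy finite analogue. This is only a sketch, not General Relativity: the "structure" is a hypothetical directed path on four points standing in for the fields on $|M|$, and the helper names are made up for illustration. Every bijection of the carrier set yields an isomorphic copy once the structure is pushed forward along with the points, even though only one bijection is an automorphism of the original structure.

```python
from itertools import permutations

# Toy carrier set and a relation on it (a directed path 0 -> 1 -> 2 -> 3).
# Both are hypothetical stand-ins for |M| and the fields defined on it.
points = (0, 1, 2, 3)
edges = frozenset({(0, 1), (1, 2), (2, 3)})

def pushforward(pi, relation):
    """Image of the relation under the bijection pi (given as a dict)."""
    return frozenset((pi[a], pi[b]) for a, b in relation)

def is_isomorphism(pi, rel_src, rel_dst, pts):
    """pi preserves and reflects the relation between source and target."""
    return all(((a, b) in rel_src) == ((pi[a], pi[b]) in rel_dst)
               for a in pts for b in pts)

# Sym(|M|): all bijections of the carrier set to itself.
all_bijections = [dict(zip(points, image)) for image in permutations(points)]

# Analogue of the automorphism group: bijections that leave the structure fixed.
automorphisms = [pi for pi in all_bijections if pushforward(pi, edges) == edges]

# Every bijection is an isomorphism onto the pushed-forward structure.
every_pushforward_isomorphic = all(
    is_isomorphism(pi, edges, pushforward(pi, edges), points)
    for pi in all_bijections
)

print(len(all_bijections), len(automorphisms), every_pushforward_isomorphic)
```

In this toy case the automorphisms play the role of $\mathsf{Diff}(M)$ and the full list of bijections plays the role of $\mathsf{Sym}(|M|)$.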

Sometimes this is called "gauge equivalence". Personally I don't care one way or the other about the terminology. However, note that Leibniz equivalence is analogous to the standard case of gauge equivalence - the U(1)-gauge symmetry that characterizes electromagnetism. Let $\mathbb{M}^4$ be Minkowski space, and let $A$ be the 1-form electromagnetic potential. Let $\Lambda$ be a smooth scalar field on $\mathbb{M}^4$. Let $d\Lambda$ be its derivative. Then the gauge equivalence principle for electromagnetism is that $A$ and $A + d \Lambda$ are "physically equivalent". I.e.,
$(\mathbb{M}^4, A)$ and $(\mathbb{M}^4, A + d \Lambda)$ represent the same physical world.
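A short worked step shows why the two potentials are standardly treated as equivalent. It assumes the usual definition of the field strength, $F = dA$ (not stated above), together with the identity $d^2 = 0$ for the exterior derivative:

```latex
F' = d(A + d\Lambda) = dA + d(d\Lambda) = dA + 0 = F
```

So $A$ and $A + d\Lambda$ determine the same electromagnetic field $F$.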
[I'm not really very knowledgeable about the philosophy of physics, and the various revisions and so on proposed, for example, against standard quantum theory, etc.: things like Bohmian mechanics, the GRW theory and so on. Here I'm just writing about classical General Relativity.]

[UPDATE (19 June 2014): I changed the text a teeny bit and added some links to the background maths.]

Saturday, 14 June 2014

Relativization of quantifiers and relativizing existence

One can relativize a claim by inserting a qualifying predicate for each quantifier. For example,
(1) For any number $n$, there is a number $p$ larger than $n$.
(2) For any prime number $n$, there is a prime number $p$ larger than $n$.
This is called relativization of quantifiers. Whereas (1) is kind of obvious, (2) is not. Formally, (1) is called a $\Pi_2$-sentence, as it has the form, roughly,
(3) $\forall x \exists y \phi(x,y)$
Suppose (1) is true. When we relativize to the subdomain of prime numbers, it expresses a different proposition, and we can consider whether it remains true in the subdomain which is the extension of the relativizing predicate. I.e.,
(4) $\forall x (P(x) \to \exists y (P(y) \wedge \phi(x,y)))$.
In fact (4) is true, but it expresses something stronger than (3) does. We might write the relativized (4) more perspicuously as,
$(\forall x \in P)(\exists y \in P) \phi(x,y)$.
$(\forall x : P)(\exists y : P) \phi(x,y)$.
$(\forall x)_P(\exists y)_P \phi(x,y)$.
Nothing hinges much on this: it is pretty clear what is meant either way.

Suppose we relativize to a finite set. Let $D(x)$ mean "$x$ is either 0, 1, 2 or 3". Then
(5) $(\forall x \in D)(\exists y \in D) \phi(x,y)$
is now false.

If $\Theta$ is the original claim, then we sometimes denote the claim relativized to $P$ as $\Theta^P$. The fact that $\Theta$ is true does not in general imply that $\Theta^P$ is true. In general, if $\Theta$ is a true $\Pi_1$-sentence, then its relativization $\Theta^P$ is true as well. (In model-theoretic lingo, we say that "$\Pi_1$-sentences are preserved in substructures".) On the other hand, if $\Theta$ is a true $\Pi_2$-sentence, then its relativization need not be true, as we saw above.

Here is a vivid example. Imagine a society which contains Yoko, who happens not to be married to herself, and in which the following $\Pi_2$-sentence is true:
(6) Everyone is married to someone.
Now restrict this claim to the unit set, $\{Yoko\}$. Clearly,
(7) Everyone who is Yoko is married to someone who is Yoko,
is false.
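The behaviour of relativized quantifiers on finite domains can be checked mechanically. The following is a minimal sketch: the society and its marriages are invented for illustration (only Yoko comes from the example above), and the helper names are hypothetical. It evaluates the $\Pi_2$-sentences (5), (6) and (7), plus a $\Pi_1$-sentence ("no one is married to themselves") to show preservation in a substructure.

```python
def forall_exists(domain, rel):
    """Truth value of: for every x in domain there is a y in domain with rel(x, y)."""
    return all(any(rel(x, y) for y in domain) for x in domain)

def forall(domain, prop):
    """Truth value of: for every x in domain, prop(x)."""
    return all(prop(x) for x in domain)

# Hypothetical society; the marriage relation is an assumption for illustration.
society = {"Yoko", "John", "Paul", "Linda"}
marriages = {("Yoko", "John"), ("John", "Yoko"),
             ("Paul", "Linda"), ("Linda", "Paul")}

def married(x, y):
    return (x, y) in marriages

# (6): everyone is married to someone (a Pi_2 sentence), over the whole society.
sentence_6 = forall_exists(society, married)

# (7): the same sentence relativized to the unit set {Yoko}.
sentence_7 = forall_exists({"Yoko"}, married)

# A Pi_1 sentence: no one is married to themselves. It survives relativization.
pi1_whole = forall(society, lambda x: not married(x, x))
pi1_relativized = forall({"Yoko"}, lambda x: not married(x, x))

# (5): "for every x there is a larger y", relativized to D = {0, 1, 2, 3}.
D = {0, 1, 2, 3}
sentence_5 = forall_exists(D, lambda x, y: y > x)

print(sentence_6, sentence_7, pi1_whole, pi1_relativized, sentence_5)
```

The $\Pi_2$-sentence flips from true to false on the smaller domain because the witness for Yoko lies outside it, while the $\Pi_1$-sentence has no witness to lose.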

This tells us a bit about how to relativize the quantifiers in a sentence to a predicate.

It may be annoying to keep relativizing univocal quantifiers, and one might prefer a many-sorted notation, in which distinct styles of variables are used to range over separate "sorts". So, for example, in textbooks and articles, we generally know that
the letter "$n$" (and probably "$m$") is going to denote a natural number.
the letter "$r$" (and probably "$s$") is going to denote a real number.
the letter "$z$" is likely to denote a complex number.
the letter "$t$" is likely to denote a time instant.
the letter "$f$" is likely to denote a function.
the Greek letter "$\phi$" is likely to denote either a mapping or a formula.
the Greek letter "$\omega$" is likely to denote either the set of finite ordinals or an angular frequency.
the upper-case Latin letter "$G$" is likely to denote either a graph or a group, and "$g$" will denote an element of the graph or group.
With capital Latin letters, "$A$", "$B$", "$C$", $\dots$, all bets are off! But "$X$" or "$Y$" are likely to denote sets. So, if you see, e.g., the equation,
(8) $f(t) = r$
then intuitively, the intention is that the value of the function $f$ at time $t$ is some real $r$.

While these issues seem fairly clear, can sense be made of relativizing existence itself? That is, can we make sense of a claim like:
(9) $x$ and $y$ "exist in different senses"

For example,
(10) The Eiffel Tower and $\aleph_0$ exist in different senses.
(11) Dame Kelly Holmes and Sherlock Holmes exist in different senses.
We usually think such claims are meaningful -- surely they are. But what exactly do they mean? Probably, something like this,
(12) $x$ and $y$ are (from or members of) different kinds of things.
And this seems to mean,
(13) there are kinds (types, ontological categories, ...) $A,B$ such that $\square[A \cap B = \varnothing]$, and $x \in A$ and $y \in B$.
There are two necessarily disjoint categories and $x$ is in one, and $y$ is in the other.

Quine wrote a famous paper, "On what there is" (1948). Normally, following Quine, we treat "what there is" and "what exists" as synonyms. But it is not very interesting to inquire as to what "exists", if one insists that "exists" be a predicate. If one does insist on this, then what becomes interesting is what this predicate "$x$ exists" means. Everyone agrees that ordinary usage counts both of the following as grammatical:
(14) There exists a lion in the zoo.
(15) Sherlock does not exist.
The first is normally, and uncontroversially, formalized using the quantifier "$\exists$" and the second seems, on its surface, to involve a predicate.

[I have a mini-theory of what "$a$ exists" means. I think a claim of the form "$a$ exists" means "$\exists x H_a(x)$", where $H_a$ is, loosely speaking, the property of being $a$.]

Quine stressed that the meaning of the symbol "$\exists$" is explained as follows:
(16) $\exists x \phi$ is true if and only if there is some $o$ such that $\phi$ is true of $o$.
In other words, we explain the meaning of "$\exists$" using "there is". I can't quite see how it might work otherwise, except: by a proof-theoretic "implicit definition", via introduction and elimination rules.

Consider the following idea: the idea that the following two claims
(17) $\exists x \phi$ is true
(18) there is nothing that is $\phi$ 
are compatible.

One finds something like this being advocated as a solution to some problems in the foundations of mathematics. I think - but I am not sure - that Jody Azzouni's view is that (17) is compatible with (18). This would imply that there being no numbers (say) is compatible with the truth of mathematics. I cannot make good sense of this, mainly because the technical symbol "$\exists$" is introduced precisely so that (17) and (18) are incompatible. Similarly, a claim like,
(19) The sentence "There are numbers" is ontologically committed to there being numbers
is simply analytic, since it is part of the definition of the phrase "ontological commitment".

Suppose someone says there are things that don't exist (e.g., fictional objects or perhaps mathematical ones). I assume that, in their idiolect, "exists" means "has some property", but what this is has been left unspecified. If so, it means
(20) There are things which lack property $\dots$.
And what this $\dots$ is, is somehow left unspecified. A crucial ambiguity can arise. For example, the claim
(21) Numbers don't exist.
can be taken to mean,
(22) If there are numbers, they don't "exist"
(23) There are no numbers.
With a charitable interpretation, the first claim (22) is true, but not very interesting, because "exists" probably just means (in the speaker's idiolect) "is a concrete thing". No one in the world asserts that numbers are concrete things! The second claim, (23), is exciting: it denies that there are numbers.

Returning to relativized existence claims, like a claim of the form
(10) The Eiffel Tower and $\aleph_0$ exist in different senses,
I don't really see how making sense of such a claim requires anything other than working with many-sorted logic, where the sorts are thought of as having some deep metaphysical significance. For example, the assumed significance might involve a Platonic theory of Being vs. Becoming, and then we might take (10) to be based on an assumption like
(24) The Eiffel Tower belongs to the world of Becoming, while $\aleph_0$ belongs to the world of Being.
One would need to be careful about trying to make this kind of approach work with a 1-sorted logic, for example using a pair of quantifiers $\exists_1$ and $\exists_2$, as a famous argument shows that an assertion of existence-in-sense 1 is logically equivalent to an assertion of existence-in-sense 2:
$\vdash \exists_1 x \phi(x) \leftrightarrow \exists_2 x \phi(x)$.
Proof. Suppose $\exists_1 x \phi(x)$. Skolemize, to give $\phi(t)$, where $t$ is a skolem constant. By Existential generalization, $\exists_2 x \phi(x)$. So, $\exists_1 x \phi(x) \to \exists_2 x \phi(x)$. Similarly in the other direction.

I believe that Kurt Gödel says somewhere that no sense can be made of relativizing existence itself, and Quine also makes a similar point in various writings.

Friday, 13 June 2014

Metaphysics as Über-theory and Metaphysics as Meta-theory, II

Though it is common for logicians to be a bit negative about metaphysics, I am very fond of metaphysics. I can trace the reason: I purchased a scruffy copy of W.V. Quine's From a Logical Point of View (2nd ed., 1961) from a second-hand shop in Hay-on-Wye, around 1987, containing Quine's essays -- "On what there is" and other essays on related themes, such as modality, reference, opacity, etc. I found "On what there is" so engrossing that I numbered each paragraph and learnt it by heart. A few years ago, I lent this copy to a close friend, but it was never returned (aleha hashalom).

In an older post, and to some extent tongue-in-cheek responding to some criticisms of analytic metaphysics, I listed a number of achievements in analytic metaphysics. Analytic metaphysics is so closely related to mathematics that one might simply confuse the two, but this is an error. There is massive overlap between analytic metaphysics and mathematics. This is why some responded to the list of achievements of metaphysics by saying "is this not just mathematics?". Well, they overlap, and when X and Y overlap, then by saying (truly) that something is X, one does not establish that it is not Y. Sometimes, the criticism of "analytic" metaphysics, as opposed to "naturalized" metaphysics, is ad hominem, directed not so much at analytic metaphysics, but rather at analytic metaphysicians; they are theorists and not experimentalists, and they are bad theorists, because their knowledge of (empirical) science does not go beyond "A Level Chemistry". The important criticisms I see are: an epistemological criticism (how might knowledge of the relevant kind even be possible, entirely by a priori "armchair" reasoning?); a competence criticism ("A-Level chemistry"); an irrelevance criticism ("what a waste of time"). I don't know the answer to the first, but then no one knows how mathematical knowledge is possible, and yet mathematical knowledge exists.

It's fair to say that there is more weight in the "competence criticism" of some modern metaphysicians, as one might call it. By and large, David Lewis tends to have a very classical picture of the (actual!!) world, with "lumps of stuff" at spacetime points (and regions), and perhaps the criticisms made against this are fair. However, one must be careful about stones and glass houses. There is some physics in Every Thing Must Go but not much: for example, no detailed computation of the electronic orbitals of a hydrogen atom using separation of variables in the Schroedinger equation, or of the Schwarzschild metric in GR, or of the properties of gases, or calculations of Clebsch-Gordan coefficients, etc. And what there is there includes a mistaken formulation of Ehrenfest's Theorem, as explained here: in the book, the equation given (twice) has the quantum inner product brackets (viz., expressions of the form $\langle \psi \mid \hat{O} \mid \psi \rangle$) misplaced in the equation.

But the basic point here is still unfair to those criticized by the "competence criticism", even if there's some legitimacy to the criticism. It is extremely difficult for someone whose specialization is metaphysics - but who has not studied, say, theoretical physics or mathematics to graduate level - to acquire a detailed understanding of what can, and cannot, be said fruitfully about, say, the (alleged) implications of QM or GR. As an example of this, there's an (unpublished) article on Leibniz equivalence, and I have given it as a talk perhaps six times now, to physicists and philosophers of physics; the audience response is this: 50% say it's obviously wrong and 50% say it's obviously right.

Since Frege, Russell, Wittgenstein, Carnap et al. were not experimentalists, it must be that whatever progress they made, if any at all, they must have made as theorists, and yes, in their armchair (or deckchair, for Wittgenstein). It is hard to see how an experiment might help me understand, for example, the semantics of sentences about fictional objects or possible worlds or transfinite cardinals. I may analyse the semantic content of, say, "Scott is the author of Waverley" or "I buttered the toast with a knife"; or I may try to analyse the Dirac equation. Or I may analyse, "You are almost as interesting as Sherlock Holmes is", in which a real-life person is compared with a fictional character. Consequently, "analytic" refers to a method, not to any specific content. The content is unconstrained: it may be possible worlds, fictional objects, moral values, topological field theories, transfinite sets, the unit of selection debate, etc., etc.

In an older M-Phi post, Metaphysics as Über-theory and Metaphysics as Meta-theory I suggested one could think of metaphysics in two ways:
  • Metaphysics as über-theory (think - Plato).
  • Metaphysics as meta-theory (think - Aristotle).
One could probably run through any piece of work classifiable as "metaphysics" and identify which bits are über-theoretic and which bits are meta-theoretic. It is Pythagorean über-theory that "all things are numbers" and it is Aristotelian meta-theory to say that "to say of what is, that it is, is true". So, as I want to use this word, in über-theory, one attempts an overall picture of "how things ... hang together". That is,
"The aim of philosophy, abstractly formulated, is to understand how things in the broadest possible sense of the term hang together in the broadest possible sense of the term" (Sellars, 1962, "Philosophy and the Scientific Image of Man")
Sellars says this of philosophy in general, but I think it is an overestimate. Philosophers should be able to work on small problems without feeling the intellectual burden, a somewhat pretentious one too, of trying to understand how "things hang together". For example, I don't think Russell's "On Denoting" (1905) fits this picture at all, but I do think his Principles of Mathematics (1903) or "Philosophy of Logical Atomism" (1918-19) do.

For example, when Prof. Max Tegmark suggests that physics is ultimately mathematics, then that claim is an example of über-theory. If I reply that this somehow neglects the modal contingency of concrete entities, that "how the concreta are" changes from world to world, and that rather it is mathematics that is ultimately physics -- the physics of modally invariant objects -- then that's über-theory too. If you like, mathematics is the physics of modality. The idea is that purely abstract entities, like $\pi$, $\aleph_0$ and $SU(3)$, don't modally change their relationships to one another as we let the worlds vary. There is no world "in" which $3 < 2$, $(\omega, <)$ is not well-ordered, or $e^{i \pi} + 1 \neq 0$. Purely abstract objects are not even "in" possible worlds at all. The distinction between physics and mathematics is not, I think, connected to how knowledge of their objects is acquired, but is connected to the fact that relations amongst purely abstract entities (e.g., how $\omega$ is related to any $n \in \omega$) are fixed and invariant ("Being" in Plato's terminology), whereas the relations of concreta, such as e.g., Blackpool Tower and the Eiffel Tower, are a matter of change ("Becoming", in Plato's terminology). For example, at the moment, the Eiffel Tower (a concretum) is higher than the Blackpool Tower (a concretum), but this temporary French advantage over the British could, of course, be remedied by an "accident" (Team America: World Police, 4:15).

Russell's Principles of Mathematics contains a great deal of über-theory and meta-theory, but his "On Denoting" is a classic of meta-theory. Meta-theory was the central focus of Rudolf Carnap's Der logische Aufbau (1928). The logical apparatus for doing meta-theory had blossomed with the publication of Frege's Begriffsschrift (1879) and then was amplified in his later writings on semantics and applied to the case of the foundations of arithmetic in Die Grundlagen der Arithmetik (1884). Bertrand Russell joined this revolution against German idealism in 1899, after attending a conference at which Peano was present. Russell's friend, G.E. Moore, was part of this rebellion too, although not a logician. A decade later, on the advice of Frege, a young Austrian, Ludwig Wittgenstein, spent a year and a half visiting Russell at Trinity College. In general, and in practice, all published work in metaphysics does both. Meta-theoretic work in metaphysics has no serious objection to it, aside perhaps from mild "competence" accusations of insufficient expertise in difficult parts of, let's say, mathematical logic or theoretical physics. This work appears alongside the work of other logicians, mathematicians and computer scientists (and sometimes cognitive scientists), often in the same journals. Aside from the usual internecine waffle and squabbles -- e.g., about one's favourite "logic", etc. -- there is no deep disagreement as to methods and also as to the genuine progress that is made. For example, Michael Clark once showed me, on a blackboard in 2000, a paradox, involving an infinite list of sentences, and I was struck by how one might make it precise. When I went home and did that, I discovered that the infinite set (simplifying notation quite a bit),
$\{Y(n) \leftrightarrow \forall x>n \neg T(Y(x)) \mid n \in \mathbb{N} \} \cup \{T(\phi) \leftrightarrow \phi \mid \phi \in L\}$ 
actually had a model: a non-standard model. That's progress and I did no "experiment".

With über-theory, it is different. For how can a priori reflection, from the armchair, tell us "how everything hangs together"? Surely, that task is for the empirical scientist, and, in the end, for the physicist. In other words, it seems utterly pretentious for a metaphysician to even insinuate some ability to discover "how things hang together". I agree. Well, sort of. There are two main lines of response. The first is that while it is true that the armchair metaphysician is not performing experiments on neutrinos or gravitational waves (and neither of course is the mathematician or theoretical scientist), the armchair metaphysician is going to have, and should be expected to have, some degree of knowledge and acquaintance with science - with mathematics, with formal parts of linguistics and computer science, with parts of physics, chemistry, biology and psychology (cognitive science, more broadly). But this is material for analysis. It is not therefore a direct attempt to find out how "everything hangs together", but an attempt to see how our best scientific theories (or even how our discourse in general) depict "how things hang together". A second response focuses on what these "things" might be in "how things hang together". The scientist may be interested in galaxies or ganglia; for the metaphysician, there also is a more or less canonical list of the kinds of things one is interested in: properties, relations, quantities, abstract entities and structures, formal systems, moral values, propositions, pieces of discourse, possible worlds and fictional entities.

Wednesday, 28 May 2014

How inaccurate is your total doxastic state?

I've written a lot on this blog about ways in which we might measure the inaccuracy of an agent when she has precise numerical credences in propositions.  I've tried to describe the various ways in which philosophers have tried to use such measures to help argue for different principles of rationality that govern these credences.  For instance, Jim Joyce has argued that credences should satisfy the axioms of the probability calculus because any non-probabilistic credences are accuracy-dominated by probabilistic credences: that is, if $c$ is a non-probabilistic credence function, there is a probabilistic credence function $c^*$ such that $c^*$ is guaranteed to be more accurate than $c$.

Of course much of the epistemological literature is concerned with agents who have quite different sorts of doxastic attitudes.  It is concerned with agents who have not credences, which we might think of as partial beliefs, but rather agents who have full or all-or-nothing or categorical beliefs.  One might wonder whether we can also describe ways of measuring the inaccuracy of these doxastic attitudes.  It turns out that we can.  The principles of rationality that follow have been investigated by (amongst others) Hempel, Maher, Easwaran, and Fitelson.  I'll describe some of the inaccuracy measures below.

This raises a question.  Suppose you think that credences and full beliefs are both genuine doxastic attitudes, neither of which can be reduced to the other.  Then it is natural to think that the inaccuracy of one's total doxastic state is the sum of the inaccuracy of the credal part and the inaccuracy of the full belief part.  Now suppose that you think that, while neither sort of attitude can be reduced to the other, there is a tight connection between them for rational believers.  Indeed, you accept a normative version of the Lockean thesis: that is, you say that an agent should have a belief in $p$ iff her credence in $p$ is at least $t$ (for some threshold $0.5 < t \leq 1$) and she should have a disbelief in $p$ iff her credence in $p$ is at most $1-t$.  Then it turns out that something rather unfortunate happens.  Joyce's accuracy dominance argument for probabilism described above fails.  It now turns out that there are non-probabilistic credence functions with the following properties: while they are accuracy-dominated, the rational total doxastic state that they generate via the normative Lockean thesis -- that is, the total doxastic state that includes those credences together with the full beliefs or disbeliefs that the normative Lockean thesis demands -- is not accuracy-dominated by any other total doxastic state that satisfies the normative Lockean thesis.

Let's see how this happens.  We need three ingredients:

Inaccuracy for credences

The inaccuracy of a credence $x$ in proposition $X$ at world $w$ is given by the quadratic scoring rule:
$$i(x, w) = \left\{ \begin{array}{ll}
(1-x)^2 & \mbox{if $X$ is true at $w$} \\
x^2 & \mbox{if $X$ is false at $w$}
\end{array} \right.$$
Suppose $c = \{c_1, \ldots, c_n\}$ is a set of credences on a set of propositions $\mathbf{F} = \{X_1, \ldots, X_n\}$.  The inaccuracy of the whole credence function is given as follows:
$$I(c, w) = \sum_k i(c_k, w)$$

Inaccuracy for beliefs

Suppose $\mathbf{B} = \{b_1, \ldots, b_n\}$ is a set of beliefs, disbeliefs and suspensions on a set of propositions $\mathbf{F} = \{X_1, \ldots, X_n\}$.  Thus, each $b_k$ is either a belief in $X_k$ (denoted $B(X_k)$), a disbelief in $X_k$ (denoted $D(X_k)$), or a suspension of judgment in $X_k$ (denoted $S(X_k)$).  Then the inaccuracy of attitude $b$ in proposition $X$ at world $w$ is given as follows: there is a reward $R$ for a true belief or a false disbelief; there is a penalty $W$ for a false belief or a true disbelief; and suspensions receive neither penalty nor reward regardless of the truth of the proposition in question.  We assume $R, W > 0$.  Since we are interested in measuring inaccuracy rather than accuracy, the reward then makes a negative contribution to inaccuracy and the penalty makes a positive contribution. Thus:
$$i(B(X), w) = \left\{ \begin{array}{ll}
-R & \mbox{if $X$ is true at $w$} \\
W & \mbox{if $X$ is false at $w$}
\end{array} \right.$$
$$i(S(X), w) = \left\{ \begin{array}{ll}
0 & \mbox{if $X$ is true at $w$} \\
0 & \mbox{if $X$ is false at $w$}
\end{array} \right.$$
$$i(D(X), w) = \left\{ \begin{array}{ll}
W & \mbox{if $X$ is true at $w$} \\
-R & \mbox{if $X$ is false at $w$}
\end{array} \right.$$
This then generates an inaccuracy measure on a set of beliefs $\mathbf{B}$ as follows:
$$I(\mathbf{B}, w) = \sum_k i(b_k, w)$$
Hempel noticed that, if $R = W$ and $p$ is a probability function, then: $B(X)$ uniquely maximises expected utility by the lights of $p$ iff $p(X) > 0.5$; $D(X)$ uniquely maximises expected utility by the lights of $p$ iff $p(X) < 0.5$; $S(X)$ maximises expected utility iff $p(X) = 0.5$, but in that situation, $B(X)$ and $D(X)$ do too.  Easwaran has investigated what happens if $R \neq W$.
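Hempel's observation can be checked with a short calculation. As a sketch, write $p$ for $p(X)$ and take the expectation of each attitude's inaccuracy as defined above, reading lower expected inaccuracy as higher expected utility:

```latex
\begin{aligned}
E_p[i(B(X))] &= -R\,p + W(1-p) \\
E_p[i(D(X))] &= W\,p - R(1-p) \\
E_p[i(S(X))] &= 0 \\[4pt]
\text{With } R = W:\quad
E_p[i(B(X))] &= W(1-2p), \qquad E_p[i(D(X))] = W(2p-1)
\end{aligned}
```

With $R = W$, belief does strictly best iff $p > 0.5$, disbelief does strictly best iff $p < 0.5$, and all three attitudes tie at $p = 0.5$. Without that assumption, the first line shows that belief does better than suspension exactly when $p > W/(R+W)$.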

Lockean thesis

For some $0.5 < t \leq 1$:
  • A rational agent has a belief in $X$ iff $c(X) \geq t$;
  • A rational agent has a disbelief in $X$ iff $c(X) \leq 1-t$;
  • A rational agent suspends judgment in $X$ iff $1-t < c(X) < t$.

Inaccuracy for total doxastic state

We can now put these three ingredients together to give an inaccuracy measure for a total doxastic state that satisfies the normative Lockean thesis.  We state the measure as a measure of the inaccuracy of a credence $x$ in proposition $X$ at world $w$, since any total doxastic state that satisfies the normative Lockean thesis is completely determined by the credal part.
$$i_t(x, w) = \left\{ \begin{array}{ll}
(1-x)^2 - R & \mbox{if } t \leq x \leq 1 \mbox{ and } X \mbox{ is true} \\
(1-x)^2  & \mbox{if } 1-t < x < t \mbox{ and } X \mbox{ is true} \\
(1-x)^2 + W & \mbox{if } 0 \leq x \leq 1-t \mbox{ and } X \mbox{ is true} \\
x^2 + W & \mbox{if } t \leq x \leq 1 \mbox{ and } X \mbox{ is false} \\
x^2  & \mbox{if } 1-t < x < t \mbox{ and } X \mbox{ is false} \\
x^2 - R & \mbox{if } 0 \leq x \leq 1-t \mbox{ and } X \mbox{ is false}
\end{array} \right.$$
Finally, we give the total inaccuracy of such a doxastic state:
$$I_t(c, w) = \sum_k i_t(c_k, w)$$
Three things are interesting about this inaccuracy measure.  First, unlike the inaccuracy measures we usually deal with, it's discontinuous.  The inaccuracy of $x$ in $X$ is discontinuous at $t$ and at $1-t$.  If $X$ is true, this is because, as $x$ crosses the Lockean threshold $t$, it gives rise to a true belief, whose reward contributes negatively to the inaccuracy; and as it crosses the other Lockean threshold $1-t$, it gives rise to a true disbelief, whose penalty contributes positively to the inaccuracy.

Second, the measure is proper.  That is, each probabilistic set of credences expects itself to be amongst the least inaccurate.

Third, as mentioned above, there are non-probabilistic credence functions that are not accuracy-dominated when inaccuracy is measured by $I_t$.  Consider the following example. 
  • $\mathbf{F} = \{X, \neg X\}$.  That is, our agent has credences only in two propositions.
  • $c(X) = 0.6$ and $c(\neg X) = 0.5$.
  • $R = 0.4$, $W = 0.6$.  That is, the penalty for a false belief or true disbelief is fifty percent higher than the reward for a true belief.
  • $t = 0.6$.  That is, a rational agent has a belief in $X$ iff her credence is at least 0.6; and she has a disbelief in $X$ iff her credence is at most 0.4.  It's worth noting that, for probabilistic agents who specify $R$ and $W$ as we just have, satisfying the Lockean thesis with $t = 0.6$ will always minimize expected inaccuracy.
Then we have the following result:  There is no total doxastic state that satisfies the Lockean thesis that $I_t$-dominates $c$.

The following figure helps us to see why.

Here, we plot the possible credence functions on $\mathbf{F} = \{X, \neg X\}$ on the unit square.  The dotted lines represent the Lockean thresholds: a belief threshold for $X$ and a disbelief threshold for $X$; and similarly for $\neg X$.  The undotted diagonal line includes all the probabilistically coherent credence functions; that is, those for which the credence in $X$ and the credence in $\neg X$ sum to 1.  $c$ is the credence function described above.  It is probabilistically incoherent.  The lower right-hand arc includes all the possible credence functions that are exactly as inaccurate as $c$ when $X$ is true and inaccuracy is measured by $I$.  The upper left-hand arc includes all the possible credence functions that are exactly as inaccurate as $c$ when $\neg X$ is true and inaccuracy is measured by $I$.

Note that, in line with Joyce's accuracy-domination argument for probabilism, $c$ is $I$-dominated.  It is $I$-dominated by all of the credence functions that lie between the two arcs.  Some of these -- namely, those that also lie on the diagonal line -- are not themselves $I$-dominated.  This seems to rule out $c$ as irrational.  But of course, when we are considering not only the inaccuracy of $c$ but also the inaccuracy of the beliefs and disbeliefs to which $c$ gives rise in line with the Lockean thesis, our measure of inaccuracy is $I_t$, not $I$.  Notice that none of the credence functions that $I$-dominate $c$ also $I_t$-dominates it.  The reason is that every such credence function assigns $X$ a credence less than 0.6.  Thus, none of them gives rise to a full belief in $X$.  As a result, when $X$ is true, the decrease in $I$ that is obtained by moving to one of these does not exceed $R$, which is the accuracy 'boost' obtained by having the true belief in $X$ to which $c$ gives rise.  By checking cases, we can see further that no other credence function $I_t$-dominates $c$.
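
The case-checking just described can be sketched numerically.  What follows is a minimal sketch and not a proof, and it rests on assumptions that are mine rather than stated in this part of the post: that $I$ is the Brier score; that $I_t$ adds $-R$ to $I$ for each belief in a truth or disbelief in a falsehood and $+W$ for each belief in a falsehood or disbelief in a truth; and that a grid of credences with step 0.01 is fine enough to be suggestive.  All function names are hypothetical.

```python
from fractions import Fraction

# Values from the example; exact fractions avoid rounding at the thresholds.
R, W, T = Fraction(4, 10), Fraction(6, 10), Fraction(6, 10)

def brier(c, x_true):
    """Assumed form of I: squared distance of c = (cr(X), cr(not-X)) from the truth."""
    x, y = c
    vx, vy = (1, 0) if x_true else (0, 1)
    return (vx - x) ** 2 + (vy - y) ** 2

def attitude_score(credence, prop_true):
    """Reward or penalty for the belief/disbelief a credence gives rise to."""
    if credence >= T:            # Lockean belief
        return -R if prop_true else W
    if credence <= 1 - T:        # Lockean disbelief
        return W if prop_true else -R
    return Fraction(0)           # neither belief nor disbelief

def lockean(c, x_true):
    """Assumed form of I_t: Brier score plus the scores of the induced attitudes."""
    x, y = c
    return brier(c, x_true) + attitude_score(x, x_true) + attitude_score(y, not x_true)

def dominates(measure, a, b):
    """a is no more inaccurate than b in either world, and strictly less in one."""
    pairs = [(measure(a, w), measure(b, w)) for w in (True, False)]
    return all(p <= q for p, q in pairs) and any(p < q for p, q in pairs)

c = (Fraction(6, 10), Fraction(5, 10))
grid = [(Fraction(i, 100), Fraction(j, 100)) for i in range(101) for j in range(101)]

i_dominators = [p for p in grid if dominates(brier, p, c)]
it_dominators = [p for p in grid if dominates(lockean, p, c)]

print(len(i_dominators) > 0)                # True: some grid points I-dominate c
print(max(p[0] for p in i_dominators) < T)  # True: each has credence in X below t
print(it_dominators)                        # []: no grid point I_t-dominates c
```

Exact fractions matter here.  Under these assumptions the point $(0.5, 0.4)$ ties with $c$ exactly in both worlds, and floating-point rounding could misreport that tie as domination.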

Is this a problem?  That depends on whether one takes credences and beliefs to be two separate but related doxastic states.  If one does, and if one accepts further that the Lockean thesis describes the way in which they are related, then $I_t$ seems the natural way to measure the inaccuracy of the total doxastic state that arises when both are present.  But then one loses the accuracy-domination argument for probabilism.  However, one might avoid this conclusion if one were to say that, really, there are only credence functions; and that beliefs, to the extent they exist at all, are reducible to credences.  That is, if one were to take the Lockean thesis to be a reductionist claim rather than a normative claim, it would seem natural to measure the inaccuracy of a credence function using $I$ instead of $I_t$.  While one would still say that, as a credence in $X$ moves across the Lockean threshold for belief, it gives rise to a new belief, it would no longer seem right to think that this discontinuous change in doxastic state should give rise to a discontinuous change in inaccuracy; for the new belief is not a genuinely new doxastic state; it is rather a way of classifying the credal state.