They say that a little learning can be a dangerous thing. But surely that only applies to me if I am irrational? If I think I won’t change my beliefs in response to my learning in the way rationality requires, or I think I won’t act on the new beliefs I have in a rational way, I may well think it worse for me to acquire that learning than to remain in ignorance. If, for instance, a little learning about a political topic will make me much more confident than I should be in certain claims about that topic, I might anticipate that, after I gain that little learning, I’ll make poor decisions. But surely if I think I’ll respond rationally to the new learning, and then act upon my new beliefs in the way rationality requires, then I’ll think now that the learning is a good thing; something to be hoped for. Is that not so? Work on the value of information over the past forty years has given us reason to say ‘no’. In this post, I’d like to set out a puzzle that arises out of this. In future posts, I’ll explore my own favoured solution.

I’ll start with a brief presentation of how certain philosophers and economists have thought about the value of information (a much longer treatment of this can be found in Part I of my notes on this topic); then I’ll present the sorts of cases that have led these researchers to think that not all learning is desirable, even for perfectly rational individuals; from this, I’ll draw out a puzzle and show why one natural response doesn’t work.
The value of information
It is always difficult to identify the precise origin of an intellectual discovery, partly because it depends on what you think the discovery’s core insight is and which later developments were merely accretions deposited on it, but this recent paper by Christian Torsell of UC Irvine convinced me that, at least in published form, the origin of the crucial insight behind the value of information framework lies in a 1931 paper by Janina Hosiasson called ‘Why do we prefer probabilities relative to many data?’. I won’t dive into the history here, but rather present the version of this framework that has become accepted in the nearly one hundred years since Hosiasson’s paper.
We represent someone who might be about to learn something by giving their degrees of belief (or credences) in a set of possibilities; we might think of these possibilities as possible states of the world, or what philosophers simply call possible worlds. And we represent the learning experience they might undergo by what Nilanjan Das calls an evidence function: this takes each possibility the person considers and specifies the strongest proposition they’d learn as evidence at that possibility as a result of that learning experience.
So, for instance, I might consider four possibilities about the temperature this afternoon in Bristol: Very Hot, Hot, Cold, Very Cold. And I might place credences over these: perhaps 10%, 20%, 30%, and 40%, respectively (it’s summer here, after all).
Now, let us suppose that I could learn whether or not the temperature will be extreme: that is, if it’s going to be very hot, I’ll learn that it’s going to be either very hot or very cold, and similarly if it’s going to be very cold; while if it’s going to be hot, I’ll learn that it’s going to be either hot or cold, and similarly if it’s going to be cold.
Whatever I learn, I’ll update in the way the Bayesian says I should. If I learn it’s very hot or very cold, I’ll come to assign 0% credence to Hot and to Cold, and I’ll assign 10/(10+40) = 20% credence to Very Hot and 40/(10+40) = 80% credence to Very Cold. If, on the other hand, I learn it’s hot or cold, I’ll assign 0% credence to Very Hot and to Very Cold, and I’ll assign 20/(20+30) = 40% credence to Hot and 30/(20+30) = 60% credence to Cold.
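In code, the update just described looks like this. It’s only a minimal sketch of Bayesian conditioning for the example above; the possibility labels are my own.

```python
# A minimal sketch of the update just described, assuming the priors and
# evidence function of the Bristol temperature example (labels are mine).

priors = {"very_hot": 0.10, "hot": 0.20, "cold": 0.30, "very_cold": 0.40}

def condition(credences, proposition):
    """Bayesian conditioning: zero out the possibilities outside the learned
    proposition and renormalise the rest."""
    total = sum(credences[w] for w in proposition)
    return {w: (credences[w] / total if w in proposition else 0.0)
            for w in credences}

extreme = {"very_hot", "very_cold"}   # what I learn if the temperature is extreme
moderate = {"hot", "cold"}            # what I learn otherwise

print(condition(priors, extreme))     # Very Hot 20%, Very Cold 80%
print(condition(priors, moderate))    # Hot 40%, Cold 60%
```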
The question is: should I hope to learn this evidence? Is that something that I now judge, from the point of view of my prior credences, to be a desirable thing?
The value of information theorems
There are two ways to approach this question: the first is pragmatic; the second purely epistemic. Let’s take them in turn.
The Pragmatic Value of Information Theorem
One thing that I’ll do with my credences, whether I learn the evidence and update or I don’t and stick with my priors, is to use them as guides to action. So let us suppose that, a little later than the time at which I might acquire the evidence, I’ll face a decision between a range of options—perhaps it’s the decision whether to take a sun hat or sun screen or both or neither when I leave the house for the day. And suppose further that I’ll pick one of those options that maximizes expected utility relative to the credences I have at the time, namely, my priors if I don’t learn, and whichever posterior I’ll have if I do learn. Then we can say that the value of learning the evidence at a particular state of the world is the utility I’d obtain at that state of the world from whichever option I’d choose using the credences I’d have at that state of the world, which are of course the posteriors obtained from my priors by updating on whatever evidence I’ll get at that state of the world. And we can say that the value of not learning the evidence at a particular state of the world is the utility I’d obtain at that state of the world from whichever option I’d choose using the credences I’d have at that state of the world, which are of course just the prior credences.
Now, having defined the value of each of the two options available—namely, learning the evidence and not learning it—we can calculate, from the point of view of my priors, what the expected utility for each is. And it turns out that, in the case we’ve described, whatever utilities I assign to the different options at the different states of the world, my prior will assign at least as great expected utility to learning the evidence as to not learning it; and, roughly speaking, providing there’s some state of the world such that the evidence I’d acquire there would change my mind about which option to pick, then my prior assigns strictly greater expected utility to learning the evidence than it assigns to not learning it. And so, at least providing that obtaining the evidence is cost-free, I’d be irrational not to take it. This is sometimes known as Good’s Value of Information Theorem, though Good himself acknowledges that it isn’t original to him, and as I said earlier, it’s really Hosiasson who has the original insight. I’ll call it the Pragmatic Value of Information Theorem.
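Here’s a sketch of the pragmatic comparison for the temperature example. The decision and its payoffs are invented purely for illustration, not taken from the discussion above: a £4 bet that returns £10 if the temperature turns out to be extreme, versus declining the bet.

```python
# A sketch of the pragmatic comparison for the temperature example. The
# decision and its payoffs are hypothetical: pay £4 for a bet that returns
# £10 if the temperature is extreme, or decline the bet.

priors = {"very_hot": 0.10, "hot": 0.20, "cold": 0.30, "very_cold": 0.40}
evidence = {"very_hot": {"very_hot", "very_cold"},
            "very_cold": {"very_hot", "very_cold"},
            "hot": {"hot", "cold"},
            "cold": {"hot", "cold"}}
extreme = {"very_hot", "very_cold"}

# utility[option][world]: net payoff in pounds (hypothetical numbers).
utility = {"take_bet": {w: (10 - 4 if w in extreme else -4) for w in priors},
           "decline": {w: 0 for w in priors}}

def condition(credences, proposition):
    total = sum(credences[w] for w in proposition)
    return {w: (credences[w] / total if w in proposition else 0.0) for w in credences}

def best_option(credences):
    # Pick an option that maximises expected utility by the lights of these credences.
    return max(utility, key=lambda o: sum(credences[w] * utility[o][w] for w in credences))

# Value of not learning: in every state of the world, act on the prior.
prior_choice = best_option(priors)
value_not_learning = sum(priors[w] * utility[prior_choice][w] for w in priors)

# Value of learning: in each state, act on the posterior you'd have there.
value_learning = sum(priors[w] * utility[best_option(condition(priors, evidence[w]))][w]
                     for w in priors)

print(value_not_learning, value_learning)   # roughly 1.0 vs 3.0: learning pays in expectation
```

With this particular decision the evidence would change your mind in the moderate-temperature states, so the theorem tells us, and the numbers confirm, that learning is strictly better in expectation.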
The Epistemic Value of Information Theorem
Let’s now turn to the epistemic approach to whether or not we should hope to acquire the evidence. This is due to Graham Oddie. The idea is that, while our credences do indeed guide our actions, and while we might measure their pragmatic value as the utility they’ll acquire for us through the decisions they guide us to make, they also have purely epistemic utility. We then provide a measure of epistemic utility, which takes an assignment of credences and a state of the world and says how valuable those credences are, from a purely epistemic point of view, at that state of the world. If you’re a veritist, then you’ll say the epistemic utility of some credences is a measure of their accuracy; but we won’t assume that. We’ll assume only that the measure of epistemic utility is strictly proper: that is, each set of credences that satisfies the probability axioms assigns greatest expected epistemic utility to itself; in other words, it assigns lower expected epistemic utility to any other set of credences than it assigns to itself.
Now, the epistemic utility of not learning the evidence is just the epistemic utility of the prior credences themselves, since those are the ones you’ll continue to have if you don’t learn the evidence. And the epistemic utility, at a state of the world, of learning the evidence is the epistemic utility, at that world, of the posterior credences that are obtained from your priors by updating them on whatever evidence you receive at that world. And now we can compare the expected epistemic utility of learning and not learning from the point of view of your prior credences. And, Oddie shows, in the case we’ve described, learning has at least as great expected epistemic utility as not learning, regardless of which strictly proper measure of epistemic utility you use; and again, roughly speaking, if the evidence will lead you to change your credences—as it almost certainly will—then learning has strictly greater expected epistemic utility than not learning. Let’s call this Oddie’s Epistemic Value of Information Theorem.
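Here’s a sketch of Oddie’s comparison for the temperature example, using one strictly proper measure of epistemic utility (the Brier score, which reappears later in the post); the encoding of the example is mine.

```python
# A sketch of the epistemic comparison for the temperature example, using
# the Brier score as the strictly proper measure of epistemic utility.

priors = {"very_hot": 0.10, "hot": 0.20, "cold": 0.30, "very_cold": 0.40}
evidence = {"very_hot": {"very_hot", "very_cold"},
            "very_cold": {"very_hot", "very_cold"},
            "hot": {"hot", "cold"},
            "cold": {"hot", "cold"}}

def condition(credences, proposition):
    total = sum(credences[w] for w in proposition)
    return {w: (credences[w] / total if w in proposition else 0.0) for w in credences}

def brier(credences, true_world):
    # Epistemic utility of these credences at the world where true_world obtains.
    return 2 * credences[true_world] - sum(c ** 2 for c in credences.values())

exp_not_learning = sum(priors[w] * brier(priors, w) for w in priors)
exp_learning = sum(priors[w] * brier(condition(priors, evidence[w]), w) for w in priors)

print(exp_not_learning, exp_learning)   # roughly 0.3 vs 0.6: learning does better in expectation
```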
Now, you might be forgiven for thinking that, together, the Pragmatic and Epistemic Value of Information Theorems vindicate the intuitive judgment I denied above, namely, that, providing you’re responding rationally to your evidence and acting rationally upon your credences, a little learning is never a bad thing, at least from your own point of view. But the problem is that the Pragmatic and Epistemic Value of Information Theorems make quite specific assumptions about the nature of the evidence you might get. The evidence functions involved must be both factive and partitional. That means, first, that at each state of the world, the evidence you’d obtain there is true at that state of the world; and, second, that for any two states of the world, the evidence obtained at one is either exactly the same as the evidence obtained at the other, or it is fully inconsistent with the evidence obtained at the other, so that there is no way both can be true. We can see that this is satisfied in the case of the Bristol afternoon temperature above: the proposition learned at Very Hot is the same as the proposition learned at Very Cold; the proposition learned at Hot is the same as the proposition learned at Cold; and those two propositions are inconsistent with one another.
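To make the two conditions vivid, here is a small check, under my own encoding of the temperature example, that its evidence function is factive and partitional in the sense just defined.

```python
# Check that the temperature example's evidence function is factive and
# partitional (encoding is mine).

evidence = {"very_hot": frozenset({"very_hot", "very_cold"}),
            "very_cold": frozenset({"very_hot", "very_cold"}),
            "hot": frozenset({"hot", "cold"}),
            "cold": frozenset({"hot", "cold"})}

def is_factive(ev):
    # At each state, the proposition you'd learn there is true at that state.
    return all(w in ev[w] for w in ev)

def is_partitional(ev):
    # Any two propositions you might learn are either identical or disjoint.
    props = set(ev.values())
    return all(p == q or p.isdisjoint(q) for p in props for q in props)

print(is_factive(evidence), is_partitional(evidence))   # True True
```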
So what happens if our evidence isn’t like this? Well, John Geanakoplos described weaker conditions that still guarantee that the Value of Information theorems go through; and Kevin Dorst et al. specify conditions in a different way. But if the evidence doesn’t satisfy even these weaker conditions, then there can be cases in which learning has lower expected pragmatic utility than not learning, when faced with a specific decision problem; and there can be cases in which learning has lower expected epistemic utility than not learning.
I won’t go into the general results (though again they’re summarised in these notes, and Dorst et al.’s original paper is well worth the read). Instead, I want to look at a couple of examples to raise the problem that interests me, which is: what should you do if you receive evidence that you’d have preferred not to receive?
Discriminating colours
Let’s suppose I’ve just been talking to you on the telephone and I’ve been telling you all about a scarf I’ve just bought. Despite my description, you’re still unsure of its exact colour. You know it’s either rose or peach or pink, and from what I’ve said, you’ve come to assign credences of 25%, 40%, and 35% to these three possibilities, respectively.
Later today, we meet for coffee and I ask if you’d like to see the scarf. Unfortunately, while your colour vision is good, the lighting in the coffee shop is poor. You know you’ll find it hard to differentiate rose from peach and peach from pink in these conditions; what’s more, you know the difficulties will take a particular form. If the scarf is actually either rose or peach, then your evidence will be that it’s either rose or peach, but nothing more specific—so you’ll be able to tell it’s not pink, but you won’t be able to tell which of rose or peach it is. But if it is actually pink, your evidence will be that it’s either peach or pink, and nothing more specific—so if it’s pink, you will be able to rule out that it’s rose, but nothing more than that.
Notice that this learning situation provides evidence that is factive, but not partitional: after all, Rose or Peach is not inconsistent with Peach or Pink, for both are true if the scarf is peach. So it doesn’t satisfy the conditions required for the original Value of Information theorems; and in fact it doesn’t satisfy Geanakoplos’s weaker conditions either.
Now consider what happens if you update on the evidence you might receive. If the scarf is rose or peach and you learn Rose or Peach, you’ll come to assign credence 5/13 to Rose, 8/13 to Peach, and 0 to Pink. And if the scarf is pink and you learn Peach or Pink, you’ll assign credence 0 to Rose, 8/15 to Peach, and 7/15 to Pink.
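Here is the same calculation as a quick sketch, using exact fractions (the colour labels are mine).

```python
# The scarf posteriors, computed by conditioning the priors on the evidence
# described above (exact fractions; colour labels are mine).

from fractions import Fraction

priors = {"rose": Fraction(25, 100), "peach": Fraction(40, 100), "pink": Fraction(35, 100)}
evidence = {"rose": {"rose", "peach"},
            "peach": {"rose", "peach"},
            "pink": {"peach", "pink"}}

def condition(credences, proposition):
    total = sum(credences[w] for w in proposition)
    return {w: (credences[w] / total if w in proposition else Fraction(0))
            for w in credences}

print(condition(priors, evidence["rose"]))   # rose 5/13, peach 8/13, pink 0
print(condition(priors, evidence["pink"]))   # rose 0, peach 8/15, pink 7/15
```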
What are the expected pragmatic and epistemic utilities of acquiring this evidence, from the point of view of your prior, and how do they compare to the expected pragmatic and epistemic utilities of avoiding it from the same point of view?
Well, the expected pragmatic utility of course depends on the decision you’ll face with the credences you end up with. I’ll give an example of a decision for which the expected pragmatic utility of learning is greater than the expected pragmatic utility of not learning; and then I’ll give an example in which the reverse is the case.
First, suppose you must decide whether or not to pay £2 for a bet that will pay out £7 if the scarf is pink and £0 otherwise. Then your priors will take the bet at that price. But your posteriors if the scarf is rose or peach will reject the bet, and you’ll end up with £0 in that case, which is better than losing £2, which is what your priors will lead you to do; and your posteriors if the scarf is pink will take the bet and win, gaining you £5, which is exactly as good as what your priors will lead you to get in this state of the world. And so, in expectation, learning is better than not learning since, in each possible state of the world, learning leads to credences that make choices that are, at that world, at least as good as and sometimes better than the choices the priors would make.
However, now suppose you must decide whether to pay £7 for a bet that will pay out £15 if the scarf is peach and £0 otherwise. Then your prior, which assigns credence 8/20 = 6/15 to Peach, will consider this too expensive, while your posterior credence in Peach, which will be either 8/13 or 8/15, will make it look like a good price. So your prior knows that your posterior will take the bet, an option that it considers worse than the option it would choose, namely, rejecting the bet. So the prior will take learning the evidence to have lower expected pragmatic utility than avoiding it.
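Here’s a sketch of both bets, computing how each set of credences values each bet at its stated price (exact fractions; the labels are mine).

```python
# The two bets from the text, evaluated by the prior and by each possible
# posterior: the expected net payoff of paying the stated price for each bet
# (a positive value means the bet looks worth its price).

from fractions import Fraction as F

credences = {
    "prior":                 {"rose": F(1, 4), "peach": F(2, 5), "pink": F(7, 20)},
    "posterior if not pink": {"rose": F(5, 13), "peach": F(8, 13), "pink": F(0)},
    "posterior if pink":     {"rose": F(0), "peach": F(8, 15), "pink": F(7, 15)},
}

def expected_net(c, price, prize, winning_colour):
    # Expected net payoff of paying `price` for a bet that pays `prize` if
    # the scarf is `winning_colour` and £0 otherwise.
    return c[winning_colour] * prize - price

for name, c in credences.items():
    print(name,
          expected_net(c, 2, 7, "pink"),     # Bet 1: £2 for £7 if pink
          expected_net(c, 7, 15, "peach"))   # Bet 2: £7 for £15 if peach
```

In particular, both possible posteriors take the second bet, so, from the prior’s point of view, facing that bet after learning is worth 15 × (2/5) − 7 = −£1, whereas declining it, which is what the prior itself would do, is worth £0.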
And something similar is true in the case of epistemic utility. Suppose we use the log score, which says that the epistemic utility of a set of credences over some possibilities, at the possibility that is in fact true, is the logarithm of the credence assigned to that true possibility: if c is the set of credences and w is the true possibility, EU(c, w) = log c(w).
Then your prior will assign higher expected epistemic utility to learning than to not learning.
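Here’s a sketch of that comparison, computing the expected log score of learning and of not learning from the prior’s point of view (the world labels are mine).

```python
# A sketch of the log-score comparison for the scarf case, from the prior's
# point of view.

from math import log

priors = {"rose": 0.25, "peach": 0.40, "pink": 0.35}
posterior_at = {"rose":  {"rose": 5/13, "peach": 8/13, "pink": 0.0},
                "peach": {"rose": 5/13, "peach": 8/13, "pink": 0.0},
                "pink":  {"rose": 0.0, "peach": 8/15, "pink": 7/15}}

def log_score(credences, true_world):
    # Epistemic utility = logarithm of the credence in the true possibility.
    return log(credences[true_world])

exp_not_learning = sum(priors[w] * log_score(priors, w) for w in priors)
exp_learning = sum(priors[w] * log_score(posterior_at[w], w) for w in priors)

print(exp_not_learning, exp_learning)   # roughly -1.08 vs -0.70: learning does better
```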
However, suppose instead that we use the Brier score, which says that the epistemic utility of a set of credences, at the true possibility, is twice the credence in the true possibility minus the sum of the squares of the credences in each possibility: EU(c, w) = 2c(w) - (c(w1)² + ... + c(wn)²), where w1, ..., wn are the possibilities.
Then your prior will assign higher expected epistemic utility to not learning than to learning.
A way out?
If you’re anything like me, when you first hear about this sort of case, you immediately think that the conclusion is an artefact of the way we’ve modelled the case—or, more precisely, the way we stopped modelling the case at some point. After all, in reality, if you know beforehand all this information about how you will and won’t be able to distinguish between the different colours, then after you’ve had the experience and updated your credences, surely you will be able to look at your posterior credences and glean from them some further information about your situation. For instance, if you look and see that you now assign credence 0 to Rose, then you can infer that the scarf must in fact be pink, because it’s only in that situation that you’re able to rule out Rose.
This is all well and good if your credences are accessible to you. But typically in this debate we allow that your credences are sometimes not accessible to you. And so you can’t get out of the pickle this easily.
The puzzle
The situations I’ve described above are interesting, but it’s not obvious there’s anything puzzling about them. After all, they just say that sometimes we should avoid gathering certain evidence because, from our current point of view, we judge that the credences we’ll end up adopting in response to it will serve us poorly, either as guides to action or in their purely epistemic capacity, whatever that is.
But I think there’s a bigger concern here. After all, while sometimes we choose what evidence to gather, sometimes we don’t. Sometimes, we simply receive evidence whether we like it or not. And, in such a case, where we would have preferred to avoid a particular learning situation but had it foisted upon us all the same, to which credences should we turn when we make our decisions or wish to represent the world? Our priors from before the unlooked-for learning experience? Or our posteriors that we’ve obtained by updating on the evidence we in fact received because of it? Perhaps, for instance, I unthinkingly take my scarf out of my bag in the coffee shop and pop it on the table. Now you’ve seen it, and have received the evidence you hoped to avoid. Which credences should you use? Your priors think you should use your priors; your posteriors think you should use your posteriors. What does rationality require?
The Principle of Total Evidence to the rescue?
A natural thought is that it is of course your posteriors you should use. After all, they are better informed than your priors; they incorporate evidence your priors lack. And indeed doesn’t the much-vaunted Principle of Total Evidence demand this of us? Doesn’t it say we should be guided by whichever credences incorporate the strongest evidence available to us?
I think there are two problems with this quick response. The first is that it was precisely the Principle of Total Evidence that Janina Hosiasson—and, later, I. J. Good—were attempting to establish using the Value of Information theorem. So it seems strange to appeal to that principle to guide us in cases in which that theorem fails to hold.
But the second is more serious. There are cases in which following the Principle of Total Evidence is very much not the thing to do. A classic case is Tim Williamson’s lovely example of an unmarked clock. Let’s suppose you know that the hour has just struck, but you don’t have any idea which hour it is. You divide your credences equally between one, two, three, and so on up to twelve. And now suppose you’re offered the possibility of viewing a fancy minimalist clock. It has just one hand and no markings. As a result, it’s a bit difficult to tell the time from it. But you know this about your perceptual acuity: it has a margin of error of a single hour. So, if it’s two o’clock and you look at the clock, you’ll learn it’s either one, two, or three and nothing stronger; if it’s three o’clock, you’ll learn it’s two, three, or four and nothing stronger; and so on; if it’s twelve o’clock, you’ll learn it’s eleven, twelve, or one and nothing stronger; and if it’s one o’clock, you’ll learn it’s twelve, one, or two and nothing stronger.
Now, suppose you know that, in a little while, you’ll face a choice: you can pay £6 for a bet that pays out £10 if the hour is an even number and £0 otherwise; you can pay £6 for a bet that pays out £10 if the hour is an odd number and £0 otherwise; or you can do nothing. Currently, with equal credence in each of the twelve possible hours, you’ll opt to do nothing: neither of the available bets at the available prices is a good prospect by your lights. But you know that, whatever you learn, you’ll take one of those bets: after all, you’ll either have credence 2/3 that the hour is even and 1/3 that it’s odd, or vice versa. The problem is that, from your prior point of view, you also know that, whichever bet your posteriors guide you to take, you’ll lose it. For instance, if it’s two o’clock, you’ll learn it’s one, two, or three; so you’ll assign credence 2/3 to the hour being odd; that will lead you to pay £6 for the bet that pays out if it’s odd; and you’ll promptly lose your shirt, since the hour is even. And the same goes for every other time it might be. So you’d prefer not to learn the evidence, since learning it will lead to a guaranteed loss of £6, whereas avoiding it will lead to no gain and no loss.
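Here’s a sketch of the clock case (the encoding is mine), checking both that the prior prefers to do nothing and that, whatever the hour, the posterior picks the losing bet.

```python
# The unmarked-clock bets, under the margin-of-error evidence function just
# described (hours run 1..12 and wrap around).

HOURS = list(range(1, 13))
prior = {h: 1 / 12 for h in HOURS}

def evidence(h):
    # If the hour is h, you learn only that it's h-1, h or h+1 (wrapping around).
    before = 12 if h == 1 else h - 1
    after = 1 if h == 12 else h + 1
    return {before, h, after}

def posterior(proposition):
    total = sum(prior[h] for h in proposition)
    return {h: (prior[h] / total if h in proposition else 0.0) for h in HOURS}

def net_payoff(choice, true_hour):
    # Options: pay £6 for £10 if the hour is even, pay £6 for £10 if it's odd,
    # or do nothing.
    if choice == "do_nothing":
        return 0
    wins = (true_hour % 2 == 0) if choice == "bet_even" else (true_hour % 2 == 1)
    return 10 - 6 if wins else -6

def best_choice(credences):
    def expected(choice):
        return sum(credences[h] * net_payoff(choice, h) for h in HOURS)
    return max(["bet_even", "bet_odd", "do_nothing"], key=expected)

print(best_choice(prior))   # 'do_nothing': neither bet is worth £6 at credence 1/2

# Whatever the hour actually is, the posterior bets on the wrong parity and loses £6.
for h in HOURS:
    assert net_payoff(best_choice(posterior(evidence(h))), h) == -6
print("after looking at the clock, you lose £6 whichever hour it is")
```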
But now suppose you see the clock all the same—you thought it was in the room to your right and so walked left, but actually it was in the room to the left, let’s say. So you obtain the evidence you wished to avoid, you update your credences, and now, while you don’t know what your credences are, you know that, when you face the decision I described, they’ll lead you to choose an option that will lose you £6. In this case, following the Principle of Total Evidence is a road to ruin.
Now, this at least shows that we shouldn’t always follow the Principle of Total Evidence. But it does little to tell us when we should and when we shouldn’t.
So that’s the puzzle. In follow-up posts, I’ll try to figure out how to resolve it. It’s worth noting as I wrap up, though, that it isn’t an idle puzzle: while the cases I’ve described are a bit artificial, it’s quite likely that the question arises for huge swathes of evidence we acquire through perception, and possibly for other sorts of evidence as well. After all, many perceptual experiences have the sort of margin-for-error structure that Williamson’s clock has; and, more generally, they involve the lack of access to our evidence and to our posterior credences that both of our examples, the scarf and the clock, share.
Comments

Like Quentin, I’m a little confused by the clock example (I also can’t access Williamson’s paper, so apologies if I’ve missed something obvious).
Suppose it’s 2 o’clock: when I update with the clock evidence, I know that it’s (1, 2, or 3 o’clock), and I also know that [if it’s (1, 2, or 3 o’clock) then it’s 2 o’clock]. It follows that I should know that it’s 2 o’clock after updating, via the epistemic closure principle. This remains true even if I don’t know my evidence and don’t know that I know it’s (1, 2, or 3 o’clock).
Now it’s possible to deny epistemic closure in this case, but then how am I able to know before updating that I shouldn’t take the bet? In order to know that I shouldn’t bet, it would seem that I have to know that:
If [(It’s 1, 2, or 3 o’clock) & (if it’s 1, 2, or 3 o’clock then it’s 2 o’clock)] then it’s 2 o’clock. For otherwise I couldn’t know that my bet would be erroneous in the 2 o’clock case.
But if I have deductive closure when forming my prior, then I surely have it when forming my posterior. Unless we’re postulating a case of knowledge loss, but then I don’t see how that’s supposed to be an argument against the principle of total evidence.
P.S sorry for any silly mistakes on my part, it’s late here! 😜
There's something weird with the clock example. You know that "if it's actually 3, then I'll know it's either 2, 3 or 4 (but nothing more)". The paradox seems to imply the converse: "If I know it's either 2,3 or 4 (but nothing more), then it's actually 3". This is how you can be sure that you'll lose your bet. But knowing the converse leads to a contradiction, because then you can infer from the fact that you know it's either 2,3 or 4 (but nothing more) that it's exactly 3, which contradicts the "nothing more" clause. So, I think this example doesn't make sense, it's ill-posed, and I'm wondering if you couldn't find a similar problem with the previous cases.