Hypotheticals, counterfactuals and probability
Posted: 29 January 2017
This essay considers how we think about events that we neither know to have occurred nor know to be almost certain to occur in the future. We imagine such events constantly in everyday speech, but we rarely stop to consider what we mean by it, or what effect imagining such things has on us.
It is dotted with numbered questions, so it can be used as a basis for a discussion.
A counterfactual is where we imagine something happening that we know did not happen.
This is fertile ground for fiction. Philip K Dick’s acclaimed novel ‘The Man in the High Castle’, written in 1962, depicts events in a world in which the Axis powers won World War II, and the USA has been divided into parts occupied by Japan and Germany. The movie ‘Sliding Doors’ is another well-known example, which imagines what ‘might have happened’ if Gwyneth Paltrow’s character hadn’t missed a train by a second as the sliding doors closed in front of her.
When something terrible happens, many people torment themselves by considering what would have happened if they, or somebody else, had done something differently:
- What if I had been breathing out rather than in when the airborne polio germ floated by? (from Alan Marshall’s ‘I Can Jump Puddles’)
- If she hadn’t missed her flight and had to catch the next one (doomed to crash), she’d still be alive now.
- What would life have been like if I hadn’t broken up with Sylvie / Serge?
We can also consider counterfactuals where the outcome would have been worse than what really happened, such as ‘What would my life have been like if I hadn’t met that inspirational person who helped me kick my heroin habit?’ But for some reason – so it appears to me – most counterfactuals that we entertain are ones where the real events are worse than the imagined ones. We could call these ‘regretful counterfactuals’ and the other ones ‘thankful counterfactuals’.
Then there are the really illogical-seeming ones, like the not-uncommon musing: ‘Who would I be [or what would I be like] if my parents were somebody else?’ which makes about as much sense as ‘What would black look like if it were a lightish colour?’
Here are some questions:
- Why do we entertain counterfactuals? What, if any, benefits are there from considering regretful counterfactuals? What about thankful ones?
- Given that for many counterfactuals, consideration of them just makes us feel bad, could we avoid entertaining them, or is it too instinctive an urge to be avoidable?
- Do counterfactuals have any meaning? Given that Alan Marshall did breathe in, and did contract polio, what does it mean to ask ‘If he had been breathing out instead, would he have become a top-level athlete rather than an author?’ Are we in that case talking about a person – real or imaginary – other than Alan Marshall, since part of what made him who he is, was his polio?
That last question can lead in some very odd directions. My pragmatic approach is that counterfactuals are made-up stories about an imaginary universe that is very similar to this one, but in which slightly different things happen. Just as we make up stories about non-existent lands, princesses and far away galaxies, we can make up stories about imaginary worlds that are very similar to this one except in a handful of crucial respects.
Some philosophers insist that counterfactuals are not about imaginary people and worlds but about the real people we know. My objection to that is that, for example, the Marshall counterfactual cannot be about the real Alan Marshall, because he had polio. It can only be about an imaginary boy whose life was almost identical to Marshall’s up to the point when the real one contracted polio. My opponents (who would include Saul Kripke, whom we mention later) would counter that polio is not what defines Alan Marshall, that it is an ‘inessential’ aka ‘accidental’ property of that person, and changing it would not change his being that person. Which raises the question of what, if any, properties are essential, such that changing them would make the subject a different person. Old Aristotle believed that objects, including people, have essential and inessential properties, and wrote reams about that. In the Middle Ages Thomas Aquinas picked up on that and wrote many more reams about it. The ‘essential properties’ of an object are called its ‘essence’, and believing in such things is called ‘Essentialism’. That is how certain Roman Catholic theologians are able to claim that an object that looks, feels, smells, sounds, tastes and behaves like a small, round, white wafer is actually the body of Jesus of Nazareth – apparently because, although every property we can discern is that of a wafer, the ‘essential’ properties (which we cannot perceive) are those of Jesus, thus its essence is that of Jesus. I tried for years to make sense of that and believe it, but all it succeeded in doing was giving me a headache and making me sad. For me, essentialism is bunk.
- Can you make any sense of Essentialism? If so can you help those of us who can’t, to understand it?
I can’t help but muse that maybe thankful counterfactuals have some practical value, as they can enable us to put our current sorrows into perspective. They are a very real way of Operationalizing (I know, right?) what Garrison Keillor suggests is the Minnesotan state motto – ‘It could be worse’.
Maybe regretful counterfactuals sometimes have a role too, when they encourage us to learn from our mistakes and be more careful in the future. But they are of no use in the three examples given above. What are we going to learn from them: Never breathe in? Never fly on an aeroplane? Never break up with a romantic partner (no matter how unsuitable the match turns out to be)?
If we do something that leads to somebody else suffering harm, considering the regretful counterfactual can be useful. If I hadn’t done that, they wouldn’t be so sad. How can I make it up to them? I know, I’ll do such-and-such. That won’t fix it completely, but it’s all I can think of and at least it’ll make them feel somewhat better.
But once we’ve done all we can along those lines, the counterfactual has outlived its usefulness and is best dismissed. Otherwise we end up punishing ourselves with pointless guilt, which benefits nobody. Yet we so often do this anyway, perhaps because we can’t help it, as speculated in question 2.
I am completely useless at banishing guilt. But the techniques I have, feeble as they are, revolve around reminding myself that the universe is as it is, and cannot be otherwise. The past cannot be changed. If I had not done that hurtful thing I would not have been who I am, and the universe would be a different one, not this one. I am sorry I did it, and will do my best to make restitution, and to avoid causing harm in that way again. But the counterfactual of my not doing it is just an imaginary story about a different universe, that is (once I’ve covered the restitution and self-improvement aspects) of no use to anybody, and not even a good story. Better to read about Harry Potter’s imaginary universe instead.
This universe-could-not-have-been-otherwise approach is currently working moderately well in helping me cope with the recent Fascist ascendancy in the US. There are so many ‘if only…’ situations we could torture ourselves with: ‘If only the Democrats had picked Bernie Sanders’, ‘If only Ms Clinton hadn’t made the offhand comment about the basket of deplorables’, ‘If only the Republicans had picked John Kasich’. Those ‘If only’s are about a different universe, not this one. They could not happen in this universe, because in this universe they didn’t happen.
Counterfactuals also come into Quantum Mechanics. Arguably the most profound and shocking finding of quantum mechanics is Bell’s Theorem which, together with the results of a series of experiments that physicists did after the theorem was published, implies that either influences can travel faster than light – which appears to destroy the theory of relativity that is the basis of much modern physics – or Counterfactual Definiteness is not true. Counterfactual Definiteness states that we can validly and meaningfully reason about what would have been the result if, in a given experiment, a scientist had made a different type of measurement from the one she actually made – eg if she had pointed a particular measuring device in a different direction. Many find it ridiculous that we cannot validly consider what would have happened in such an alternative experiment, but that (or the seemingly equally ridiculous alternative of faster-than-light influences) is what Bell’s Theorem tells us, and the maths has been checked exhaustively.
A counterfactual deals with the case where something happens that we know did not happen. What about when we don’t know? I use the word hypothetical, or possibility, for the case where we consider events without knowing whether or not they occur in the history of the universe. These events may be past or future:
- a past hypothetical is that Lee Harvey Oswald shot JFK from the book depository window. Some people believe he did. Others think the shot came from the ‘grassy knoll’.
- a future hypothetical is that the USA will have a trade war against China.
What do we mean when we say those events are ‘possible’ or, putting it differently, that they ‘could have happened’ (for past hypotheticals) or that they ‘could happen’ (for future hypotheticals)? I suggest that we are simply indicating our lack of knowledge. That is, we are saying that we cannot be certain whether, in a theoretical Complete History of the Earth, written by omniscient aliens after the Earth has been engulfed by the Sun and vaporised, those events would be included.
Some people would insist that the future type is different from the past type – that while a past hypothetical is indeed just about a lack of knowledge about what actually happened, a future hypothetical is about something more fundamental and concrete than just knowledge. This leads me to ask:
- Does saying that a certain event is ‘possible’ in the future indicate anything more than a lack of certainty on the part of the speaker as to whether it will occur? If so, what?
I incline to the view that it indicates nothing other than the speaker’s current state of knowledge. What some people find uncomfortable about that is that it makes the notion of possibility depend on who is speaking. For a medieval peasant it is impossible that an enormous metal device could fly. For a 21st century person it is not only possible but commonplace. As Arthur C Clarke said ‘Any sufficiently advanced technology is indistinguishable from magic.’ To us, mind-reading is impossible, but maybe in five hundred years we will be able to buy a device at the chemist for five dollars that reads people’s minds by measuring the electrical fields emitted by their brain.
Under this view, the notion of possibility is mind-dependent. What would a mind-independent notion of possibility be?
There is a whole branch of philosophy called ‘Modal Logic’, and an associated theory of language – from the brilliant logician Saul Kripke – that is based on the notion that possibility means something deep and fundamental that is not just about knowledge, or minds. To me the whole thing seems as meaningful as debates over how many angels can dance on the head of a pin, but maybe one day I will meet somebody that can demonstrate a meaning to such word games.
Sometimes counterfactuals sound like past possibilities. That happens when we say that something which didn’t happen, could have happened. Marlon Brando’s character Terry in ‘On the Waterfront’ complains ‘I coulda been a contender … instead of a bum, which is what I am’. As I said above, I don’t think it makes literal sense to say it could have happened, since it didn’t. But if we didn’t know whether it had happened or not, we wouldn’t have been surprised to find out that it did happen. So in a sense we are saying that a person in the past, prior to when the event did or didn’t occur, evaluating it from that perspective, would regard it as possible. Brando’s Terry was saying that, back in the early days of his boxing career, he would not have been at all surprised if he had become a star. But he didn’t, and now it was too late.
What would happen / have happened next?
With both counterfactuals and hypotheticals, we often ask whether some other thing would have happened if the first thing had happened differently from how it did. For instance:
- [counterfactual] If the FBI director had not announced an inquiry into Hillary Clinton’s emails days before the 2016 US presidential election, would she have won?
- [past hypothetical] If Henry V really did give a stirring speech like the ‘band of brothers’ one in Shakespeare’s play, exhorting his men to fight just for the glory of having represented England, God and Henry, were any of the men cynical about his asking them to risk death just in order to increase Henry’s personal power?
- [future hypothetical] If Australia’s Turnbull government continues with its current anti-environment policies, will it be trounced at the next election?
Which leads to another question:
- What exactly do these questions mean?
The first relates to something that we know did not happen and the other two relate to what is currently unknowable.
My opinion is that, like with counterfactuals, they are about making up stories. In the US election case we are imagining a story in which certain events in the election were different, and we are free, within the bounds of the constraints imposed by what we know of the laws of nature, to imagine what happened next. Perhaps in the story Ms Clinton wins. Perhaps she then goes on to become the most beloved and successful president the country has ever had, overseeing a resurgence of employment, creativity, and brotherly and sisterly love never before encountered. Or perhaps she declares martial law, suspends the constitution and becomes dictator for life, building coliseums around the country where Christians and men are regularly fed to lions. Within the bounds of the laws of nature we are free to make up whatever story we like.
The same goes for the past hypothetical of Henry’s speech. We can imagine the men swooning in awe and devotion, murmuring Amen after every sentence, or we can imagine them rolling their eyes and making bawdy, cynical quips to one another – but nevertheless eventually going in to battle because otherwise they won’t be paid and their families will starve.
However, the future hypothetical seems to be about more than a made-up story. If the first thing happens – continued anti-environmentalism – then we will definitely know after the next election whether the second thing has also happened. At that point it becomes a matter of fact rather than imagination.
To which I say, so what? Until it happens, or else it becomes clear that it will not happen, it is a matter of future possibilities and can be covered by any of the scientifically-valid imaginative scenarios we can dream up. It is only if the scientific constraint massively narrows down those scenarios that it has significance. If, for instance, we could be sure that any government that fails to make a credible attempt to protect the environment will be booted out of office, our future possibility would become a certainty: If the government doesn’t change its track then it will be ejected. But in politics nothing is ever that certain. Other issues come up and change the agenda, scandals happen, natural and man-made disasters, personal retirements and deaths of key politicians. At best we can talk about whether maintaining the anti-environment stance makes it more probable that the government will lose office. Which leads on to the next thorny issue.
Probability, aka chance, aka risk, aka likelihood and many other synonyms and partial synonyms, is a word whose meaning most people feel they know, but which nobody can explain.
What do we mean when we say that the probability of a tossed coin giving heads is 0.5? Introductory probability courses often explain this by saying that if we did a very large number of tosses we would expect about half of them to be heads. But if we ask what ‘expect’ means we find ourselves stuck in a circular definition. Why? Because what we ‘expect’ is what we consider most ‘likely’, which is the outcome that has the highest ‘probability’. We cannot define ‘probability’ without first defining ‘expect’, and we cannot define ‘expect’ without first defining ‘probability’ or one of its synonyms.
We could try to escape by saying that what we ‘expect’ is what we think will happen, only that would be wrong. The word ‘will’ is too definite here, implying certainty. When we say we expect a die will roll a number less than five, we are not saying that we are certain that will be the case. If it were, and we rolled the die one hundred times in succession, we would have that expectation before each roll, so we would be certain that no fives or sixes occurred in the hundred rolls. Yet the probability of getting no fives or sixes in a hundred rolls is about two in a billion billion, which is not very likely at all. We could dispense with the ‘certainty’ and instead say that we think a one, two, three or four is the ‘most probable’ outcome for the next roll. But then we’re back in the vicious circle, as we need to know what ‘probable’ means.
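For readers who would like to check that figure, here is a minimal Python sketch of the arithmetic. It assumes a fair die and independent rolls, which are my assumptions for the calculation rather than anything the argument depends on, and the variable names are purely illustrative.

```python
from fractions import Fraction

# Chance that one roll of a fair six-sided die shows 1, 2, 3 or 4.
# Assumption: the die is fair and each roll is independent of the others.
p_less_than_five = Fraction(4, 6)

# Chance that all one hundred rolls avoid both five and six:
# multiply the single-roll chance by itself one hundred times.
p_no_five_or_six = p_less_than_five ** 100

# Convert the exact fraction to a decimal for reading.
print(float(p_no_five_or_six))
```

The printed value lies between two and three in a billion billion, in line with the rough figure quoted above.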
- What does ‘expected’ mean?
There is a formal mathematical definition of probability, that removes all vagueness from a mathematical point of view, and enables us to get on with any calculation, however complex. Essentially it says that ‘probability’ is any scheme for assigning a number between 0 and 1 to every imaginable outcome (note how I carefully avoid using the word ‘possible’ here), in such a way that the sum of the numbers for all the different imaginable outcomes is 1.
But that definition tells us nothing about how we assign numbers to outcomes. It would be just as valid to assign 0.9 to heads and 0.1 to tails as it would to assign 0.5 to both of them. Indeed, advanced probability of the kind used in pricing financial instruments involves using more than one different scheme at the same time, which assign different numbers (probabilities) to the same outcome.
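The point can be made concrete with a small Python sketch. It checks only the two conditions in the simplified definition given above (each number between 0 and 1, and all of them summing to 1) for a finite list of outcomes. The function name `is_valid_assignment` and the example schemes are hypothetical, chosen just for illustration.

```python
import math

def is_valid_assignment(scheme):
    """Return True if every number lies in [0, 1] and the numbers sum to 1.

    `scheme` maps each imaginable outcome to the number assigned to it.
    This covers only finite sets of outcomes, as in the coin example.
    """
    values = list(scheme.values())
    in_range = all(0.0 <= v <= 1.0 for v in values)
    # isclose guards against tiny floating-point rounding in the sum.
    sums_to_one = math.isclose(sum(values), 1.0)
    return in_range and sums_to_one

fair = {"heads": 0.5, "tails": 0.5}
lopsided = {"heads": 0.9, "tails": 0.1}
broken = {"heads": 0.7, "tails": 0.7}  # sums to 1.4, so it fails

print(is_valid_assignment(fair))      # True
print(is_valid_assignment(lopsided))  # True
print(is_valid_assignment(broken))    # False
```

Both the even scheme and the lopsided one pass, which is exactly the difficulty: the definition is satisfied by either, so it cannot be what tells us to prefer 0.5.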
This brings us no closer to understanding why we assign 0.5 to heads.
Another approach is to say that we divide up the set of all potential outcomes as finely as we can, so that every outcome is equally likely. Then if the number of ‘equally likely’ outcomes is N, we assign the probability 1/N to each one.
That seems great until we ask what ‘equally likely’ means, and then realise (with a sickening thud) that ‘equally likely’ means ‘has the same probability as’, which means we’re stuck in a circular definition again.
- What does ‘equally likely’ mean?
After much running around in metaphorical circles, I have come to the tentative conclusion that ‘likely’ is a concept that is fundamental to how we interpret the world, so fundamental that it transcends language. It cannot be defined. There are other words like this, but not many. Most words are defined in terms of other words, but in order to avoid the whole system becoming circular, there must be some words that are taken as understood without definition – language has to start somewhere. Other examples might be ‘feel’, ‘think’ and ‘happy’. We assume that others know what is meant by each of these words, or a synonym thereof, and if they don’t then communication is simply impossible on any subject that touches on the concept.
Or perhaps ‘likely’ and ‘expect’ may be best related to a (perhaps) more fundamental concept, which is that of ‘confidence’, and its almost-antonym ‘surprise’. Something is ‘likely’ if we are confident – but not necessarily certain – that it will happen, which is to say that we would be somewhat surprised – but not necessarily dumbfounded – if it did not happen. I think the twin notions of confidence and surprise may be fundamental because even weeks-old babies seem to understand surprise. The game of peek-a-boo relies on it entirely.
Once we have these concepts, I think we may be able to bootstrap the entire probability project. The six imaginable dice roll numbers will be equally likely if we would be very surprised if out of six million rolls, any of the numbers occurred more than two million times, or not at all.
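As a rough illustration of that test, here is a Python sketch that simulates rolls and then asks whether the tally would surprise us in the sense just described: some number never turning up, or turning up in more than a third of the rolls (two million out of six million). The simulated die is Python's pseudo-random generator, which is only a stand-in for a real die, and the function names and the scaled-down number of rolls are my own assumptions to keep the run short.

```python
import random
from collections import Counter

def would_surprise(counts, n_rolls, faces=6):
    """Apply the surprise test: True if any face never occurred,
    or occurred in more than a third of the rolls."""
    ceiling = n_rolls / 3  # two million when n_rolls is six million
    return any(
        counts.get(face, 0) == 0 or counts.get(face, 0) > ceiling
        for face in range(1, faces + 1)
    )

def roll_counts(n_rolls, seed=0):
    """Tally simulated rolls of a six-sided die (pseudo-random, seeded)."""
    rng = random.Random(seed)
    return Counter(rng.randint(1, 6) for _ in range(n_rolls))

n = 60_000  # scaled down from six million so the sketch runs quickly
counts = roll_counts(n)
print(counts)
print(would_surprise(counts, n))  # False: no face is missing or dominant
```

A tally in which one number took every roll, such as `Counter({1: n})`, would fail the test. Of course the sketch only applies the criterion; it does not explain where our sense of surprise comes from.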
There are various frameworks for assigning probabilities to events that are discussed by philosophers thinking about probability. The most popular are
- the Frequentist framework, which bases the probability of an event on the number of times it has been observed to occur in the past;
- the Bayesian approach, which starts with an intuitively sensed prior probability, and then adjusts to take account of subsequent observations using Bayes’ Law; and
- the Symmetry approach, which argues that events that are similar to one another via some symmetry should have the same probability.
It would make this essay much too long to go into any of these in greater detail. But none of them lays out a complete method. I suspect they all have a role to play in how we intuitively sense probabilities of certain simple events. But I feel that there is still some fundamental, unanalysable concept of confidence vs surprise that is needed to cover the gaps left by the large vague areas in each framework.
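Without going into detail, the adjusting step of the Bayesian approach can at least be sketched in a few lines of Python. The setup is entirely hypothetical: two rival stories about a coin (fair, or biased 0.9 towards heads), an even prior between them, and three heads observed in a row. The function name `bayes_update` and all the numbers are illustrative assumptions, and the sketch says nothing about where the prior comes from, which is the vague part.

```python
def bayes_update(prior, likelihood):
    """Return posterior probabilities using Bayes' Law.

    prior: maps each hypothesis to its probability before the observation.
    likelihood: maps each hypothesis to the probability it gives
                to the thing that was actually observed.
    """
    # Weight each hypothesis by how well it predicted the observation.
    unnormalised = {h: prior[h] * likelihood[h] for h in prior}
    # Rescale so that the posterior numbers sum to 1.
    total = sum(unnormalised.values())
    return {h: w / total for h, w in unnormalised.items()}

# Hypothetical prior: no initial preference between the two stories.
prior = {"fair": 0.5, "biased": 0.5}
# Probability each story assigns to a single toss landing heads.
chance_of_heads = {"fair": 0.5, "biased": 0.9}

posterior = prior
for _ in range(3):  # observe heads three times in a row
    posterior = bayes_update(posterior, chance_of_heads)

print(posterior)
```

After the three heads, most of the weight has shifted to the biased story, although the fair one is far from ruled out.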
Here is one last question to consider:
- A surgeon tells a parent that their three-year old daughter, who is in a coma with internal abdominal bleeding following a car accident, has a 98% chance of a successful outcome of the operation, with complete recovery of health. In the light of the above discussion, it seems that nobody can explain what that 98% means. Yet despite the lack of any explicable meaning, the parent is so relieved that they dissolve in tears. Why?
Bondi Junction, January 2017