About the Devil

What a strange concept is Satan, the devil! He occupies such a large part in Western culture and literature that few people ever stop to reflect on the weirdness of the idea of a single person that is responsible for all the bad things in the world. Certainly I never considered it until the other day.

Satan seems to me to be a particularly Christian concept, in emphasis if not in origin. Other religions have supernatural beings with varying degrees of benevolence or malevolence, but I don’t think the concept of a single Prince of Darkness is widespread. China, India and Japan have mythologies replete with good and evil spirits, but no single spirit has control of all the bad stuff. For instance the Ramayana’s Rawanna, king of demons, comes across as more naughty than evil. So does Loki from Norse mythology. Actually Rawanna is a good deal less frightening to me than the Indian goddesses of destruction Kali and Durga, both of whom are worshipped by perfectly nice, law-abiding, kind people. I get the two mixed up, but one or both of them wears a necklace of skulls and is portrayed dancing on the bodies of those she has slain.

Chinese folk religion seems to encompass a multitude of evil spirits. We have to orient our houses the right way and do specific things with water, air, numbers and chants to keep them at bay.

Having multiple bad spirits seems to me like having a proliferation of petty criminals, whereas Satan is more like how Stalin appeared to the West at the height of the Cold War – the supreme leader of a tremendously powerful organisation capable of doing unfathomable harm. Some people, perhaps out of nostalgia for the good old days of the Cold War, tried to resurrect that image with people like Osama Bin Laden, but it never really caught on. Stalin could have killed hundreds of millions just by pressing a button. Osama Bin Laden – to be blunt – couldn’t.

Satan does get the occasional mention in the two other Abrahamic religions: Judaism and Islam. After all, he makes his debut appearance in the Garden of Eden story in Genesis chapter three, which all three Abrahamic religions share.

But while I don’t know a whole lot about either Judaism or Islam, a bit of googling about devils in Judaism and Islam didn’t turn up anything with a prominence like the following from the RC baptismal rite:

      Celebrant: Do you reject Satan?

      Parents & Godparents: I do.

      Celebrant: And all his works?

      Parents & Godparents: I do.

      Celebrant: And all his empty promises?

      Parents & Godparents: I do.

You know you’ve made it to the big-time when an organisation with a billion members makes three successive references to you in the induction ceremony for every one of its new members.

The biggest speaking part that Satan is given in the Jewish scriptures – to my knowledge – is in the Book of Job, where he wagers with God (Yahweh) about whether Job can be induced to curse Yahweh if enough suffering is inflicted on him. Job is a fascinating book, partly because Satan comes across in it as a less malevolent entity than Yahweh. But I wouldn’t want my fate to be in the hands of either of them, as portrayed therein. There’s not a lot of ‘Duty of Care‘ going on.

Satan’s literary influence is so pervasive that his avatars pop up in secular literature as well. Classic instances of that are Sauron from Lord of the Rings and Voldemort from Harry Potter. Both are known as ‘the Dark Lord’ and have the honorific ‘Lord’ affixed before their name. One might ascribe Tolkien’s symbolism in Lord of the Rings to his Roman Catholicism. On the other hand JK Rowling is not a Christian, yet her use of such a clear Satan substitute shows how deeply embedded the role of Satan has become in Western culture, both religious and secular.

From a literary standpoint, having the notion of Satan is a wonderful cultural advantage. Pitting the hero(es) against the overwhelming odds of a leader of a massively powerful army of evil is so much more gripping than pitting them against a mere mortal villain.

Both Tolkien and Rowling hedged their bets a bit though. Both allude in their mythologies to earlier evils, which in some sense detracts from the uniqueness of their Dark Lords. With Tolkien it was Morgoth, while Rowling had Grindelwald. I think the latter name is unfortunate because Grindelwald is the name of a lovely village in the Swiss Alps. I went there about thirty-five years ago and did not encounter any dark forces. But maybe it has changed since then.

Western culture has a rich tradition of tales about Satan and his followers. That has given us such chilling works of fiction as Rosemary’s Baby, The Omen and The Exorcist. The English author Dennis Wheatley wrote a series of best-selling novels about Satanist cults conjuring up the devil in gothic country mansions, sacrificing virgins and doing other dastardly deeds. Going back further in time we have Goethe’s Faust, Milton’s Paradise Lost and Dante’s Inferno. I have not read any of these three, although I know the story of Faust. I am keen to find time to read Paradise Lost (if only it weren’t so LONG!) because it is said that it presents Satan as a complex, multi-faceted character that is in some senses almost a tragic hero, rather than just the pure evil image to which we are generally subjected.

I don’t think I ever really believed in the devil entirely, although I said I did, both to others and to myself, because that was a requirement of the religion in which I was raised. I found Rosemary’s Baby and The Omen scary, but not so much The Exorcist.

On reflection, I think what scared me most about Rosemary’s Baby and The Omen was not the devil but rather his creepy followers. In Rosemary’s Baby it was the solicitous, secretly Satan-worshipping neighbours who befriended Rosemary and took advantage of her trusting nature to gradually poison her with various herbs that they told her were medicine for a mild ailment she had. In The Omen it was the creepily motherly yet homicidal nursemaid, plus the kid Damien (son of Satan) who killed people by doing things like crashing his tricycle into them at the top of the stairs so that they fell and broke their neck. Malevolent children are always scary, regardless of whether any evil spirits are in sight. Horror movies love to make use of them, and ‘Lord of the Flies’ is like the apotheosis of the evil children genre. How apt that William Golding chose the title ‘Lord of the Flies’ for that novel, since it is a translation of Beelzebub, one of Satan’s many names.

The book that scared me the most was Dracula. I read it at much too young an age and spent the next several years sleeping in terror with my head under the blankets to try to keep the vampires away. As far as I recall Dracula doesn’t actually mention the devil at all. Count Dracula is evil, but there is no suggestion in Bram Stoker’s book that he is unique.

I wonder what it is that made Satan such a prominent figure in Christianity, and the cultures that were heavily influenced by Christianity.

One theory I’ve come across is that, when Christianity became the official religion of the Roman Empire in the late 4th century CE, the Romans sought to spread the religion by discrediting the existing folk religions of Europe, many of which involved some sort of worship of nature and fertility. The sort of ‘dancing in a circle, naked in the woods’ image that is associated in modern times with witches’ covens and satanic cults (in the popular imagination at least – whether also in reality I have no idea) may have been a feature of those pre-Christian folk religions. So associating them with a powerful evil figure would have been a way to discredit those religions, and maybe to justify suppressing them.

That theory has an intuitive appeal, but strikes the problem that for many of those folk religions the anthropomorphic image of nature that was worshipped was female – a mother goddess. Yet Satan is male.

An equally plausible, and rather simpler, explanation is that the notion of an immensely powerful Dark Lord just makes for a great story, and great stories make for successful social movements.

There’s an interesting theological conflict between the notion of Satan as the Embodiment of Evil on the one hand, and on the other, the Roman Catholic doctrine that evil is an absence of good, rather than a presence of something bad. That doctrine is usually attributed to St Augustine (late 4th century), and was later reinforced by St Thomas Aquinas (13th century). I find it hard to square this with Satan being an entity that is supposed to be actively evil. I have no idea what RC theologians make of this, although I am confident that their explanation would be extremely LONG. Make an explanation long enough and the chances are that the explainee will not raise any objections, out of sheer weariness and the fear that the explainer may launch into a further diatribe.

When I was in high school there was a boy who attended our school for a short while. He gained notoriety by telling people that he had seen the Devil. He had just woken up in the middle of the night and the Devil had been standing there at the foot of his bed. I remember that he had red eyes (the devil, not the boy), but can’t recall any other details being given. Still, the red eyes would be enough to narrow down the suspects fairly effectively if the police were to conduct a manhunt. A short conversation was had, twixt the boy and the devil. I don’t remember the topic but I do remember that it was surprisingly banal.

Did he really think he saw the devil, or did he just make up the story in order to gain attention and acceptance at a new school, as boys are wont to do? We’ll never know, because he left after being there only a couple of months. I hope it was made up, because such visions are often associated with mental illness and I wouldn’t wish for him to have suffered that.

Having meandered about all over the place in this essay (as usual) I feel I should lay my cards on the table and say that, although I think the devil is a marvellous literary figure that we couldn’t do without, I don’t believe in him any more. I hope that most other people don’t either, regardless of their religious or cultural associations, as belief in the devil seems to lead to black and white thinking that before you know it has medicine women being burned as witches and teenagers with schizophrenia being subjected to horrific exorcism rituals.

What I do believe in, at least in the middle of the night as I struggle out of bed to go and empty my bladder, is a frightful monster hiding under the bed with scaly claws that will grab my shins, pull me under the bed and then – I don’t know what then, but no doubt it will be horrific. But that’s more Doctor Who than Paradise Lost, and is the subject of another (not yet written) essay.

Now, having whinged shamelessly about the verbosity of both John Milton and of theologians, I had better stop here, lest I commit the very misdemeanour I have been moaning about.

Andrew Kirk

Bondi Junction, May 2017


Hypotheticals, counterfactuals and probability

This essay considers how we think about events that we do not know to have occurred, and that we do not know to be almost certain to occur in the future. Imagining such events is everywhere in everyday speech, but we rarely stop to consider what we mean by it, or what effect imagining such things has on us.

It is dotted with numbered questions, so it can be used as a basis for a discussion.


A counterfactual is where we imagine something happening that we know did not happen.

This is fertile ground for fiction. Philip K Dick’s acclaimed novel ‘The Man in the High Castle’, written in 1962, depicts events in a world in which the Axis powers won World War II, and the USA has been divided into parts occupied by Japan and Germany. The movie ‘Sliding Doors’ is another well-known example, which imagines what ‘might have happened’ if Gwyneth Paltrow’s character hadn’t missed a train by a second as the sliding doors closed in front of her.

When something terrible happens, many people torment themselves by considering what would have happened if they, or somebody else, had done something differently:

  • What if I had been breathing out rather than in when the airborne polio germ floated by? (from Alan Marshall’s ‘I Can Jump Puddles’)
  • If she hadn’t missed her flight and had to catch the next one (doomed to crash), she’d still be alive now.
  • What would life have been like if I hadn’t broken up with Sylvie / Serge?

We can also consider counterfactuals where the outcome would have been worse than what really happened, such as ‘What would my life have been like if I hadn’t met that inspirational person that helped me kick my heroin habit‘. But for some reason – so it appears to me – most counterfactuals that we entertain are where the real events are worse than the imagined ones. We could call these ‘regretful counterfactuals‘ and the other ones ‘thankful counterfactuals‘.

Then there are the really illogical-seeming ones, like the not-uncommon musing: ‘Who would I be [or what would I be like] if my parents were somebody else?’ which makes about as much sense as ‘What would black look like if it were a lightish colour?’

Here are some questions:

  1. Why do we entertain counterfactuals? What, if any, benefits are there from considering regretful counterfactuals? What about thankful ones?
  2. Given that for many counterfactuals, consideration of them just makes us feel bad, could we avoid entertaining them, or is it too instinctive an urge to be avoidable?
  3. Do counterfactuals have any meaning? Given that Alan Marshall did breathe in, and did contract polio, what does it mean to ask ‘If he had been breathing out instead, would he have become a top-level athlete rather than an author?’ Are we in that case talking about a person – real or imaginary – other than Alan Marshall, since part of what made him who he is, was his polio?

That last question can lead in some very odd directions. My pragmatic approach is that counterfactuals are made-up stories about an imaginary universe that is very similar to this one, but in which slightly different things happen. Just as we make up stories about non-existent lands, princesses and far away galaxies, we can make up stories about imaginary worlds that are very similar to this one except in a handful of crucial respects.

Some philosophers insist that counterfactuals are not about imaginary people and worlds but about the real people we know. My objection to that is that, for example, the Marshall counterfactual cannot be about the real Alan Marshall, because he had polio. It can only be about an imaginary boy whose life was almost identical to Marshall’s up to the point when the real one contracted polio. My opponents (who would include Saul Kripke, whom I mention later) would counter that polio is not what defines Alan Marshall, that it is an ‘inessential’ aka ‘accidental’ property of that person, and changing it would not change his being that person. Which raises the question of what, if any, properties are essential, such that changing them would make the subject a different person. Old Aristotle believed that objects, including people, have essential and inessential properties, and wrote reams about that. In the Middle Ages Thomas Aquinas picked up on that and wrote many more reams about it. The ‘essential properties’ of an object are called its ‘essence’, and believing in such things is called ‘Essentialism’. That is how certain RC theologians are able to claim that an object that looks, feels, smells, sounds, tastes and behaves like a small, round, white wafer is actually the body of Jesus of Nazareth – apparently because, although every property we can discern is that of a wafer, the ‘essential’ properties (which we cannot perceive) are those of Jesus, thus its essence is that of Jesus. I tried for years to make sense of that and believe it, but all it succeeded in doing was giving me a headache and making me sad. For me, essentialism is bunk.

  4. Can you make any sense of Essentialism? If so can you help those of us who can’t, to understand it?

I can’t help but muse that maybe thankful counterfactuals have some practical value, as they can enable us to put our current sorrows into perspective. They are a very real way of Operationalizing (I know, right?) what Garrison Keillor suggests is the Minnesotan state motto – ‘It could be worse‘.

Maybe regretful counterfactuals sometimes have a role too, when they encourage us to learn from our mistakes and be more careful in the future. But they are of no use in the three examples given above. What are we going to learn from them: Never breathe in? Never fly on an aeroplane? Never break up with a romantic partner (no matter how unsuitable the match turns out to be)?

If we do something that leads to somebody else suffering harm, considering the regretful counterfactual can be useful. If I hadn’t done that, they wouldn’t be so sad. How can I make it up to them? I know, I’ll do such-and-such. That won’t fix it completely, but it’s all I can think of and at least it’ll make them feel somewhat better.

But once we’ve done all we can along those lines, the counterfactual has outlived its usefulness and is best dismissed. Otherwise we end up punishing ourselves with pointless guilt, which benefits nobody. Yet we so often do this anyway, perhaps because we can’t help it, as speculated in question 2.

I am completely useless at banishing guilt. But the techniques I have, feeble as they are, revolve around reminding myself that the universe is as it is, and cannot be otherwise. The past cannot be changed. If I had not done that hurtful thing I would not have been who I am, and the universe would be a different one, not this one. I am sorry I did it, and will do my best to make restitution, and to avoid causing harm in that way again. But the counterfactual of my not doing it is just an imaginary story about a different universe, that is (once I’ve covered the restitution and self-improvement aspects) of no use to anybody, and not even a good story. Better to read about Harry Potter’s imaginary universe instead.

This universe-could-not-have-been-otherwise approach is currently working moderately well in helping me cope with the recent Fascist ascendancy in the US. There are so many ‘if only…’ situations we could torture ourselves with: ‘If only the Democrats had picked Bernie Sanders’, ‘If only Ms Clinton hadn’t made the offhand comment about the basket of deplorables’, ‘If only the Republicans had picked John Kasich’. Those ‘If only’s are about a different universe, not this one. They could not happen in this universe, because in this universe they didn’t happen.

Counterfactuals also come into Quantum Mechanics. Arguably the most profound and shocking finding of quantum mechanics is Bell’s Theorem which, together with the results of a series of experiments that physicists did after the theorem was published, implies that either influences can travel faster than light – which appears to destroy the theory of relativity that is the basis of much modern physics – or Counterfactual Definiteness is not true. Counterfactual Definiteness states that we can validly and meaningfully reason about what would have been the result if, in a given experiment, a scientist had made a different type of measurement from the one she actually made – eg if she had pointed a particular measuring device in a different direction. Many find it ridiculous that we cannot validly consider what would have happened in such an alternative experiment, but that (or the seemingly equally ridiculous alternative of faster-than-light influences) is what Bell’s Theorem tells us, and the maths has been checked exhaustively.


A counterfactual deals with the case where something happens that we know did not happen. What about when we don’t know? I use the word ‘hypothetical’ or ‘possibility’ when we consider events that may or may not occur in the history of the universe, and we do not know which. These events may be past or future:

  • a past hypothetical is that Lee Harvey Oswald shot JFK from the book depository window. Some people believe he did. Others think the shot came from the ‘grassy knoll’.
  • a future hypothetical is that the USA will have a trade war against China.

What do we mean when we say those events are ‘possible’ or, putting it differently, that they ‘could have happened‘ (for past hypotheticals) or that they ‘could happen‘ (for future hypotheticals)? I suggest that we are simply indicating our lack of knowledge. That is, we are saying that we cannot be certain whether, in a theoretical Complete History of the Earth, written by omniscient aliens after the Earth has been engulfed by the Sun and vaporised, those events would be included.

Some people would insist that the future type is different from the past type – that while a past hypothetical is indeed just about a lack of knowledge about what actually happened, a future hypothetical is about something more fundamental and concrete than just knowledge. This leads me to ask:

  5. Does saying that a certain event is ‘possible’ in the future indicate anything more than a lack of certainty on the part of the speaker as to whether it will occur? If so, what?

I incline to the view that it indicates nothing other than the speaker’s current state of knowledge. What some people find uncomfortable about that is that it makes the notion of possibility depend on who is speaking. For a medieval peasant it is impossible that an enormous metal device could fly. For a 21st century person it is not only possible but commonplace. As Arthur C Clarke said ‘Any sufficiently advanced technology is indistinguishable from magic.’ To us, mind-reading is impossible, but maybe in five hundred years we will be able to buy a device at the chemist for five dollars that reads people’s minds by measuring the electrical fields emitted by their brain.

Under this view, the notion of possibility is mind-dependent. What would a mind-independent notion of possibility be?

There is a whole branch of philosophy called ‘Modal Logic’, and an associated theory of language – from the brilliant logician Saul Kripke – that is based on the notion that possibility means something deep and fundamental that is not just about knowledge, or minds. To me the whole thing seems as meaningful as debates over how many angels can dance on the head of a pin, but maybe one day I will meet somebody that can demonstrate a meaning to such word games.

Sometimes counterfactuals sound like past possibilities. That happens when we say that something which didn’t happen, could have happened. Marlon Brando’s character Terry in ‘On the Waterfront‘ complains ‘I coulda been a contender … instead of a bum, which is what I am‘. As I said above, I don’t think it makes literal sense to say it could have happened, since it didn’t. But if we didn’t know whether it had happened or not, we wouldn’t have been surprised to find out that it did happen. So in a sense we are saying that a person in the past, prior to when the event did or didn’t occur, evaluating it from that perspective, would regard it as possible. Brando’s Terry was saying that, back in the early days of his boxing career, he would not have been at all surprised if he had become a star. But he didn’t, and now it was too late.

What would happen / have happened next?

With both counterfactuals and hypotheticals, we often ask whether some other thing would have happened if the first thing had happened differently from how it did. For instance:

  • [counterfactual] If the FBI director had not announced an inquiry into Hillary Clinton’s emails days before the 2016 US presidential election, would she have won?
  • [past hypothetical] If Henry V really did give a stirring speech like the ‘band of brothers’ one in Shakespeare’s play, exhorting his men to fight just for the glory of having represented England, God and Henry, were any of the men cynical about his asking them to risk death just in order to increase Henry’s personal power?
  • [future hypothetical] If Australia’s Turnbull government continues with its current anti-environment policies, will it be trounced at the next election?

Which leads to another question:

  6. What exactly do these questions mean?

The first relates to something that we know did not happen and the other two relate to what is currently unknowable.

My opinion is that, like with counterfactuals, they are about making up stories. In the US election case we are imagining a story in which certain events in the election were different, and we are free, within the bounds of the constraints imposed by what we know of the laws of nature, to imagine what happened next. Perhaps in the story Ms Clinton wins. Perhaps she then goes on to become the most beloved and successful president the country has ever had, overseeing a resurgence of employment, creativity, and brotherly and sisterly love never before encountered. Or perhaps she declares martial law, suspends the constitution and becomes dictator for life, building coliseums around the country where Christians and men are regularly fed to lions. Within the bounds of the laws of nature we are free to make up whatever story we like.

The same goes for the past hypothetical of Henry’s speech. We can imagine the men swooning in awe and devotion, murmuring Amen after every sentence, or we can imagine them rolling their eyes and making bawdy, cynical quips to one another – but nevertheless eventually going in to battle because otherwise they won’t be paid and their families will starve.

However, the future hypothetical seems to be about more than a made-up story. If the first thing happens – continued anti-environmentalism – then we will definitely know after the next election whether the second thing has also happened. At that point it becomes a matter of fact rather than imagination.

To which I say, so what? Until it happens, or else it becomes clear that it will not happen, it is a matter of future possibilities and can be covered by any of the scientifically-valid imaginative scenarios we can dream up. It is only if the scientific constraint massively narrows down those scenarios that it has significance. If, for instance, we could be sure that any government that fails to make a credible attempt to protect the environment will be booted out of office, our future possibility would become a certainty: If the government doesn’t change its track then it will be ejected. But in politics nothing is ever that certain. Other issues come up and change the agenda, scandals happen, and so do natural and man-made disasters, retirements and deaths of key politicians. At best we can talk about whether maintaining the anti-environment stance makes it more probable that the government will lose office. Which leads on to the next thorny issue.


Probability, aka chance, aka risk, aka likelihood and many other synonyms and partial synonyms, is a word that most people feel they know what it means, but nobody can explain what that is.

What do we mean when we say that the probability of a tossed coin giving heads is 0.5? Introductory probability courses often explain this by saying that if we did a very large number of tosses we would expect about half of them to be heads. But if we ask what ‘expect’ means we find ourselves stuck in a circular definition. Why? Because what we ‘expect’ is what we consider most ‘likely’, which is the outcome that has the highest ‘probability’. We cannot define ‘probability’ without first defining ‘expect’, and we cannot define ‘expect’ without first defining ‘probability’ or one of its synonyms.

We could try to escape by saying that what we ‘expect’ is what we think will happen, only that would be wrong. The word ‘will’ is too definite here, implying certainty. When we say we expect a die will roll a number less than five, we are not saying that we are certain that will be the case. If it were, and we rolled the die one hundred times in succession, we would have that expectation before each roll, so we would be certain that no fives or sixes occurred in the hundred rolls. Yet the probability of getting no fives or sixes in a hundred rolls is about two in a billion billion, which is not very likely at all. We could dispense with the ‘certainty’ and instead say that we think a one, two, three or four is the ‘most probable’ outcome for the next roll. But then we’re back in the vicious circle, as we need to know what ‘probable’ means.
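For anyone who wants to check that figure, here is a minimal sketch in Python, assuming a fair six-sided die and independent rolls. The function name is made up for illustration. On each roll the chance of avoiding a five or six is 4/6, so the chance of avoiding them a hundred times running is 4/6 multiplied by itself a hundred times.

```python
from fractions import Fraction

def prob_no_five_or_six(rolls: int) -> Fraction:
    """Chance that `rolls` independent rolls of a fair die show no five or six."""
    # Four of the six faces (one to four) avoid a five or six on any single roll.
    return Fraction(4, 6) ** rolls

p = prob_no_five_or_six(100)
print(float(p))  # a little over two in a billion billion (1e18)
```

Using exact fractions rather than floating point is just a precaution, so that no rounding creeps in before the final conversion.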

  7. What does ‘expected’ mean?

There is a formal mathematical definition of probability, that removes all vagueness from a mathematical point of view, and enables us to get on with any calculation, however complex. Essentially it says that ‘probability’ is any scheme for assigning a number between 0 and 1 to every imaginable outcome (note how I carefully avoid using the word ‘possible’ here), in such a way that the sum of the numbers for all the different imaginable outcomes is 1.

But that definition tells us nothing about how we assign numbers to outcomes. It would be just as valid to assign 0.9 to heads and 0.1 to tails as it would to assign 0.5 to both of them. Indeed, advanced probability of the kind used in pricing financial instruments involves using more than one different scheme at the same time, which assign different numbers (probabilities) to the same outcome.

This brings us no closer to understanding why we assign 0.5 to heads.
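To make the point concrete, here is a small Python sketch of that formal definition, restricted to a finite list of outcomes. The function name and the three coin schemes are illustrative assumptions. The check is nothing more than the ‘numbers between 0 and 1 that add up to 1’ rule, and it happily accepts both the 0.5/0.5 scheme and the 0.9/0.1 one.

```python
import math

def is_probability_scheme(scheme: dict[str, float]) -> bool:
    """Check the formal rule: every number lies in [0, 1] and they sum to 1."""
    in_range = all(0.0 <= p <= 1.0 for p in scheme.values())
    sums_to_one = math.isclose(sum(scheme.values()), 1.0)
    return in_range and sums_to_one

fair = {"heads": 0.5, "tails": 0.5}
lopsided = {"heads": 0.9, "tails": 0.1}
broken = {"heads": 0.7, "tails": 0.7}  # sums to 1.4, so not a valid scheme

print(is_probability_scheme(fair))      # True
print(is_probability_scheme(lopsided))  # True
print(is_probability_scheme(broken))    # False
```

Nothing in the check prefers the fair scheme to the lopsided one, which is exactly the gap the definition leaves open.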

Another approach is to say that we divide up the set of all potential outcomes as finely as we can, so that every outcome is equally likely. Then if the number of ‘equally likely’ outcomes is N, we assign the probability 1/N to each one.

That seems great until we ask what ‘equally likely’ means, and then realise (with a sickening thud) that ‘equally likely’ means ‘has the same probability as’, which means we’re stuck in a circular definition again.
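The 1/N rule itself is easy enough to write down, as in the Python sketch below, which simply takes for granted that the listed outcomes are ‘equally likely’ (the very assumption that resists definition). The function names are hypothetical, chosen for this example only.

```python
from fractions import Fraction

def equal_shares(outcomes: list[int]) -> dict[int, Fraction]:
    """Assign probability 1/N to each of N outcomes assumed equally likely."""
    n = len(outcomes)
    return {outcome: Fraction(1, n) for outcome in outcomes}

def prob_of(event: set[int], scheme: dict[int, Fraction]) -> Fraction:
    """Add up the shares of the outcomes that make up the event."""
    return sum((scheme[o] for o in event), Fraction(0))

die = equal_shares([1, 2, 3, 4, 5, 6])
print(die[3])                      # 1/6
print(prob_of({1, 2, 3, 4}, die))  # 2/3
```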

  8. What does ‘equally likely’ mean?

After much running around in metaphorical circles, I have come to the tentative conclusion that ‘likely’ is a concept that is fundamental to how we interpret the world, so fundamental that it transcends language. It cannot be defined. There are other words like this, but not many. Most words are defined in terms of other words, but in order to avoid the whole system becoming circular, there must be some words that are taken as understood without definition – language has to start somewhere. Other examples might be ‘feel’, ‘think’ and ‘happy’. We assume that others know what is meant by each of these words, or a synonym thereof, and if they don’t then communication is simply impossible on any subject that touches on the concept.

Or perhaps ‘likely’ and ‘expect’ may be best related to a (perhaps) more fundamental concept, which is that of ‘confidence’, and its almost-antonym ‘surprise’. Something is ‘likely’ if we are confident – but not necessarily certain – that it will happen, which is to say that we would be somewhat surprised – but not necessarily dumbfounded – if it did not happen. I think the twin notions of confidence and surprise may be fundamental because even weeks-old babies seem to understand surprise. The game of peek-a-boo relies on it entirely.

Once we have these concepts, I think we may be able to bootstrap the entire probability project. The six imaginable dice roll numbers will be equally likely if we would be very surprised if out of six million rolls, any of the numbers occurred more than two million times, or not at all.
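As a rough illustration, the ‘would we be very surprised?’ test can be sketched in Python. The sketch makes several assumptions: it uses Python’s pseudo-random generator as a stand-in for a real die, it uses 60,000 rolls rather than six million to keep things quick, and it scales the thresholds to match (no number on more than a third of the rolls, and none missing altogether). All the names are illustrative.

```python
import random
from collections import Counter

FACES = range(1, 7)

def surprising(counts: Counter, total_rolls: int) -> bool:
    """True if some face came up on more than a third of rolls, or never."""
    too_many = any(counts[face] > total_rolls / 3 for face in FACES)
    missing = any(counts[face] == 0 for face in FACES)
    return too_many or missing

rng = random.Random(2017)  # fixed seed so the run is repeatable
total = 60_000
counts = Counter(rng.choices(FACES, k=total))

print(surprising(counts, total))               # False for this simulated die
print(surprising(Counter({6: total}), total))  # True: only sixes ever appear
```

Note that the code only detects the surprising patterns. It says nothing about why we would be surprised by them, which is the part that seems to resist analysis.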

There are various frameworks for assigning probabilities to events that are discussed by philosophers thinking about probability. The most popular are

  • the Frequentist framework, which bases the probability of an event on the number of times it has been observed to occur in the past;
  • the Bayesian approach, which starts with an intuitively sensed prior probability, and then adjusts it to take account of subsequent observations using Bayes’ Law; and
  • the Symmetry approach, which argues that events that are similar to one another via some symmetry should have the same probability.
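As a rough illustration of how the three frameworks can give different answers to the same question, here is a small Python sketch for the chance of rolling a six. Everything specific in it is an assumption made for the example: the observed counts (13 sixes in 60 rolls) are invented, and the Bayesian estimate uses a Beta prior with one imagined six and five imagined non-sixes, which is only one of many ways to encode an ‘intuitively sensed’ prior.

```python
from fractions import Fraction

def frequentist_estimate(sixes, rolls):
    # Probability taken as the observed relative frequency.
    return Fraction(sixes, rolls)

def symmetry_estimate(faces=6):
    # Each face is treated as interchangeable, so each gets an equal share.
    return Fraction(1, faces)

def bayesian_estimate(sixes, rolls, prior_sixes=1, prior_others=5):
    # Posterior mean of a Beta(prior_sixes, prior_others) prior after
    # observing `sixes` successes in `rolls` trials (Bayes' Law, conjugate form).
    return Fraction(prior_sixes + sixes, prior_sixes + prior_others + rolls)

# Hypothetical observations, invented purely for illustration.
observed_sixes, observed_rolls = 13, 60

freq = frequentist_estimate(observed_sixes, observed_rolls)
symm = symmetry_estimate()
bayes = bayesian_estimate(observed_sixes, observed_rolls)

print("frequentist:", freq, "=", float(freq))
print("symmetry:   ", symm, "=", float(symm))
print("bayesian:   ", bayes, "=", float(bayes))
```

With these made-up numbers the Bayesian figure lands between the other two: it starts at the symmetric 1/6 and is pulled towards the observed frequency as the rolls accumulate.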

It would make this essay much too long to go into any of these in greater detail. But none of them lay out a complete method. I suspect they all have a role to play in how we intuitively sense probabilities of certain simple events. But I feel that there is still some fundamental, unanalysable concept of confidence vs surprise that is needed to cover the gaps left by the large vague areas in each framework.

Here is one last question to consider:

  1. A surgeon tells a parent that their three-year-old daughter, who is in a coma with internal abdominal bleeding following a car accident, has a 98% chance of a successful outcome of the operation, with complete recovery of health. In the light of the above discussion, it seems that nobody can explain what that 98% means. Yet despite the lack of any explicable meaning, the parent is so relieved that they dissolve in tears. Why?

Andrew Kirk

Bondi Junction, January 2017


I don’t believe in reincarnation in the sense that I could be (unwittingly) the reincarnated soul of Marie Antoinette, but I think that there may be a germ of insight, perhaps even wisdom, in reincarnation myths.

There, I’ve said it. I’ve probably lost half my small readership right there. Let me try to explain, before I lose the other half. It’s not as bad as you think.

‘Here’s the thing’, as I am told young people say these days:

I am very taken by David Hume’s views on the self (as I am by many of Hume’s ideas). He was unable to find that he had any persistent self, no matter how hard he introspected (is that a word?). All he could find was ‘bundles of perceptions’. There is no perceptible separate watcher – a homunculus sitting in an armchair, as it were – watching those perceptions on a High Definition screen with SurroundSound. The perceptions just happen. And they are tied together – identifiable as the perceptions of David Hume – by occurring in the presence of the memories of the physical human body that bears that name.

There is a continuity to the stream of perceptions. They succeed one another, blend together and overlap. But that lasts only for as long as consciousness does. It is interrupted, usually at least once a day, by sleep, anaesthesia or concussion.

We say that we ‘return to consciousness’ but really it is not a return but rather a completely new stream of consciousness. The only connection to the previous one is that it occurs in association with the same human body, and hence that it has essentially the same set of memories.

We do not remember returning to consciousness. Or at least I don’t. Daniel Dennett explains this nicely in relation to peripheral vision. He says that we can’t perceive the boundary of our visual field (try it!) because to perceive a boundary we need to be able to see both sides of the boundary and, by definition, we can’t see the far side of the boundary of our visual field. Similarly, we cannot perceive the instant of regaining consciousness because to do so would require our being conscious of not being conscious immediately before waking up, and that is a contradiction. This only applies to dreamless sleep because when we wake from a dream we were conscious on both sides of the boundary, and we quickly realise that what went before was a dream.

So in a sense, the world is just full of streams of consciousness, each made up of a series of overlapping sensations and thoughts, with most streams lasting no longer than about sixteen hours. We can, if we wish, group those streams of consciousness based on the human body with which the stream is associated, but that grouping is fairly arbitrary. We could just as well have grouped them by the day on which they commenced, by length, or by mood.

Well, perhaps it’s not entirely arbitrary. Apart from memory and a shared body, there is one other thing tying a body’s streams of consciousness together, and that is that each stream cares very much about future streams that will be associated with that body. So Tom, as he goes to bed, cares more that tomorrow he has to wake up 15 minutes earlier to get to an 8:30 meeting at work than he does that Rajesh in Mumbai is going into hospital for a triple bypass operation, even though the stream of consciousness that is Tom-today is as distinct from Tom-tomorrow as it is from Rajesh-tomorrow. This chauvinistic, body-centric caring is easily explicable by evolution. Animals that cared about their future states of consciousness – particularly about whether the animal would be healthy and happy in future – survived better than animals that did not. We can’t fight it. That’s just the way our nervous systems are configured. But neither can we draw any metaphysical conclusions about the existence of some spooky continuous self or ‘soul’ from it.

If one is a Cartesian Dualist, one believes that there is a ‘soul’ attached to a body, that is non-physical – whatever that means. Although Dualism was the predominant metaphysical view for the last few millennia, it appears to be a minority view now. One can be an Immaterialist – denying the existence of matter and asserting that everything is mental, or one can be a Materialist – asserting that minds are just physical phenomena that we don’t properly understand yet. But either way, most people are Monists – meaning that they believe the world is basically only made of one fundamental kind of ‘stuff’. I feel quite fond of Dualism, if only because it is quaint, old-fashioned and a minority view – which is always attractive to me (which is why I’m typing this with a non-Microsoft word processor on a non-Microsoft, non-Apple operating system). But try as I might I just can’t believe it, so I’m afraid I’ll have to leave it aside and plough on with my Monist biases.

What about before we were conceived then? Nobody seems to feel any big deal about the fact that there are no streams of consciousness associated with their body before they were conceived. I wasn’t conscious then, so I wasn’t around to notice the fact that I wasn’t conscious. Nor can I identify my first conscious moment, probably because of the Dennettian boundary problem already mentioned. I suspect that ‘my’ body gradually attained consciousness, and gradually attained memory, over the first months or years of ‘its’ life.

I feel similarly about what will happen when this body dies. Since I don’t believe in a Christian, Islamic, Valhallian or Olympian after-life, I think that there will simply be no subsequent streams of consciousness associated with this body, and no streams of consciousness that share memories with streams of consciousness of this body. It’s Just As Well really, because after a few years, the body will have been gobbled up by worms and/or fish and/or bacteria and there will be no body left with which streams of consciousness could associate themselves.

And yet….

And yet…. there is something in being human that makes it almost impossible to comprehend that the consciousness of this body will cease forever. Perhaps it’s an evolutionary advantage to feel that, or maybe it’s just random. But it’s there, and I think that that feeling accounts for why nearly all cultures have developed some sort of after-life mythology.

Some deny the cessation by believing in an after-life – a continuation of the ‘same’ consciousness. It’s by no means obvious what ‘the same’ means here. My guess is that it means there will be future streams of consciousness that share memories with the body’s pre-death streams of consciousness. Some deny the cessation of consciousness, or at least mortality, by considering their children or grandchildren to be continuations of themselves. Others deny it by looking at their achievements – their legacy to the human race.

Here’s my answer:

After the death of this body, ‘I’ will still be conscious because every consciousness is an ‘I’. In other words, ‘my’ consciousness won’t cease because at any point in time, all those that are conscious will be conscious, and all those consciousnesses are ‘mine’ because every stream of consciousness is of a ‘me’.

‘My’ streams of consciousness don’t stop happening. All that stops is that there are no more streams of consciousness associated with this particular body, and this set of memories. So – and here’s the wibbly-woo, new-agey bit – ‘I’ become those other streams of consciousness, because they are all ‘I’. We were never really separate, it’s just that each individual stream of consciousness is locked in its own perspective for as long as it lasts – sixteen hours or so.

There are all sorts of metaphors one could use for this, and they’re all wacky, but they have to be, since we are dealing with the indescribable. One I like is the idea of consciousness as some sort of fluid that is subject to conservation laws in the same way as energy, momentum, angular momentum, electric charge and matter. So whenever a stream of consciousness ends, because of sleep, death or whatever, the amount of consciousness it contains is released and flows into other streams. It’s a metaphor, all right (!?!), so don’t go reaching for those scientific instruments or ectoplasm-detectors or whatever they had in Ghostbusters to try to catch and measure this fluid.

Another metaphor is that in a sense ‘I’ am imprisoned in my own consciousness, unable to perceive what another perceives, no matter how close I am to them. When my stream of consciousness ends – usually around 11:15pm – ‘I’ am set free and can become someone else – another ‘I’. For some reason I visualise a bird – probably a dove (how twee) – flying out from a cage whose door has been opened.

It is key to this perspective that consciousness is fungible, not hypothecated (after all what’s the earthly use of studying finance if you can’t insert technical financial terms at strategic points in a philosophical discourse, just to show off). In other words it’s like money. We can no more say that the consciousness from my stream of 29 May 2015 became that of Elton John on 30 May 2015 than we can say that my deposit in the bank paid for part of a particular customer’s home loan. That dismisses the possibility of my being Marie Antoinette right off the bat.

But just as all of a bank’s liabilities fund all of its assets, the consciousness that is liberated when I go to sleep tonight will replenish the consciousness of all streams that are going at that time. So I am connected to Marie Antoinette not because her consciousness – as a discrete entity – became specifically mine (with many other users in the 200 years between), but because we all share in the same cosmic pool of consciousness, that is particular to no body, and is drawn upon and supplemented billions of times per day as streams commence and end, be it by sleep, waking, death, birth, fainting, or other cause.

In that sense, ‘I’ am Mahatma Gandhi, John Lennon, Paul McCartney, Elvis Presley, Adolf Hitler, Charles Manson, Florence Nightingale, Elie Wiesel, Hypatia of Alexandria, Lucretia Borgia, George Best, Babe Ruth, Don Bradman, Peter Paul Rubens, Ludwig van Beethoven, Albert Einstein, John Cleese, Graham Chapman and, more importantly – many billions of other less famous people – clever, challenged, creative, dull, kind, cruel, indifferent, confident, shy. ‘I’ may be dogs and bandicoots and other animals too. But that’s the subject of another essay.

Arguably, a problem with this perspective is that consciousness will not persist indefinitely – at least not in this universe. We can be pretty confident that when the universe finally approaches heat death, no life will remain. So where does the consciousness go then? Well, that’s where the whole idea’s being a metaphor comes in handy. One great thing about metaphors is that you can drop them when something doesn’t fit, and pick them up again a little later. No metaphor fits every situation, because if it did, it wouldn’t be a metaphor (it would be the thing itself). So we drop it and think of something else, just as Shakespeare did when he realised that seas don’t generally fire arrows at you. Oh, no wait….

But why bother with a metaphor at all?

One might object that it’s silly to use a metaphor to orient oneself towards experience, especially when one knows that the metaphor will fail in some instances. My response to that is that every single one of our beliefs is a metaphor, and fails in some instances.

I tell myself I am sitting on a stool in front of a table to type this. The stool is solid and brown and the table is solid and purple. Yet that’s all metaphor too. The atomic theory tells us that what I’m sitting on is mostly empty space, and has no intrinsic colour. It has no integrity either, as it is constantly exchanging particles with its surroundings. But that too is only a metaphor, as quantum mechanics casts doubt on the whole notion of persistent particles, and who knows what even weirder theory will replace quantum mechanics and reveal it to be the crude metaphor that it undoubtedly is. It’s turtles all the way down, and there’s no reason to suppose that there’s a bottom.

Metaphors are neither true nor false, but they can be useful. We are story-telling animals, and stories – aka Metaphors – are the only way we can make any sense of life. They give it a shape that we can handle. Quantum mechanics is a useful metaphor if we want to make a laser (but not if we want to explain a black hole), and my metaphorical idea of this stool is useful if I want to have the experience that I call ‘sitting down’. So my metaphor of consciousness as a shared, universal, substance is useful to me if I want to think about inconceivable issues such as the non-existence of a persistent self, the lack of any conscious processes of this body before it was conceived and after it dies, and the relationship of all we people, and other animals, to one another.

Metaphors are also sometimes called myths, and they are just as good when they have that name.

Is this all just avoidance?

I can’t help pre-empting criticisms. It’s a vicious habit I picked up, I don’t know when but a long time ago. The wisdom of the ages says don’t bother, because it makes one’s writing longer, more complex, disjointed, ugly and harder to read. And critics rarely pay attention to one’s pre-emptions anyway. I can write “most dogs have fur that causes allergies in some people, but poodles don’t”, and some eager person will still sometimes respond “aha, but what about poodles? Got you there!”.

But since, like many people, I am my own worst critic I can’t help the odd pre-emption (of my own self-criticism), so I’ll allow myself one (or is it two? Did I already do one? We addicts are hopeless). Here it is.

Isn’t this all just some pathetic attempt to rationalise one’s way out of a fear of death by postulating some ridiculous Universal Consciousness? Why not just admit that when a body dies, it has no more conscious experiences, and that’s that?

Well Andrew (I reply), I’m glad you asked that question. Firstly I’d just like to observe that I did already say that (I believe) a dead body has no more conscious experiences, and there will be no more conscious experiences that have any memories of experiences that the body had. So this myth/metaphor doesn’t seek to deny or avoid that.

Nor is the myth relevant to fear of death, at least not for me. I used to fear death when I believed in a personal after-life, because I feared the punishments that had been threatened in that after-life if I didn’t conform to the strict expectations laid out in a rather large book of unrealistic rules. In fact I even feared the alternative of being ‘rewarded’ with eternal happiness, because I was convinced that no matter what treats and delights that reward comprised, I would be excruciatingly and agonisingly bored within a few billion years. But once I ceased to believe in an after-life, I ceased to believe in the possibility of such punishments, and hence I ceased to fear death. That is different of course from the fear of how one gets there (‘dying’), as I imagine that being squashed under the wheels of a Land Rover or being eaten by enraged Koalas is rather uncomfortable, albeit only for a short while.

No, the purpose of the myth, as far as I understand it, is twofold: first to escape the niggardly narrowness of the first-person perspective that is imposed on us by our bodily structure; second to open up possibilities for contemplating the mystery of consciousness, a phenomenon that no amount of scientific investigation seems ever likely to be able to explain. Given how mysterious and indefinable consciousness is (as opposed to mere brain activity that interprets sensory data, processes information and generates physical actions including speech), how unnecessary to the evolutionary account of the human brain it is, and how we (ie David Hume and I) are unable to detect any subject (‘self’) of this consciousness, it appears less ridiculous to me to regard consciousness as something primal, something universal that transcends individual bodies, than as an inexplicable phenomenon that arises in association with lumps of meat that are configured in just the right way.

Does that sound like a Humph! ? It wasn’t meant to. Ah well, if it is so, let it be so.

Marie Antoinette, 16 October 1793.


Metaphysics as a creative craft

In my writings I have not infrequently been dismissive of metaphysics, arguing that most metaphysical claims are meaningless, unfalsifiable, and of no consequence to people’s lives (leaving aside the unfortunate historical fact that many people have been burned at the stake for believing metaphysical claims that others disliked).

Perhaps it is time to relent a little – to give the metaphysicians a little praise. At least I will try. The basis for this attempt is a re-framing of what metaphysics is about. Instead of thinking of it as a quasi-scientific activity of trying to work out ‘what the world is like’, perhaps we could instead think of it as a creative, artistic activity, of inventing new ways of thinking and feeling about the world. Metaphysics as a craft, as delightful and uncontentious as quilting.

Why would anybody want to do that? Well I can think of a couple of reasons, and here they are (except that, like the chief weapons of Python’s Spanish Inquisitor, the number of reasons may turn out to be either more or less than two).

We know that there is a very wide range of human temperaments, longings, fears and attachments. A perspective that is inspiring to one person may be terrifying to another, and morbidly depressing to a third. For instance some people long to believe in a personal God that oversees the universe, and would feel their life to be empty and meaningless without it. Others regard the idea with horror. Some people are very attached to the idea that matter – atoms, quarks and the like – really, truly exists rather than just being a conceptual model we use to make sense of our experiences. Philosophical Idealists (more accurately referred to as Immaterialists) have no emotional need for such beliefs, and accordingly deny the existence of matter, saying that only minds and ideas are real. Indeed some, such as George Berkeley, regard belief in matter as tantamount to heresy, which is why the subtitle of his tract ‘Three dialogues between Hylas and Philonous’, which promoted his Immaterialist hypothesis, was ‘In opposition to sceptics and atheists’.

So the wider the range of available metaphysical hypotheses, the more chance that any given person will be able to find one that satisfies them, and hence be able to live a life of satisfaction, free of existential terror. Unless of course what they really long for is existential terror, in which case Kierkegaard may have a metaphysical hypothesis that they would love.

One might wonder – ‘why do we need metaphysical hypotheses, when we have science?’ The plain answer to this is ‘we don’t’. But although we do not need them, it is human nature to seek out and adopt them. That’s because, correctly considered, science tells us not ‘the way the world is’, but rather, what we may expect from the world. A scientific theory is a model that enables us to make predictions about what we will experience in the future – for instance whether we will feel the warmth of the sun tomorrow, and whether if we drop an apple we will see it fall. Scientific theories may seem to say that the world is made of quarks, or spacetime, or wave functions, but they actually say no such thing. What they say is, if you imagine a system that behaves according to the following rules – which might be rules about subatomic particles like quarks – and you observe certain phenomena (such as my letting go of the apple), then the behaviour of that imaginary system can guide you as to what you will see next (such as the apple falling to the ground).

It’s just as well that scientific theories say nothing about ‘the way the world is’, because they get discarded every few decades and replaced by new ones. The system described by the new theory may be completely different from that described by the previous one. For instance the new one may be all about waves while the previous one was all about tiny particles like billiard balls (electrons, protons and neutrons in the Rutherford model of the atom). But most of the predictions of the two theories will be identical. Indeed, if the old theory was a good one, it will only be in very unusual conditions that it makes different predictions from those of the new theory (eg if the things being considered are very small, very heavy or very fast). So by recognising that scientific theories are descriptions of imaginary systems that allow us to make predictions, rather than statements about the way the world is, we get much greater continuity in our understanding of the world, because not much changes when a theory is replaced.

I think of metaphysics as the activity of constructing models of the world (‘worldviews’) that contain more detail and structure than there is in the models of science. We do not need the more detailed models of metaphysics for our everyday life. Science gives us everything we need to survive. But, being naturally curious creatures, we tend to want to know what lies behind the observations we make, including the observations of scientific ‘laws’. So we speculate – that the world is made of atoms like billiard balls, or strings, or (mem’)branes, or a wave function, or a squishy-wishy four-dimensional block of ‘spacetime’, or quantum foam, or ideas, or noumena, or angels, demons, djinn and deities. This speculation leads to different mental models of the world.

So metaphysics adds additional detail to our picture of the world. Some suggest that it also adds an answer to the ‘why?’ question that science ignores (focusing only on ‘how?’). I reject that suggestion. As anybody knows that has ever as a child tried to rile a parent with the ‘but why?’ game, and as anybody that has been thus riled by a child knows, any explanation at all can be questioned with a ‘but why?’ question. No matter how many layers of complexity we add to our model, each layer explaining the layer above it, we can always ask about the lowest layer – ‘but why?’ Whether that last layer is God, or quarks, or strings, or the Great Green Arkleseizure, or even Max Tegmark’s Mathematical Universe, one can still demand an explanation of that layer. By the way, my favourite answers to the ‘But why?’ question are (1) Just because, (2) Nobody knows and (3) Why not? They’re all equally valid but I like (3) the best.

Some of these mental models have strong emotional significance, despite having no physical significance. For instance strong solipsism – the belief that I am the only conscious being – tends to frighten people and make them feel lonely. So most people, including me, reject it, even though it is perfectly consistent with science. Some people get great comfort from metaphysical models containing a god. Others find metaphysical models without gods much more pleasant.

So I would say that metaphysics, while physically unnecessary, is something that most people cannot help doing to some extent, and that people often develop emotional attachments to particular metaphysical models.

Good metaphysics is a creative activity. It is the craft of inventing new models. The more models there are, the more people have to choose from. Since there are such great psychological and emotional differences between people, one needs a great variety of models if everybody that wants a model is to be able to find a model with which they can be comfortable.

Bad metaphysics (of which there is a great deal in the world of philosophy) is trying to prove that one’s model is the correct one. I call this bad because there is no reason to believe that there is such a thing as ‘the correct model’ and even if there were one, we’d have no way of finding out what it is. There can be ‘wrong’ models, in the sense that most people would consider a model wrong if it is logically inconsistent (ie generates contradictions). But there are a myriad of non-contradictory models, so there is no evidence that there is such a thing as ‘the right model’. Unfortunately, it appears that most published metaphysics is of this sort, rather than the good stuff.

It’s worth noting that speculative science is also metaphysics. By ‘speculative science’ I mean activities like string theory or interpretations of quantum mechanics. I favour Karl Popper’s test for whether a model is (non-speculative) science, which is whether it can make predictions that will falsify the model if they do not come true. A model that is metaphysical can move into the domain of science if somebody invents a way of using it to make falsifiable predictions. Metaphysical models have done this in the past. A famous example is the ‘luminiferous aether’ theory, which was finally tested and falsified in the Michelson-Morley experiment of 1887. Maybe one day string theorists will be able to develop some falsifiable predictions from the over-arching string theory model[i] that will move it from the realm of metaphysics to either accepted (if the prediction succeeds) or discarded (if the prediction fails) science. However some metaphysical models seem unlikely to ever become science, as one cannot imagine how they could ever be tested. The debate of Idealism vs Materialism (George Berkeley vs GE Moore) is an example of this.

So I hereby give my applause to (some) metaphysicians. Some people look at philosophy and say it has failed because it has not whittled down worldviews to a single accepted possibility. They say that after three millennia it still has not ‘reached a conclusion’ about which is the correct worldview. I ask ‘why do you desire a conclusion?’ My contrary position is to regard the proliferation of possibilities, the generation of countless new worldviews, as the true value of metaphysics. The more worldviews the better. Philosophy academics working in metaphysics should have their performance assessed based not on papers published but on how many new worldviews they have invented, and how evocatively they have described them to a thirsty and variety-seeking public. Theologians could get in on the act too, and some of the good ones (a minority) do. Rather than trotting out dreary, flawed proofs of the existence of God, the historicity of the resurrection, or why God really does get very cross if consenting grown-ups play with one another’s private parts, they could be generating creative, inspiring narratives about what God might be like and what our relationship to the God might be. They could manufacture a panoply of God mythologies, one to appeal to every single, unique one of us seven billion citizens of this planet. Some of us prefer a metaphysical worldview without a God, but that’s OK, because if the philosopher metaphysicians do their job properly, there will be millions of those to choose from as well. Nihilists can abstain from all worldviews, and flibbertigibbets like me can hop promiscuously from one worldview to another as the mood takes them.

We need more creative, nutty, imaginative, inspiring metaphysicians like Nietzsche, Sartre, Simone Weil and Soren Kierkegaard, not more dry, dogmatic dons that seek to evangelise their own pet worldview to the point of its becoming as ubiquitous as soccer.

Andrew Kirk

Bondi Junction, January 2015

i. Not just a prediction of one of the thousands of sub-models. Falsifying a sub-model of string theory is useless, as there will always be thousands more candidates.

What is a cause? Trying to distil clarity from a very muddy concept


Scene 1: So! – shrieked the evil monocled Gestapo officer. Eef you do not tell me ze name off ze leader off your resistance group, I vill shoot zis prisoner. Make your choice keffully! Do you vish to be ze cause off ze death off zis poor eenocent civilian?

Fade to scene 2: And now m’lud, intoned the imposing barrister, as you have just heard, if the defendant had correctly diagnosed the plaintiff’s stomach pain as a torsion of the testicle rather than prescribing antacid tablets, the testicle could have been saved by a simple operation that would have enabled the plaintiff to live the happy, fulfilled sexual life that he so richly deserves. I ask the court to award damages of five million dollars against the defendant for causing this poor man’s loss of sexual function.

Fade to scene 3: Have you found out why my car won’t start? asked Jedediah. Well, I’m not sure, mister, said the mechanic, with a sarcastic look on her face, but it might have something to do with this snake that’s gotten its tail wedged in your starter motor. Mind your hands there, it looks a bit annoyed. Well golly, said Jedediah, who’d’ve thought that a little ol’ critter like that could cause so much trouble?

Three stories, three problems, three causes. Or are they?

If our heroine refuses to name the resistance leader to the Gestapo, will she have caused the civilian’s death? Or will the Gestapo officer have caused it? Or both? Or something else?

Did the doctor really cause the loss of the plaintiff’s testicle, or was it the fact that it managed to twist so as to strangulate the blood supply, or perhaps it was the plaintiff’s genes that gave them a particular anatomy that made them vulnerable to such an occurrence? If the latter then were the plaintiff’s parents the cause of the loss, or should we perhaps blame the person that introduced the parents to one another?

And was the snake really the cause of Jedediah’s car problems, or was it that he’d parked his car in the bush while camping overnight, providing an enticing warm engine for any passing snake to nestle in?

Defining a ‘cause’

The idea of cause and effect is an ingrained part of our language. We all feel that we know what the terms mean. But do we really? The above examples show how it’s not usually possible to point to one thing and say that is the cause of this. We might feel however that, with more care and thought, we will be able to precisely describe what really caused any given event.

The amazing answer is that No, actually we can’t. There is no such thing as a single cause of an event in the way it is traditionally thought of. The purpose of this essay is to examine the idea of cause (and effect) and work out what, if any, meaning we can give to this vague and rubbery, yet ubiquitous concept.

Is a cause necessary? Is it sufficient?

A natural place to start looking for a meaning seems to be to ask whether a cause is a necessary or sufficient condition, or both, for its effect to occur.

None of the suggested causes in the preface are necessary conditions. There are plenty of other ways the civilian could have died, the testicle been lost or the car failed to start. So we can dismiss necessity as a feature of causes straight away.

What about sufficiency? Neither of the suggested causes in the first two stories in the preface is a sufficient condition. The prisoner could have refused to snitch and the Gestapo officer could still have relented and not shot the civilian. The undiagnosed twisted testicle could have untwisted by itself, or another doctor passing five minutes after the defendant misdiagnosed it could have had a look and diagnosed it correctly. The snake is another story though. Having a snake’s tail wedged in your starter motor effectively guarantees that your car will not start. So perhaps some causes are sufficient conditions for their claimed effects. We’ll come back to that later.

Cause as a difference between alternative prior scenarios

If I go to the dentist and ask why my lower right incisor aches, she may find decay in it and say “the cause of your ache is decay in the tooth”. The decay is neither a necessary nor a sufficient condition for the ache. The ache could be psychosomatic with no decay, or there could be decay but a dead nerve, in which case I’d feel no ache.

Yet I know what she means. So what is it that I, and any other dental patient, understands from the dentist’s statement?

I think it is that the situation I am experiencing while sitting in the dentist’s chair, call it situation S1, may be compared with another situation S2, that is identical to S1 in every respect except that there is no decay in the tooth. In neither case do I suffer psychosomatic hallucinations, nor is the tooth’s nerve dead. The only physical difference between the two situations is the decay. If a message takes a nanosecond to travel along a nerve from the tooth to my brain then in the situations one nanosecond later than S1 and S2, call them S1a and S2a, S1a will have me experiencing toothache and S2a will not.

Now the dentist has not explicitly mentioned an alternative situation, but that’s because it’s implied. I naturally interpret her statement as meaning “According to my observations and the biology they taught me at dental school, the key difference, in the toothy-brainy part of your body, between you and somebody very like you that does not have a toothache is that you have decay and they do not”.

We can formalise this idea of a cause with a precise definition:


‘If

    1. S1 and S2 are descriptions of alternative possible states of a system at time t, and
    2. the difference between S1 and S2 is C, and
    3. theory T requires that event E occurs at time t+dt if the system state at time t is S1, and
    4. theory T requires that event E does not occur at time t+dt if the system state at time t is S2,

then C is the cause of E in system state S1 with respect to system state S2, according to theory T.’

Note that conditions 3 and 4 use the concept of sufficiency, raised in the previous section. S1 is sufficient reason for E to occur and S2 is sufficient reason for E to not occur.
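The definition above can be sketched in a few lines of code. This is only an illustration under stated assumptions: the names `comparative_cause` and `toothache_theory`, and the encoding of a system state as a dictionary of true/false facts, are hypothetical, and the toy ‘theory’ is a crude stand-in for the biology of the dentist example.

```python
from typing import Callable, Dict, Optional

State = Dict[str, bool]           # a system state at time t (hypothetical encoding)
Theory = Callable[[State], bool]  # theory T: does event E occur at time t+dt?


def comparative_cause(s1: State, s2: State, theory: Theory) -> Optional[State]:
    """Return the difference C if C is the cause of E in s1 with respect to s2
    according to `theory`; otherwise return None."""
    # Condition 2: C is whatever differs between the two states.
    difference = {k: v for k, v in s1.items() if s2.get(k) != v}
    # Conditions 3 and 4: T requires E after s1 and requires not-E after s2.
    if difference and theory(s1) and not theory(s2):
        return difference
    return None


def toothache_theory(state: State) -> bool:
    """Toy stand-in for 'the biology they taught me at dental school'."""
    decay_ache = state["decay"] and not state["nerve_dead"]
    return decay_ache or state["psychosomatic"]


s1 = {"decay": True, "nerve_dead": False, "psychosomatic": False}
s2 = {"decay": False, "nerve_dead": False, "psychosomatic": False}
s_dead = {"decay": True, "nerve_dead": True, "psychosomatic": False}

print(comparative_cause(s1, s2, toothache_theory))      # {'decay': True}
print(comparative_cause(s1, s_dead, toothache_theory))  # {'nerve_dead': False}
```

The second call shows why the comparison state matters: measured against a patient with decay and a dead nerve, the cause of my ache comes out as my nerve being alive, not the decay.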

People rarely, if ever, refer to two alternative system states when saying something is a cause. Usually, as with the dentist, the natural choice for S2 is evident and need not be stated. But it is useful to remember that there is nearly always an implied comparison state S2 when we talk about causes. Whenever controversial or confusing claims are made about causality, as happens so often in litigation, politics and philosophy in particular, it can help enormously if we analyse the claim by trying to identify what the implied comparison state is.

Do we really need to say ‘according to theory T’?

The appendage to the definition – ‘according to theory T’ – might seem superfluous and annoying to some. After all, people don’t usually quote a theory when they say that pricking the balloon with a needle caused it to burst. Nevertheless, just like the comparison state, a theory is always there. In the case of the balloon, the theory is Physics, as taught at modern universities. Training in Physics up to third-year university would provide all the understanding needed to explain the pop of the balloon.

Looking at the dentist example, we see that our interpretation of her diagnosis does include reference to a theory, viz: ‘according to … the biology they taught me at dental school’.

Now we might imagine that both Physics and Biology are just parts of a Grand Theory of Everything, of which science has so far only discovered a portion. If that were so, then we could leave off the appendage to our description of a cause, and just imply that the theory we mean is the Grand Theory of Everything.

But although some might find the Grand Theory of Everything a nice idea, and wish there really were one out there, we have no reason to suppose there is. I discuss this further in my essay ‘Some random thoughts on whether the world is random’. The conclusion is that, unless we are prepared to regard an enormous list of everything that ever happens in the universe as a theory of everything (which most people wouldn’t) there is no way to decide what sort of a collection of statements could qualify as such a theory. Is there a word limit? Does the collection have to be finite? Does it have to be expressible in English? Does it have to be comprehensible by an intelligent human?

In addition, as I argue in my essay ‘Replacing Truth with Reason’, there may not even be any ultimate description of the universe. Our scientific advances may lead to increasingly more complicated theories that, while intriguing, exciting and pragmatically useful, never converge to a final, stable, ultimate theory. Perhaps the universe is too complicated to be described by any theory.

So we will have to put up with the appendage for the time being. Devout Platonists may wish to assume that there is a Grand Theory of Everything, and omit the appendage, implying that T is that Grand Theory. But that is an act of faith that I do not feel inclined to emulate.

It does however seem reasonable to omit the appendage when conversing in the vernacular, if our implication is understood to be not that T is the Grand Theory of Everything, but that it is “Science as taught at universities, in the year in which we are speaking”. I will call this Science 2013, as that is the year in which I am writing. This ties the use of ‘cause’ to a sense of what the best scientists in the world currently understand about how the world works, and that seems to me to pretty accurately reflect how the person in the street would understand the term ‘cause’.

When discoursing philosophically though, as in this essay, it will be wise to retain the appendage specifying the reference theory, in order to be clear.

Can we define cause without a comparison state?

Some scenarios in which we might like to talk of causes do not naturally suggest comparison states. We might for instance consider the Cosmic Microwave Background Radiation (CMBR) that suffuses the sky, which is left over from the ‘last scattering surface’ of the Big Bang. We want to say that the Big Bang caused the CMBR. But we are stymied by the fact that we cannot think of an alternative situation with no CMBR. That situation would have to have no Big Bang, and hence possibly no spacetime, and hence no place in which to observe the lack of CMBR.

Here is an alternative definition of ‘cause’ that solves that problem.

‘If S is a description of a physical system at time t and theory T requires that event E occurs at time t+dt if the system was in state S at time t, then we say that S is the cause of E in system state S, according to theory T.’

In most situations this definition will be useless, because it requires a full description of the system state at the prior time. In order for E to be inevitable, that will have to be something like the location, momentum, type and spin of every particle within radius c·dt of the location of E (c is the speed of light) at time t. That is way too much information for everyday use. It’s a bit like saying ‘everything’ is the cause of E. But it may be useful to have this definition available as an alternative if we want to talk about causality in relation to situations that don’t have natural comparison scenarios.

In order to distinguish our two definitions of cause we’ll call the first one the Comparative Definition and the second one the Singular Definition. If we don’t specify, we’ll mean the Comparative Definition, because that’s the one we are most likely to mean.

Looking back at the snake’s tail story, we can see that it meets the definition of a Singular cause of the engine not starting, if the tail is still wedged in the starter motor when the electric current unleashed by the ignition key hits the coils in the motor. If the time the current hits the coils is t, then we can say that the configuration of a spherical region of space with radius 10cm centred at the middle of the starter motor is the cause of the engine not commencing to fire at t+3.3×10⁻¹⁰ seconds (the time light takes to travel 10cm), and that region includes the wedged snake’s tail.

A Singular cause is always sufficient for its effect, but the price we pay for that sufficiency is that the cause either has to be a complete description of the state of an enormous volume or, as is the case with the snake’s tail, the effect must occur a very tiny interval of time after the cause (a third of a nanosecond here).
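The trade-off between volume and time interval is simple arithmetic on the radius c·dt. The sketch below checks the third-of-a-nanosecond figure for the 10cm region and shows how quickly the region grows for longer intervals. The function names are hypothetical, chosen only for this illustration.

```python
C = 299_792_458.0  # speed of light in metres per second


def light_travel_time(radius_m: float) -> float:
    """Seconds light takes to cross a distance of `radius_m` metres."""
    return radius_m / C


def causal_radius(dt_s: float) -> float:
    """Radius c*dt (in metres) of the region that can influence an event dt_s seconds later."""
    return C * dt_s


print(f"{light_travel_time(0.10):.2e} s")     # 3.34e-10 s
print(f"{causal_radius(1.0) / 1000:.0f} km")  # 299792 km
```

So a Singular cause that precedes its effect by a full second must describe a sphere with a radius of roughly 300,000 kilometres.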

Causes must be prior to their effects

The two definitions I have suggested require a cause to be earlier than its effect, which we call being ‘temporally prior’. Sometimes people talk of causes that are not temporally prior, so we should consider whether that can make sense. There are two common ways people do this.

‘Simultaneous’ causes

Some people give examples of what they think are physical causes that are simultaneous with their physical effects. They all turn out, however, to be based on a misunderstanding of physics. There is a very simple reason why one physical event cannot cause another that happens at the same time, and that is the theory of relativity, which implies that physical influences cannot travel faster than the speed of light. For event E1 at time t to affect event E2, also at time t, would require the influence of E1 to travel the distance between the two locations in no time at all, that is, at an infinite speed, which would break the speed limit and irritate the Great Cosmic Traffic Cop.

Examples offered of putative simultaneous causes are

  • a ball (cause) sitting on a pillow and causing a depression (effect), or
  • pushing one end of a lever down (cause) so the other end goes up (effect).

It is not the ball’s presence at time t that causes the depression in the pillow at time t, but the ball’s presence at earlier times. We can see this by imagining the ball suddenly magically pouffing out of existence. The pillow would not instantly regain shape. Rather it would start to spring back to its original, undepressed shape. If the ball were present on the pillow up to time t and instantly then disappeared, the shape of the pillow at time t would be exactly the same as if the ball were still there. The depression would gradually disappear as the pillow started to regain its usual shape after time t. In the real, non-Harry Potter world, change takes time.

Similarly, the footpath of a bridge does not stay up because its supporting beams are there, but because those beams were there an instant earlier.

When we push down one end of a lever, the other end does not instantly lift. Rather, a shock wave travels through the lever, deforming it in such a way that, a tiny instant of time later, the other end lifts. The shock wave travels at the speed of sound in the lever, which will be very fast indeed if it is made of a stiff substance like steel, but still much slower than light. Because the wave is so fast, we cannot perceive it without specialised equipment, so the effect seems instantaneous. If we had a fast enough camera, we might even be able to film the deformation of the lever as the shock wave passes through. But we’d need an enormous enlargement of the frames to see the lever’s deformation in the film, because the shock wave of the initial push has probably reached the other end before the end we are pushing has moved a millimetre.
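To get a feel for the size of the delay, here is a rough calculation. The figures are assumptions for illustration only: a one-metre lever, and a round value of 5,000 metres per second for the speed of sound in a steel bar (the true value depends on the alloy and the shape of the bar).

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second
SPEED_OF_SOUND_STEEL = 5_000.0  # metres per second, assumed round figure
LEVER_LENGTH = 1.0              # metres, hypothetical lever


def transit_time(length_m: float, speed_m_per_s: float) -> float:
    """Seconds a signal at the given speed takes to travel the given length."""
    return length_m / speed_m_per_s


sound_delay = transit_time(LEVER_LENGTH, SPEED_OF_SOUND_STEEL)
light_delay = transit_time(LEVER_LENGTH, SPEED_OF_LIGHT)

print(f"sound: {sound_delay * 1e6:.0f} microseconds")  # sound: 200 microseconds
print(f"light: {light_delay * 1e9:.1f} nanoseconds")   # light: 3.3 nanoseconds
```

Under these assumptions the far end of the lever starts to respond about a fifth of a millisecond after the push: far too quick to notice, but tens of thousands of times slower than light, and certainly not instantaneous.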

Readers who are familiar with the Quantum Mechanical phenomenon of entangled particles might hope for a loophole in the cosmic speed limit via the fact that, when one member of a pair of entangled particles is measured, the wave function collapses and the other member attains a definite value of the measured quantity.

This ‘spooky action at a distance’ as Einstein called it, does not however break the speed limit, because no physical influence is being transmitted. The wave function is simply a mathematical abstraction we use in Quantum Mechanics to make predictions and its collapse has no physical significance. In particular, there is no experiment we can do to find out whether the wave function of a particle has already collapsed. It will collapse when we make the measurement in the experiment, but that cannot tell us whether it had already collapsed before that.

So in summary, there is no escape from the cosmic speed limit, and hence there is no such thing as a simultaneous physical cause.

‘Logically prior’ causes

Another way people try to escape the need for temporal priority is to talk of a cause as something ‘non-physical’ that entails its effect via the laws of logic rather than of science. They could for instance say that the rules of arithmetic are the cause of 2+2 equalling 4, or that the fact that all men are mortal and Socrates is a man is the cause of Socrates being mortal.

This could be formalised by saying that if A→B where A and B are propositions and → denotes logical entailment (if the proposition before the arrow is true then the proposition after the arrow must be true) then A is the cause of B. Let’s call it a Logical Cause to distinguish it from the Comparative and Singular definitions of causes that we discussed above. In this context only, we will refer to causes meeting those definitions as ‘physical’ causes. Defining ‘physical’ is usually a controversial mess. But here all we mean by ‘physical cause’ is a cause that satisfies our Comparative or Singular Definitions.

There’s nothing incoherent about defining logical causes this way. No contradictions or ambiguities arise. The trouble is just that it’s a completely different use of the term cause from how it is used in relation to everyday physical things, so one cannot apply any conclusions drawn about physical causes to logical causes, or vice versa.

Further, there is already a perfectly good word in use within the field of symbolic logic for a logical cause. It’s called an antecedent. And the thing coming after the arrow is called a consequent.

So all we achieve by using ‘cause’ in this context is confusion, by applying a word that has a meaning in a different, completely unrelated field (the physical) to a concept that already has a perfectly clear label in this field.

Readers should beware of arguments that try to use logical causes. Such arguments might use words like ‘now consider causes that are logically prior rather than temporally prior to their effects’. The only reason I can think of to use the word ‘cause’ for a logical antecedent is to try to smuggle in some of the properties of physical causes and apply them to logical causes, without the validity of doing that being challenged. As logical and physical causes have no relation to one another, other than in a vague, touchy-feely sort of way, it is invalid to apply any properties of physical causes to logical causes.

Sorting out which event is the cause and which is the effect

Another problem of not requiring causes to be temporally prior is that it creates ambiguity as to which of the two events is the cause and which is the effect. In the physical case, this is clearly resolved by requiring a cause to be earlier than its effect. We lose that capacity if we don’t require temporal priority.

In the logical case, if we have A→B but not B→A then we can say, if we wish, that A is a Logical Cause and B is its logical effect. But if we have both A→B and B→A then there is no basis for saying one of A and B is the cause and the other is the effect. We will see in the next section how this can lead to grief.
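The asymmetry test just described can be checked mechanically for simple propositions. The sketch below is an illustration only: it assumes propositions built from two atomic statements p and q, and the names `entails` and `logical_priority` are hypothetical.

```python
from itertools import product
from typing import Callable

Prop = Callable[[bool, bool], bool]  # a proposition over two atomic statements p and q


def entails(a: Prop, b: Prop) -> bool:
    """A -> B holds if every assignment that makes A true also makes B true."""
    return all(b(p, q) for p, q in product([False, True], repeat=2) if a(p, q))


def logical_priority(a: Prop, b: Prop) -> str:
    """Report which of A and B, if either, can be called the antecedent."""
    forward, backward = entails(a, b), entails(b, a)
    if forward and backward:
        return "symmetric: neither is prior"
    if forward:
        return "A is the antecedent"
    if backward:
        return "B is the antecedent"
    return "no entailment"


# 'p and q' entails 'p', but not the other way round.
print(logical_priority(lambda p, q: p and q, lambda p, q: p))
# A is the antecedent

# 'not (p and q)' and '(not p) or (not q)' entail each other.
print(logical_priority(lambda p, q: not (p and q), lambda p, q: (not p) or (not q)))
# symmetric: neither is prior
```

In the symmetric case there is nothing in the logic itself that could single out one proposition as the ‘cause’.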

Examples of the use of our definitions

Let’s try out our two definitions – Comparative Cause and Singular Cause – in a few situations where the word ‘cause’ is key to the thinking processes, to see how they fare.

Causation in philosophy

More than 2000 years ago Aristotle thought and wrote about causation, in a way that has been adopted by many philosophers since then. He listed four types of cause, of which only one, the Efficient Cause, is close to the way the term is typically used now. Unfortunately, even the notion of an Efficient Cause is bound up with Aristotle’s ideas about physics which, being pre-Newtonian, are incompatible with the way we now understand the world to work.

Nevertheless, philosophers still blithely make arguments using the word ‘cause’, only rarely pausing to consider what if anything the word actually means, and whether it really belongs in their arguments. A notable exception is Bertrand Russell in his marvellous 1912 essay ‘On the notion of cause’.

Here are a couple of examples of how ‘cause’ is used in philosophical arguments, and how we can use the considerations above to understand them better.

First Cause arguments for the existence of God

There is a very old and venerable argument that there must be a being (God) that is the cause of the universe’s existence. There are a number of versions, including a popular one that has been revived recently, based on a medieval Islamic argument from the Kalam school. All versions of the argument rely on God being a Cause for the universe. An obstacle to all these arguments is that there can be no ‘before’ the universe, as time is itself a feature of the universe, not something that applies outside it. So there cannot be a cause that temporally precedes the universe. Devotees of the First Cause argument sometimes respond that God is logically prior, rather than temporally prior to the universe. That is, God→Universe.

There are two problems with this argument.

Firstly it relies on a premise that every object of a certain type must have a cause. It tries to generate support for that premise by appealing to our experience, and all the examples used are of physical causes. Hence the premise is restricted to physical causes and tells us nothing about non-physical causes, which is what the argument wishes to argue God is. This is a smuggling attempt, of the kind discussed above.

Secondly, what the argument actually does is to reason from the existence of the universe to the existence of God. That is, Universe→God.

But now we have a situation that is logically symmetrical between God and the Universe, which a logician would denote as God↔Universe. Each implies the other, so we cannot say that one is logically prior. One might be tempted to say that there was a time, before the creation of the universe, when there was only God and no Universe, which makes God prior, and hence the cause. But that route is forbidden because it relies on the existence of time, which is part of the Universe.

So the philosopher that pursues this route is committed to saying that, if there is a God, then it is caused by the Universe as much as it causes the Universe.

Such a conclusion is likely to satisfy neither theist nor atheist, and demonstrates quite nicely the futility of trying to reason about causes that do not temporally precede their effects.

The Epiphenomenal hypothesis of consciousness

Epiphenomenalism is a hypothesis that says mental events (consciousness) are caused by physical events in the brain, but have no effects upon any physical events. In other words, brain activity causes consciousness, but consciousness does not cause any brain activity.

For this to be the case, given our definition of cause, a mental event must occur after the physical (presumably brain) event to which it relates. Hence the brain event can be a cause of the mental event, but not vice versa.

Importantly, if the mental event occurs simultaneously with the related brain event then we cannot say that either causes the other, because neither precedes the other. This is a crucial observation because sometimes people talk about Epiphenomenalism as if the mental event were a simultaneous occurrence caused by the contemporaneous brain activity. However, as we have seen above, for simultaneous events there is no way to identify which is cause and which is effect. So a mind-body model that involves simultaneous processes is not Epiphenomenalism.

Causation in Science

Does all science rest on the assumption that everything has a cause? It might seem so, and this claim is often made, but it’s wrong. Science doesn’t need everything to have a cause, to be useful. Science rests on the observation that there are patterns in nature, such that systems appear to evolve in regular, repeatable ways that can be described by natural laws. If we can discover such a law, by inventing theories based on experimentation, and then testing the theory’s predictions using further experiments, then we may be able to predict future events, and shape the course of those events.

So science is best described not as a search for causes, but as a search for laws that describe how physical systems evolve.

We don’t even need to believe that everything is governed by natural laws. For instance, some interpretations of Quantum Mechanics hold that there is no law determining the precise time at which a radioactive particle will decay. The apparent absence of a cause for that particular aspect of reality does not however prevent us from making very precise predictions about the behaviour of physical systems using Quantum Mechanics.
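A small simulation illustrates how precise predictions can coexist with individually unpredictable events. It is a sketch under stated assumptions: the half-life and the number of atoms are arbitrary, the isotope is hypothetical, and the computer’s pseudo-random number generator is only a stand-in for whatever, if anything, settles the real decay times.

```python
import math
import random

HALF_LIFE = 10.0   # arbitrary time units, hypothetical isotope
N_ATOMS = 100_000  # size of the simulated sample

rng = random.Random(2013)  # fixed seed so that the run is repeatable
decay_rate = math.log(2) / HALF_LIFE

# Each atom's decay time is drawn at random from an exponential distribution.
# Nothing in the model determines when any particular atom decays.
decay_times = [rng.expovariate(decay_rate) for _ in range(N_ATOMS)]

# Fraction of atoms that have not yet decayed after one half-life.
surviving = sum(1 for t in decay_times if t > HALF_LIFE) / N_ATOMS
print(f"fraction surviving one half-life: {surviving:.3f}")
```

The printed fraction should come out very close to one half, even though no individual decay time in the sample could have been predicted.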

In science we don’t need to have causes for everything, or even to believe they exist. At most we need causes for the important features of the system we are evaluating.

Causation in Physics

Light cones

An important concept in physics is that of the light cone. For a given point P in spacetime, the past light cone is the set of all spacetime points from which a particle could have travelled prior to passing through P. There is also a future light cone, which is the set of all spacetime points that can be reached by a particle that first passes through P. The particles in question may be photons, which travel at the speed of light, or slower particles with mass, like electrons or cricket balls.

Physicists talk about two spacetime points as being ‘causally connected’ if one is in the other’s past light cone. This means that the later point can be affected by something that happens at the earlier point. Events at points that are not causally connected cannot affect one another. That is, changing what happens at one point will have no impact on the other. Such points are called space-like separated points.

For point P, the future light cone marks out the limits of the points P can causally influence, and the past light cone marks out the limits of what points can causally influence P. Hence the light cones are regarded as showing the limits of causality.
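Whether two distinct events are causally connected can be decided from their separation in time and space. The sketch below makes simplifying assumptions that are not in the text above: a single dimension of space, flat spacetime, and units in which the speed of light is 1 by default. The function names are hypothetical.

```python
def classify_separation(dt: float, dx: float, c: float = 1.0) -> str:
    """Classify two distinct events separated by time dt and distance dx."""
    interval = (c * dt) ** 2 - dx ** 2
    if interval > 0:
        return "time-like"   # one event lies inside the other's light cone
    if interval == 0:
        return "light-like"  # one event lies on the other's light cone
    return "space-like"      # neither event can affect the other


def in_past_light_cone(dt: float, dx: float, c: float = 1.0) -> bool:
    """True if an event offset by (dt, dx) from P lies in or on P's past light cone."""
    return dt < 0 and abs(dx) <= c * abs(dt)


print(classify_separation(3.0, 1.0))  # time-like
print(classify_separation(1.0, 3.0))  # space-like
print(in_past_light_cone(-3.0, 1.0))  # True
print(in_past_light_cone(-1.0, 3.0))  # False
```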

This usage harmonises with both our Comparative and Singular definitions. In the Singular definition, the cause (according to Science 2013) of an event E at spacetime point P, with time coordinate t, is the state of the set R of all points in P’s past light cone that have time coordinate t-h for some positive h. In the Comparative definition, if S1 and S2 are alternative possible states of R, such that E happens at P if R has state S1 but not if R has state S2, then the difference C between S1 and S2 is the cause of E in S1 with respect to S2, according to Science 2013.

It might seem that the light cone perspective adds an additional constraint to causality above the constraint in our definitions that causes must precede effects. For not only must the cause precede the effect, but it must also lie in the effect’s past light cone.

It turns out that, because of the theory of relativity, this is not an additional constraint at all. We can only say unambiguously that C precedes E if C is in E’s past light cone, because then the time of C will be earlier than that of E in every possible reference frame. If C is in E’s future light cone we can say unambiguously that E precedes C, so C cannot be a cause of E. That much is obvious. But if C is in neither the future nor the past light cone of E, it will be later than E in some reference frames and earlier than E in others. Einstein’s theory of relativity tells us that no reference frame is any more valid than any other, so C cannot be a cause of E if there is even just one reference frame in which it occurs after E (in fact if there is one such frame then there will be infinitely many).
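The frame-dependence of time order can be seen with the Lorentz transformation of special relativity, t′ = γ(t − vx/c²). The sketch below assumes one dimension of space and units in which c = 1, places E at the origin of every frame, and uses a hypothetical function name and hypothetical coordinates for C.

```python
import math


def boosted_time(t: float, x: float, v: float) -> float:
    """Time coordinate of the event (t, x) in a frame moving at velocity v, with c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (t - v * x)


velocities = [-0.9, -0.5, 0.0, 0.5, 0.9]

# E sits at the origin (t = 0, x = 0), which every one of these frames shares.
# First C is placed inside E's past light cone, then outside both light cones.
in_cone = [boosted_time(-3.0, 1.0, v) < 0 for v in velocities]
outside = [boosted_time(-1.0, 3.0, v) < 0 for v in velocities]

print(in_cone)  # [True, True, True, True, True]
print(outside)  # [False, False, True, True, True]
```

Each entry says whether C is earlier than E in that frame. Inside the past light cone the answer is the same for every observer. Outside it, the answer depends on who is asking.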

This last consideration tells us that, if we ever discovered particles or other influences that could travel faster than light, it would destroy our notion of causality entirely. Because then we would have pairs of events that we thought were cause and effect, for instance the beginning and end of a path followed by one of these particles, but for which in some perfectly valid reference frames the effect preceded the cause. We would have to either jettison the notion of causality entirely, or develop a completely new one, that may only have very slight similarities to the existing one.

It is fortunate for us then that the superluminal neutrino speeds observed in experiments in 2011-12 turned out to be experimental errors.

Quantum indeterminacy

In both our definitions of Cause we say theory T ‘requires that’ the effect occurs after the cause. However quantum mechanics tells us that nothing is certain to happen. Things we think of as inevitable are really only very, very likely. How then can we meaningfully talk of an effect being required to occur after its cause?

One solution is to replace statements of certainty by probability statements. We could replace ‘theory T requires that’ by ‘under theory T there is a greater than 99.9% probability that’. Here T is of course Quantum Mechanics. If we make this substitution in the Comparative Definition (twice, for the two instances of ‘theory T requires’) and the Singular Definition (once) then these definitions are ship-shape and ready to be used in the Quantum Mechanical world.

We might wish to go further and call C a cause if the probability of E occurring after S1 is lower than 99.9%, say 50%, and the probability of E not occurring after S2 is still 99.9%. In that case C is a sort of enabling condition for E to occur, but it does not guarantee it. If we wanted to go down that route it would be better to give this type of relationship a slightly different name like ‘probabilistic cause’, to avoid confusion with the cases where C makes E almost certain to occur.
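The two probabilistic variants can be put side by side in a short sketch. The function name is hypothetical, and one detail is an assumption that goes beyond the text: for the weaker ‘probabilistic cause’ the code requires only that E be more likely after S1 than after S2, since no lower bound was fixed above.

```python
def classify_cause(p_e_after_s1: float, p_e_after_s2: float,
                   threshold: float = 0.999) -> str:
    """Classify the difference C between S1 and S2 from the probabilities
    that a theory T assigns to event E after each state."""
    # E must be almost certain NOT to occur after the comparison state S2.
    if 1.0 - p_e_after_s2 <= threshold:
        return "not a cause"
    # Quantum-ready Comparative Definition: E is almost certain after S1.
    if p_e_after_s1 > threshold:
        return "cause"
    # Assumption: C merely raises the probability of E above its level after S2.
    if p_e_after_s1 > p_e_after_s2:
        return "probabilistic cause"
    return "not a cause"


print(classify_cause(0.9999, 0.0001))  # cause
print(classify_cause(0.5, 0.0001))     # probabilistic cause
print(classify_cause(0.9999, 0.3))     # not a cause
```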

Correlation does not imply causation

A famous dictum that is often used in both science and social studies is ‘correlation does not imply causation’. Let’s put our Comparative Definition to the test to see if it supports this uncontroversial dictum. But because medical and social sciences are quite complex, we’ll use an example involving something simple instead – bowling alleys.

Imagine that a bowling alley has an easily depressed light switch placed in the middle of the alley, 20cm away from the central lead skittle. When depressed, the switch closes an electric circuit that illuminates a light above the skittles. After watching a few matches we notice that the light goes on for a fraction of a second and then off, immediately prior to every strike (knocking down all ten skittles). We have observed a correlation between illumination and strikes, and we wonder whether the light causes a strike.

First we compare two situations, describing the region R around the bowling alley, at the time a ball that has been bowled passes the switch. The situations S1 and S2 are identical except that in S1 the ball is on the switch and illuminating the light, while in S2 the ball is to the left of the switch, too far left for a strike to occur, and the light is not illuminated. The region R is large enough that nothing that is outside R when the ball passes the switch can change whether a strike occurs.

In S1, Science 2013 requires that a strike will shortly occur and in S2 it requires that a strike will not occur. So our definition of cause is satisfied. We can say that the difference between S1 and S2 caused the strike after S1. But what is the cause we have identified? It is everything in S1 that is different from S2.

That includes the light being on but it also includes the ball being in the middle of the lane. We could if we wish say that B caused the strike where B is ‘the light being on and the ball being in the middle of the lane’. The latter is consistent with what a lay person would think of as being the cause, so that’s a good start. It is reasonable to describe B as the cause. The bit about the light seems superfluous though. Can we get rid of it?

Yes we can, as follows. We add a new situation, S3, which is the same as S2 except that someone stands on the lane, avoiding the ball, and briefly depresses the light switch as the ball passes, if the ball does not itself roll over the switch. Now let’s compare S2 and S3. In both cases there is no strike. They are identical except for the person standing on the lane and the light being on. So it appears that the light being on is not a cause of a strike. The light illumination is correlated with, but not causative of, strikes.

This confirms that the Comparative Definition can, at least in this case, reproduce results that accord with our intuitions about causation.
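The bowling alley comparison can be replayed as a toy model. Everything in it is an assumption made for illustration: the class and function names are hypothetical, and the one-line rule in `strike_follows` is a crude stand-in for what Science 2013 would say about the lane.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LaneState:
    ball_in_middle: bool  # the ball rolls over the switch, in line with the lead skittle
    person_presses: bool  # someone standing on the lane presses the switch by hand

    @property
    def light_on(self) -> bool:
        # The light is lit whenever anything depresses the switch.
        return self.ball_in_middle or self.person_presses


def strike_follows(state: LaneState) -> bool:
    """Toy stand-in for Science 2013: a strike follows only a central ball."""
    return state.ball_in_middle


def differences(a: LaneState, b: LaneState) -> set:
    """Names of the facts on which the two situations differ."""
    names = ("ball_in_middle", "person_presses", "light_on")
    return {n for n in names if getattr(a, n) != getattr(b, n)}


s1 = LaneState(ball_in_middle=True, person_presses=False)
s2 = LaneState(ball_in_middle=False, person_presses=False)
s3 = LaneState(ball_in_middle=False, person_presses=True)

print(sorted(differences(s1, s2)), strike_follows(s1), strike_follows(s2))
# ['ball_in_middle', 'light_on'] True False
print(sorted(differences(s2, s3)), strike_follows(s2), strike_follows(s3))
# ['light_on', 'person_presses'] False False
```

The first comparison picks out the bundle B (central ball plus light). The second shows that switching the light on without the central ball makes no difference to the outcome.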


We have developed a definition of cause – the Comparative Definition – that captures the everyday meaning of the term while removing ambiguity. The price of the additional accuracy was having to specify a comparison scenario S2 and a reference theory T.

For cases where a comparison scenario is not readily imaginable, we have an alternative definition – the Singular Definition – that still captures the commonly understood meaning. The price of this additional power is having to specify the prior scenario – the ‘cause’ – either over an enormous volume of space or a tiny period of time prior to the effect.

We have seen that an essential feature of any useful, unambiguous notion of cause is that it requires causes to precede effects in time. We observe that invocation of simultaneous causes or logical causes is usually a symptom of a flawed argument.

We have identified a way to generalise the notion of cause to handle the uncertainty that comes from Quantum Mechanics, by including probabilities in the description of a cause.

We have observed how these definitions of cause can be used in practice in a variety of fields of inquiry.

Finally, if we can take any ‘moral’ from this rather prolonged meditation, it is that in any argument that relies on notions of cause we should examine closely how the term ‘cause’ is used and what properties are ascribed to it in the argument. If this is not clearly set out, the argument may well have hidden flaws or, in some cases, be incoherent, no matter how plausible it may sound.

Andrew Kirk. Bondi Junction, 8 June 2013