I try hard to be open-minded. I think I succeed at that reasonably well, but I still regularly get surprised at the discovery of a prejudice I didn’t know I had.
I don’t know whether it’s possible to rid oneself of all prejudice – I suspect it’s not. If so, the best I can aim for is to be on the alert for prejudices, try to rid myself of them when I discover them, and try to always remember that any opinion I have – regardless of how carefully thought out it may seem – may be inextricably tied up with some prejudice I don’t yet realise I have.
The Wikipedia article on Cognitive Biases has a very long list of them. With so many opportunities to go wrong, it’s hard to imagine one can escape all of them.
There’s a popular phrase: 'It’s good to have an open mind, but not so open that your brain falls out'. I don’t like that phrase at all. It is most commonly used by bigots in an attempt to defend their bigotry while appearing rational. Nobody’s brain has ever fallen out from being open-minded, either literally or metaphorically. However, the rankest nonsense phrases often have a grain of truth in them, and there is a grain of truth even in that one. It is that, in order to achieve anything with our thoughts, we need a framework within which they can operate, and that framework will be made of rules and suppositions that are accepted without evidence. I agree that we need such a framework, but what is crucial is that we acknowledge the existence of the framework, acknowledge that it has no supporting evidence, and acknowledge that we hence have no basis on which to claim it is better than any other framework. That doesn’t mean we should refuse to act on conclusions drawn within our framework. But it does mean that (in my opinion, which was derived within my mental framework!) it is a good idea to regularly examine and challenge our framework, and consider alternatives. Sometimes that may lead to a radical change in worldview, which opens up whole new vistas.
It may lead to a Christian becoming a Buddhist, or vice versa. It may lead to a Socialist becoming a Libertarian, or vice versa. It may even (heaven forfend!) lead to personnel exchanges between Platonism and Existentialism. I may have my own preferences about which of those (and other) sets of ideas people align themselves with but, regardless of the outcome of any migrations of beliefs, I see it as good that people regularly examine their beliefs, so that belief migration becomes a commonplace possibility. If we know what our prejudices are, we have the power to change them. But we cannot change a prejudice we don’t even know we have.
Here are two of my prejudices. The first is that it is preferable for there to be less suffering in the world. I know it’s a prejudice. I know I can’t prove it. But I’m going to hang onto it, for now at least.
The second prejudice is that if I have observed two phenomena to occur in close conjunction many, many times then, in the absence of strong reasons to the contrary, I should expect them to continue to occur in conjunction in future. Every self-supporting person on Earth has this prejudice. But nobody even realised it was a prejudice until David Hume pointed it out in the eighteenth century – his famous 'Problem of Induction'. If you don’t believe me, think of how you use language. You speak English to somebody – say it’s Bertha – expecting her to understand it, because she has understood English when you spoke it to her in the past. But why should the fact that Bertha has always understood spoken English in the past indicate anything at all about whether she will understand it in the future? You might object that you know that Bertha learnt English as a child, so you know she knows English. But then you are relying on the association between the events 'X has learned English' and 'X understands English', which has been reliably observed in the past – and why should that tell us anything about whether it will be observed in the future? Whatever objection is raised, I (or rather David Hume) can find an answer to it. But I’m still going to hang on to this prejudice.
Prejudice in Music
I had been thinking this over in the context of musical styles. It’s hard to think of another human activity whose history is so riddled with the word 'shocking'. The most casual observer probably knows how Rap was considered shocking when it emerged in the eighties, ditto Punk in the seventies, how Rock n Roll was considered shocking when it emerged in the fifties, and how Jazz was considered shocking in the early twentieth century.
But the history of people being shocked by music goes back much farther than that. The history of classical music in particular is regularly punctuated by shocks when some innovator broke hallowed rules. Working back in time we have Schoenberg, Stravinsky, Debussy, Wagner, Beethoven, Haydn and Monteverdi as major disruptors of established musical conventions.
The following story from a radio music presenter made a big impression on me. They told of how they had been working in the archives of a classical music operation, listening to, classifying and cataloguing recordings. After doing this for a few weeks, they walked past a studio where music was playing over the loudspeaker. Appalled at the terrible, disorganised racket they were hearing, they asked somebody what the noise was. It was JS Bach! [For the non-classical music buff, JS Bach was a genius who lived from 1685 to 1750, in the 'Baroque' period, and is as revered a part of the musical establishment as it is possible to be.] The reason it sounded so terrible and formless was that the music the presenter had been listening to non-stop for the previous few weeks was all pre-Baroque, and hence operated within a framework of rules and norms that Bach’s music 'broke'. If they had heard it a few weeks earlier they would likely have thought 'how lovely!' or maybe even 'that’s a bit old-fashioned!'
I want to pick on Schoenberg, because on the face of it he might seem to go as far as one can go in breaking rules. The Austrian composer Arnold Schoenberg rebelled against the tyranny of tunes having to be in a musical key, like G major or A minor. Although only classical pieces tend to state their key, with names like 'String Quartet in E flat Major', nearly all pieces have one. Perhaps the most famous song of all, Lennon and McCartney’s 'Yesterday', could have been called 'Sad song in F major'. Key changes do occur within a piece, but they have a big effect, because we become attached to the key in which the tune is set. That’s why key changes are often used near the end of a song, to build up excitement and energy towards a final climax.
Schoenberg’s project was to refuse to use any key at all, not even for one phrase at a time. To do that he invented his 'Twelve-tone system', in which every one of the twelve notes of the chromatic scale must be used exactly once before any is repeated – the ordering in which they appear being called a 'tone row'. By giving every one of the twelve possible notes equal status, he prevented any note gaining prominence as the 'Tonic', the home note of a key. Unlike another famous Austrian, Schoenberg was very anti-racist: he wanted the black piano keys to get as much opportunity as the white piano keys in his pieces (note that’s a different use of the word 'key').
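In code, the core rule is tiny. Here is a minimal sketch (my illustration of the rule, not a reconstruction of Schoenberg’s own working method) that generates a random tone row – an ordering of the twelve chromatic notes in which each appears exactly once:

```python
import random

# The twelve notes of the chromatic scale.
CHROMATIC = ["C", "C#", "D", "D#", "E", "F",
             "F#", "G", "G#", "A", "A#", "B"]

# A tone row uses every note exactly once, so no single note can
# gain the prominence that a key's tonic enjoys.
tone_row = random.sample(CHROMATIC, k=12)
print(tone_row)
```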
Here, from YouTube, is a Schoenberg piano piece using his twelve-tone system, for you to enjoy.
As I was musing over whether Schoenberg had achieved the ultimate in open-mindedness, I suddenly realised a hidden prejudice. Sure, he had proclaimed equality between all twelve notes. But a note is defined by a frequency – vibrations per second – so the number of possible notes is infinite, not just twelve per octave. Between any two different notes there is an infinite number of intermediate frequencies. In Western music, which is descended from Ancient Greek music, the smallest interval between two notes is a semitone, which means the ratio of the two frequencies is 2^(1/12), or about 1.06.
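In symbols, this is the standard equal-temperament relationship: the note n semitones above a reference frequency f₀ has frequency

$$f_n = f_0 \cdot 2^{n/12}, \qquad \text{so}\quad \frac{f_{n+1}}{f_n} = 2^{1/12} \approx 1.0595 .$$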
In classical Indian music, the twelve Western notes are used, plus ten others, called 'nadas', giving twenty-two altogether, so that the average gap between adjacent permissible notes is just over a quarter-tone. A piece containing nadas sounds, to a Western ear untrained in those extra notes, as if it is being performed on an out-of-tune instrument.
Here is a scale that goes up an octave in twenty-four quarter-tone steps (not exactly the same as an Indian scale, but closer to that than to a Western scale), then walks back down again. What does it sound like to you?
For comparison, here is the same thing using just the twelve notes in the Western scale.
Try to sing or hum along, first to the chromatic scale and then to the quarter-tone scale. I’m a reasonably accurate singer and can do the first, but can’t even get started on the second.
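For anyone who wants to generate those scales themselves, here is a minimal sketch (the 220 Hz starting note is an arbitrary choice): each quarter-tone step multiplies the frequency by 2^(1/24), and each semitone step by 2^(1/12).

```python
START = 220.0  # Hz: an arbitrary starting note (the A below middle C)

# One octave up: 24 quarter-tone steps, or 12 semitone steps.
quarter_tone_scale = [START * 2 ** (n / 24) for n in range(25)]
chromatic_scale = [START * 2 ** (n / 12) for n in range(13)]

# Walk up and back down again, as in the examples above.
up_and_down = quarter_tone_scale + quarter_tone_scale[-2::-1]
print([round(f, 1) for f in up_and_down])
```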
But even writing music that includes nadas or quarter tones still involves a prejudice against the in-between notes. It’s just a smaller prejudice than Westerners like me have. I expect a piece involving eighth-tone intervals would sound just as weird to an Indian as one using quarter-tones does to us.
If we want to write music that is free from all prejudice, we need to go beyond Schoenberg, beyond Indian music, beyond even eighth tones, and write music in which each note can be any frequency at all, without limiting the choice to notes that are certain multiples and ratios of others.
I wrote a piece of such music. To be precise, I programmed a computer to randomly generate a series of frequencies and note-lengths and produce notes using those. I then produced another version of it, in which each note was rounded to the nearest semitone, so that only the twelve notes in the Western scale were used.
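For the curious, here is a sketch of the kind of program involved. It is illustrative rather than a record of exactly what I ran – the note ranges and durations are arbitrary choices. It draws frequencies and durations at random, renders each note as a constant-volume sine wave, and writes two WAV files: one 'free', and one with every frequency snapped to the nearest equal-tempered semitone.

```python
import random
import wave
import numpy as np

SAMPLE_RATE = 44100
A4 = 440.0  # reference pitch, Hz

def random_notes(n, f_lo=110.0, f_hi=1760.0, d_lo=0.2, d_hi=1.0):
    """Draw n (frequency, duration) pairs uniformly at random."""
    return [(random.uniform(f_lo, f_hi), random.uniform(d_lo, d_hi))
            for _ in range(n)]

def round_to_semitone(f):
    """Snap a frequency to the nearest 12-tone equal-temperament note."""
    n = round(12 * np.log2(f / A4))
    return A4 * 2 ** (n / 12)

def render(notes, path):
    """Render the notes as pure sine waves into a 16-bit mono WAV file."""
    samples = np.concatenate([
        np.sin(2 * np.pi * f * np.arange(int(SAMPLE_RATE * d)) / SAMPLE_RATE)
        for f, d in notes])
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(SAMPLE_RATE)
        w.writeframes((samples * 0.5 * 32767).astype(np.int16).tobytes())

notes = random_notes(40)
render(notes, "free.wav")                        # any frequency at all
render([(round_to_semitone(f), d) for f, d in notes], "semitone.wav")
```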
Can you tell which is which? They are both weird. Both break most of the rules we are used to. But one is a bit weirder, a bit more free, than the other.
Click here to find out which one is which.
The above is a long way from JS Bach, but is it free from any form of musical prejudice (aka structure)? No. For a start I have constrained the notes to be within the audible frequency range, even though it is entirely conceivable that notes that we cannot consciously hear may still have an effect on our body and thereby alter the sensory experience. I have also constrained the notes to not be very short or very long, in order not to frighten or bore the listener. The volume is also constant, rather than varying between notes, or even within notes. The shape of each sound wave is a perfect sine curve, whereas the wave shape could be allowed to change between and within notes too. That would not change the ‘tune’ but it would change the texture (‘timbre’ in musician-speak). I expect there are other prejudices in there that I have not yet realised.
I like most of my prejudices. I prefer Bach to Schoenberg most days of the week. But it’s good to challenge oneself every now and again with a bit of Schoenberg (or its equivalent), and then to occasionally challenge the Schoenberg with something even more radical.
Bondi Junction, July 2017
I was jogging on the beach, trying to think of something else because the last couple of days had been rather upsetting. I settled on thinking about an essay I am trying to write about The End of The World. Very soon I found that I had the REM song It’s the end of the world as we know it running through my head on repeat.
After a while I noticed somebody running along next to the concrete promenade, where the sand is softest because it is furthest from the water and almost never gets wet from the sea. The sand was pretty soft where I was, about halfway between the promenade and the water. But maybe it was softer over near that other guy. In any case, we’d had heaps of rain recently, so if water makes sand pack together harder, presumably where I was would be just as water-hardened as next to the promenade.
But then maybe seawater has a different effect. Perhaps it makes the sand stick together better than rainwater does. If so then the sand next to the promenade really would be softer, unless the sea ever gets up to there.
That led to me wondering about whether, in the wildest sorts of weather, the sea ever came all the way up to the concrete wall below the promenade (about fifty metres from the high tide mark).
Thinking of stormy weather made me think of the scene in the movie The French Lieutenant’s Woman where the female lead stands at the end of a long jetty in a storm, only a metre or two above the rough sea – a precarious position, deeply evocative.
That led me to wonder whether it is sexist to refer to the character as somebody’s ‘woman’, thereby seeming to suggest ownership. That led to my thinking about the reverse phrase ‘somebody’s man’, which led me to think of the Tammy Wynette song Stand by your man.
And without any conscious decision to do so, there I was, jogging along the beach, mentally humming Stand by your man instead of It’s the End of the World as we know it.
Bondi Junction, April 2017
Featured Image is from the 1981 movie The French Lieutenant’s Woman, showing the jetty called ‘The Cobb’ at Lyme Regis UK.
One day the sun will grow so large that it will first desiccate, then bake, then engulf and vaporise the Earth and everything on it. No life will survive that. Perhaps some people will have escaped to habitable places in other solar systems, but it’s hard to imagine it would be many, given the enormous energy that is likely to be involved in any interstellar travel. I expect ordinary people will be unable to escape.
Even escapees will be wiped out eventually, as the universe, many billions of years from now, slides inexorably into heat death. No life will survive that.
So there it is: the end of the world is a matter of when, not if. We are powerless to prevent it.
That background makes it hard to work out what moral obligation we have to take actions that would prevent a near-term end of the world, and to avoid actions that would hasten it.
If we are talking about preventing the end of the world within our own lifetimes, the question is easier to resolve, because that end would affect people who are alive now, and most people recognise that they have at least some obligation of care to the other people who cohabit the world with them.
But that obligation is less widely accepted when it comes to future generations, and the farther away those generations are, the fewer people tend to feel an obligation towards them. Politicians sometimes talk about intergenerational equity and caring for the future of our children, maybe even our grandchildren. But it’s a rare politician that argues for a policy on the basis of its effect on our great-great-great-great-great-great-great-great-children.
At some stage, life on Earth will come to an end, and it seems likely that that end, unless it occurs in the blink of an eye – which it is hard to imagine happening – will be accompanied by tremendous suffering. If that is inevitable, then how can we work out whether it matters if it occurs sooner or later?
We cannot solve this by reason alone. As David Hume so acutely observed, “’Tis not contrary to reason to prefer the destruction of the whole world to the scratching of my finger.” Of course he wasn’t saying he did prefer the destruction of the world. He was saying that one must look to one’s emotions to find an answer.
Looking to my own emotions, I confess that I am more alarmed at the prospect of the world ending in a catastrophe in 100 years than in 100 million years, despite the fact that I will not be here to see either.
I use the word 'catastrophe' rather than 'cataclysm' because I think the end would be lingering and painful. We should be so lucky as to be extinguished in the blink of an eye. Philosophers who have nothing better to do with their time make up thought experiments involving a button you could press to instantaneously end the world, and ask under what circumstances you would press it. But there are no such buttons, nor are there ever likely to be, so we need to contend with the end being a drawn-out, painful process. I suspect widespread famine would be a major part of it. That would lead to outbreaks of uncontrolled violence as people compete for the dwindling resources of food and water. Disease would spread to accompany the famine – perhaps providing a more merciful end for some. We already see this sort of catastrophe in some parts of the Earth, and we will see it more often as climate change becomes more severe.
Is a 'soft landing' possible? What if, realising that the world will become uninhabitable within 200 years, we were to decide that we were morally obliged not to have children, in order not to inflict on those new people the pain of experiencing the world’s slow death? What would a world with no new children be like? Most of us, including me, feel that it would be very sad. I know of two novels that explore this: 'The Children of Men' by PD James, and 'I Who Have Never Known Men' by Jacqueline Harpman. In the first, for some unknown reason, humans cease to be able to conceive. The novel is set about twenty-six years after the last baby was born. In the second novel, a group of female prisoners escape from their underground dungeon to find the Earth deserted. They wander for many years in vain search of other survivors and after a while start to die of old age, with no replacement.
Both novels are confronting, bleak and sad. The James novel also has a thriller element to it (which I won’t spoil for you), but the basic premise is still bleak.
It would be very hard for us now to decide 'No more babies'. Imagine us all gradually dying one by one, deprived of that feeling of continuity – the circle of life – that one gets from seeing younger generations. But what if society had time to work up to that over several generations? What if, realising that all life would cease within ten generations, society worked to change its culture in order to equip people to feel more positive about non-procreation and less reliant on younger generations? It would be a very difficult psychological shift to accomplish. It would have to counteract the powerful impulse embedded in our psyche by evolution – to perpetuate the species. But who knows what techniques of psychological manipulation humans may have managed to invent in a thousand or more years’ time? Maybe they could condition future humans to find fulfilment in bringing their species in for a soft landing – for instance in working as a childless carer for old people until one becomes too old to work. Things could be set up so that the last remaining people have all the food, water, clothes, medicine, shelter, power and entertainment they need to survive solo (we would also need to train people to be comfortable with isolation, which we current humans are definitely not). They might also be provided with pills to give them a painless end to life once they near the point where they can no longer feed themselves. That is not how it happens in 'The Children of Men'. But that book is set in 2021, not 3021, and society had no notice to prepare for the landing (for some reason fertility just suddenly ceases in 1994).
If a soft landing were possible then, while an end of the world may be inevitable, its accompaniment by great suffering would not be. It would then become easy to argue for doing what we can to delay the end of the world: the argument is simply that doing so prevents a great suffering.
What if it’s not possible, so that the great suffering is simply a matter of 'when' rather than 'if'? What if the amount of suffering accompanying the end of the world would be roughly the same regardless of whether it occurs in 200 years or 200 million years? Are we morally obliged to do what we can to defer it beyond 200 years? I pick 200 years, by the way, because that should be long enough for us to be fairly certain that nobody currently alive will be around to experience the end.
It seems to me that the main difference between the two end dates is all the currently-unconceived humans that would experience life in the intervening 199,999,800 years. Is it a good thing or a bad thing that such lives should come to pass? There is very little moral guidance on this. Even religions have little to say about it, with only a very few (albeit big, powerful ones) forbidding contraception.
A group with a decisive opinion that is the direct opposite of the anti-contraceptionists’ is the anti-natalists, led by the prominent South African philosopher David Benatar. Benatar argues that, since all life contains some suffering, it is immoral to create any new life. He does not accept that suffering may be offset by pleasure at other times in a life: even a few moments of mild pain in an otherwise long, happy life makes the creation of that life a moral mistake, in Benatar’s book. Less extreme anti-natalists argue that procreating is OK if we think the new life will have more pleasure than suffering but that, since we can’t be sure, we are obliged not to procreate. A more folksy version of this is the comment uttered at many a late-night D&M discussion, that 'this is no world to bring an innocent child into'.
Not many people are anti-natalists. Most people, despite the exaggerated doom and gloom on the news – terrorist this and serial-killer that, with no mention of the real dangers like climate change, malaria, poverty, road carnage and plutocratic hijack of our democracies – see life as a generally pleasant experience and look positively on conferring it on new humans. But that tends to be a very personal feeling, in which the moral dimension cannot be disentangled from the powerful personal urge to procreate.
For those of us who are neither anti-natalists nor anti-contraceptionists, the question of those lives in the intervening 199,999,800 years remains a mystery to be explored. Is it important that they come to pass? Is it good that they do so?
Lest you decide I sound like a homicidal maniac and ring Homeland Security to have me ‘dealt with’, let me state here that I feel that it is better to do what we can to delay the end of the world. That’s a major factor in why I think action on climate change is the most important issue facing humanity today. But I won’t go into the reasons why in this essay, because this topic will be discussed at my upcoming philosophy club meeting and I want to avoid spoilers. In any case, I’m more interested in what other people think about this.
The dilemma posed by this essay was first raised by Oliver Kirk.
Bondi Junction, April 2017
- What, if any, obligations do we have to unborn generations? Do they include an obligation to ensure their existence?
- Does the nature or strength of the obligation change with the remoteness of the future generation?
- If we accept that the end of humanity will occur, and will be accompanied by great suffering, are we obliged to do what we can to delay it for as many centuries or millennia as possible (taking as agreed that we are obliged to delay it beyond the lifespan of anybody currently alive)?
- If we do feel obliged to delay, does that imply an obligation to maximise the population of the Earth, subject to being able to maintain adequate living standards?
- How do I feel about the fact that a time will come when there is no more life? Does it strip life of meaning? Or does it enhance meaning? Or neither?
- How would I feel about a world in which human reproduction became impossible?
- Do I feel differently about the world ending in 200 years from how I feel about it ending in 200 million years?
- What implications do our opinions on the above have for the stance we should take on current future-oriented issues like climate change, balancing government budgets, infrastructure building, asteroid mapping and solar flare prediction?
PD James: ‘The Children of Men’
Jacqueline Harpman: 'I Who Have Never Known Men' ('Moi qui n’ai pas connu les hommes')
Peter Singer: 'Practical Ethics'. Discussion of obligations to future generations on pp. 108–118 of the Third Edition (Cambridge University Press, 2011).
Here are my answers to the puzzle about what happens when we point a camera at its monitor. The problem has a flavour of infinite regress about it, and sounds a little like a Buddhist koan – a question designed to make us realise (amongst other things) the limitations of logic. But whereas a ‘correct’ answer to a koan is usually something bizarre like barking like a dog, or hitting the questioner with a stick, the four questions I posed do actually have perfectly logical answers. And I find them quite interesting.
Let’s kick things off by showing a picture of what one might see when one looks into the webcam that is perched on top of the monitor, looking outwards. This is a screenshot of what my monitor showed when I did that (Figure 1).
The image is framed within the window labelled ‘Cheese’ – which is the name of the webcam program I was using. That’s me wearing the red cravat.
When we turn the camera around and point it at the monitor, we will see an infinite regress of windows within windows, as the whole picture is reduced and fitted into the image area where I appear above. Then that reduce-and-insert step is repeated as many times as it takes, until the reduced image gets down to a single pixel and can contain no more nested copies. Here’s an image I made of what it should look like (Figure 2):
In every window, the green desktop background, the desktop icons and the file explorer window to the left are reproduced, and the series shrinks off into the distance. I can tell you it was pretty fiddly putting the tiniest innermost parts into that picture. Infinitely small objects are notoriously difficult to manipulate.
The picture looks a little like a classical picture that uses perspective to show a long, straight road disappearing into the distance, with the point of disappearance at which all the lines converge being in the top-right quadrant of the screen.
But there’s an important and fascinating difference: the dimension along which the images recede here is time, not distance. That’s because each nested window is an image captured by the camera’s sensor a short interval earlier than the window that contains it. That interval will vary slightly as we move through the nested sequence, based on the relationship between the rate at which the monitor screen is redrawn (the 'refresh rate') and the number of snapshot images captured per second by the camera sensor (the 'frame rate'). But it will always be more than some minimum value that depends on how long the information takes to travel along the wire from the sensor, through any processor chips in the camera, along another wire to the computer, through any processing algorithms in the computer, and then through the video cable to the monitor. Even without knowing anything about the computer, we know that that time – called the 'lag' – will be greater than the lengths of the wires involved divided by the speed of light (because electrical signals cannot travel faster than light).
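To get a feel for how small that light-speed floor is (the three-metre cable run is an invented figure, just for illustration):

$$t_{\min} = \frac{\ell}{c} = \frac{3\ \text{m}}{3\times 10^{8}\ \text{m/s}} = 10\ \text{ns},$$

so in practice the lag is dominated by the processing steps, not by the wires.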
So each nested window is earlier than the one outside it, and as we look through the sequence of windows towards the point of convergence at infinity, we are looking back through time!
Now we maximise the Cheese window. First let’s see what it looks like with the camera the correct way around, pointing at me (Figure 3):
You can tell from my expression that I’m quite enjoying this little exercise, can’t you?
Here there is nothing outside the Cheese frame, but the Cheese frame still has a broad, non-image bar at the bottom, a narrower bar at the top, and black vertical bars at either side, which are needed to preserve the image’s ‘aspect ratio’ – the ratio of its width to its height.
With that setting, we turn the camera on the monitor, and this is what we would see (Figure 4):
The lower, upper, left and right bars are reproduced as a series of receding frames, and there is nothing in view other than the receding frames. There is no room left for an actual image of anything other than frames.
Figure 4 gives an even better sense of the ‘time tunnel’ that we mentioned in the previous section. Those white borders really do regress away in a spooky way. It looks like something out of Doctor Who.
The ratio of the height of one window to the height of the window immediately inside it is 1/(1−p), where p is the sum of the heights of the upper and lower margins of the outermost window divided by the total height of the outermost window. The ratio of the widths is the same. In this case it looks like p is around 1/5, so each window will be about 4/5 of the height of the window that contains it.
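Where does that formula come from? The camera’s whole view, of height H, gets redrawn inside the image area, whose height is H less the margins, ie (1−p)H. The same scaling then applies at every level of nesting:

$$\frac{h_{\text{outer}}}{h_{\text{inner}}} = \frac{H}{(1-p)H} = \frac{1}{1-p}, \qquad h_n = H\,(1-p)^n .$$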
I used my webcam and shaky hands to try an empirical verification of this. I maximised the Cheese window, pointed the hand-held webcam at the monitor and centred it as closely as I could. Then I asked my partner to press the Screenshot button on the keyboard to record what the monitor was showing. Below is what we got (Figure 5).
It’s a bit rough, but you can see it does the same sort of thing as Figure 4.
When one is holding the camera like this, the little involuntary movements one makes cause the trail of receding frames to wobble left and right in waves that remind me of the effects used in 1970s television to produce a psychedelic impression – particularly prevalent in rock music film clips.
Here’s a link to a video I captured of this effect.
Question 3 asks what we will see on the monitor after we click the full-screen icon.
When we click the full-screen icon, we have no borders, so there can be no infinite, reducing regress of nested borders. How can we work out what is shown, assuming that we started in non-full-screen mode with a maximised window, so that the monitor was showing Figure 4?
The answer turns out to be remarkably simple. We just note that, when the full-screen icon is clicked, the computer will do some computing and then redraw the screen using the whole monitor area for the image from the camera. When it has finished the computing it will draw the screen using the image received most recently from the sensor and, since that image was captured before the screen redraw, it will be the same as whatever the screen was showing previously. This assumes that the previous image is left in place on the monitor until the computer is ready to draw the new one. We consider later on what happens if that is not the case.
That re-drawn screen will then be captured again by the camera sensor, sent to the computer and then drawn again on the monitor, and so on. So the image will remain exactly as it was before the full-screen icon was clicked! In this case, since the Cheese window was previously maximised, it will continue to show something like Figure 4.
The image remains static until either the camera is pointed away or the computer is switched out of full-screen mode, using the keyboard or mouse. Barring earthquakes, electricity blackouts and such-like, we would expect the monitor to still be displaying Figure 4 if we locked the room it was in, went away and returned to inspect it ten years later.
We can understand this a different way by considering the time tunnel we talked about in the responses to questions 1 and 2. In those cases, as we travel inwards through the tunnel to successively smaller windows, each window’s image was captured a short while earlier than the image of the window around it. The interval between the capture times of those windows will be much less than a second, typically 1/24 of a second. In Figure 4 each window’s height is about 4/5 of the height of the window that contains it. So to see what the monitor was showing t seconds ago we have to go to the nth window in the sequence, counting from the outside, where n = 24t. The height of that window, assuming the height of the monitor’s display area is 300 mm, is 300 × 0.8^(24t) mm. The window size reduces rapidly as we go back through time. A little calculation shows that the 26th window in the sequence is the first to have height less than 1 mm, and that window shows what the monitor was showing just over one second earlier.
Without constraint, that time tunnel would continue to go back, getting smaller at an increasing rate, window by nested window (like Russian dolls, or the cats in the Cat in the Hat’s hat), until we got to the time before the camera program window was opened on the computer, and that image would show whatever was on the monitor before the window was opened. But it would be indescribably tiny. If we had opened the window – in maximised form – five minutes ago, the height of the window that now showed that image from back then would be 300 mm × 0.8^7200, which is approximately 10^-695 mm. This is indescribably smaller than the smallest atom (hydrogen, with diameter about 10^-7 mm), or even just a proton (diameter about 10^-12 mm).
I expect our eyes could probably discern no more than the first twenty windows in the sequence. Further, since my screen has about three pixels per mm, the windows would reach the size of a single pixel by the 31st window in the sequence, and the regress would stop there. Hence the sequence would look back in time no more than about 1.3 seconds.
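Those numbers are easy to check; here is a sketch using the same assumptions (300 mm display height, height ratio 0.8, 24 frames per second, 3 pixels per mm):

```python
import math

H, RATIO, FPS = 300.0, 0.8, 24   # display height (mm), shrink ratio, frame rate

def first_window_shorter_than(height_mm):
    """Index of the first nested window whose height is below height_mm."""
    return math.ceil(math.log(height_mm / H, RATIO))

n = first_window_shorter_than(1.0)     # -> 26: first window under 1 mm
p = first_window_shorter_than(1 / 3)   # -> 31: under one pixel (3 px/mm)
print(n, round(n / FPS, 2))            # 26 windows, ~1.08 s of look-back
print(p, round(p / FPS, 2))            # 31 windows, ~1.29 s
```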
When we move to full screen mode, we still have a time tunnel of nested windows, but each one is exactly the same size as the one before it – the height ratio is 1, rather than 0.8. That means that how far we look back in time is no longer limited by shrinking to the size of a pixel, and the sequence will go all the way back to the last image the monitor showed before drawing its first screen in full-screen mode – which will be Figure 4.
In practice, as my friend Moonbi points out, the slight distortions in the image arising from imperfections in the camera lenses, although they may be imperceptible at first, will compound with each layer of nesting, so that what is actually shown will be a distorted mess. Like a secret whispered from one person to another around a large circle, or a notice copied from copies of itself dozens of times recursively, the distortions – however tiny they may be at first – will grow exponentially until they dominate and destroy the image. One minute after full-screen mode commences, the monitor will be showing a 600-times-recopied image of Figure 4, which will be more than enough to obliterate the image.
But this is a thought experiment, so we allow ourselves the luxury of assuming that our lenses are somehow perfect, that there is zero distortion, and that each copy is indistinguishable from the original.
What if the screen blacks out?
Above we assumed that the display does not change until the computer is ready to redraw the full-screen image. If you go to YouTube, start playing a video and then click the full-screen icon (at bottom right of the image area) you will see that is not what it does. It actually makes the whole screen go black for a considerable portion of a second, and only then redraws the screen. If the camera program we are using does that then the screen will go black and remain black indefinitely.
If the black-out is shorter than YouTube’s, different behaviour may arise. It depends on four things:
- the time from image capture to display, which we call the lag and denote by L,
- the interval between image captures, which we call the frame period and denote by T, and
- the time the blackout period commences and the time it ends, both measured in milliseconds from the last image capture before the blackout. We’ll denote these by t1 and t2.
If no image capture occurs during the blackout, which will happen if t2<T, the blackout will have no effect on the final image and we can ignore it. The eventual image will still be Figure 4.
If images are captured during the blackout, and the first image shown on the monitor after the blackout was captured during the blackout, the screen will thereafter remain black indefinitely. This will be the case if both L and T are less than t2.
The other possibility is that T<t2 and L>t2. In this case the first image shown after the blackout will be a picture of the pre-blackout monitor, ie Figure 4, but it will be followed sooner or later by one or more black images captured during the blackout. What will follow then will be an alternation between images showing Figure 4 and black images. It will look like a stroboscopic Figure 4. The strobe cycle will have period approximately equal to L, and the dark period will have approximately the same length as the blackout, ie t2–t1. In essence, the monitor will indefinitely replay what it showed in the period of length L ending at the end of the blackout.
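The three cases can be checked with a small simulation of the feedback loop – a sketch under the idealised assumptions above, with invented parameter values. Times are in milliseconds, captures happen at t = 0, T, 2T, …, and a frame captured at time t appears on screen at t + L:

```python
from functools import lru_cache

def simulate(L, T, t1, t2, n_frames=12):
    """What the monitor shows at each frame-display time from t = L onwards.

    'FIG4' is the frozen pre-blackout image; 'BLACK' is a black frame.
    The program forces the screen black on [t1, t2); otherwise the screen
    shows the most recently delivered camera frame, and a frame captured
    at time t photographs whatever the screen was showing at t.
    """
    @lru_cache(maxsize=None)
    def shown_at(t):
        if t1 <= t < t2:
            return "BLACK"           # inside the forced blackout
        k = (t - L) // T             # index of the last frame already on screen
        if k < 0:
            return "FIG4"            # before the loop began: the frozen image
        return shown_at(k * T)       # that frame photographed the screen at k*T
    return [shown_at(k * T + L) for k in range(n_frames)]

print(simulate(L=50, T=40, t1=10, t2=30))  # t2 < T: 'FIG4' forever
print(simulate(L=20, T=10, t1=5, t2=30))   # L, T < t2: locks onto 'BLACK'
print(simulate(L=60, T=10, t1=5, t2=30))   # T < t2 < L: strobing mixture
```

The third call shows the stroboscopic behaviour described above: runs of 'FIG4' and 'BLACK' repeating with a cycle of about L, the dark stretch lasting roughly t2−t1.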
This black-screen issue will also arise if the monitor is an old-style, boxy CRT (Cathode Ray Tube) rather than an LCD, Plasma or LED device. CRT screens typically draw around 75 images per second, as bright dots made by shooting electrons at a phosphor-coated screen. In between those drawings, the screen is black. That’s why those screens sometimes appear to flicker, especially when viewed through a video camera.
For a CRT screen, the image captured immediately before the redraw may be Figure 4, or black, or something in between – a partial Figure 4 – with a complex dependency on four parameters: the length of exposure used by the sensor (the shutter speed), the time taken for a single redraw of the CRT screen, the refresh rate, and the frame rate. Unless there were a particularly unusual and fortuitous relationship between those four numbers, the image on the monitor would not be Figure 4. I think instead it would be either just black or an unpredictable mess. But one would need more knowledge of CRT technology than I have to predict that.
Anyway, we don’t want to get bogged down in practical technology. This is principally a thought experiment. And for that ideal situation, we assume an LCD monitor with a computer that, upon receiving a full-screen-mode command, leaves the prior image in place until it is ready to redraw the image on the full screen. And the answer in that situation is Figure 4.
Doing a careful experimental verification of this is beyond me because, amongst other things, I don’t have a camera program that has a full-screen mode. But just for fun, I made a video like the one above under Question 2, where I focused the camera on the area of the monitor that displayed the image, trying to exclude the borders. It wobbles about, partly because of my shaky hands. It is mostly black, but there’s a blue smudge that appears in the lower half and wobbles around. I think that is a degraded version of the regressing images of the lower border. But under such uncontrolled conditions, who knows?
We are now in a position to work out the answer to question 4, which is ‘what will we see after we point the camera to the right of the monitor and then pan left until it exactly points at the monitor, and stop there?‘
To start with, we know that, once the camera has panned to the final position of exactly capturing the image of the entire monitor, it will hold that image indefinitely, and that image will be whatever the monitor was showing immediately before the camera finished panning.
We’ll be a little more precise. A video camera captures a number f of images ('frames') per second, typically 24. The final image shown by the camera will be whatever it captured in the last frame it shot before completing the pan. The nature of that image depends on relationships between the pan speed, the frame rate and the width of the monitor screen, which we will explore shortly. But, to avoid suspense, let’s assume a frame rate of 24, that 32 frames are shot while performing the pan, and that the view to the right of the monitor display area, including the black right-hand frame of the monitor itself, is this (Figure 6):
Then, on completion of the pan, the camera will show, and continue to show indefinitely thereafter, the following image (Figure 7 – note that the image was made by editing, not shot through a camera. My equipment is nowhere near precise enough to do this accurately):
C’est bizarre, non?
Those of you that enjoy exploring intricate patterns may wish to read on, to see the explanation of this phenomenon. I will not be offended if most don’t.
The answer depends on the ratio of the speed at which the camera pans left, to the width of the screen. If we want to be precise, these speeds and widths must be measured in degrees (angles) rather than millimetres. But millimetres are easier to understand so we’ll use them and ignore the slight inaccuracy it introduces (otherwise I’d need to start using words like ‘subtend’, and we wouldn’t want that would we?).
Say the camera rotates at a speed of s mm per second, and that it shoots f frames per second. Hence it shifts its view leftwards by s/f mm per frame. So, if the width of the monitor’s display area is w mm, it shoots N=wf/s frames between when the left-hand edge of its view coincides with the right-hand edge of the monitor’s display area and when the camera is in the final position, where it exactly captures the view of the whole monitor display area. We label the positions of the camera at each of those frames as 0 to N, going from earliest to latest. We need say little about the last frame, because we know that, once the camera is pointing exactly at the monitor, the image will remain fixed on whatever the monitor is showing at that time, which will be what the camera sees in position N.
By the way, we assume that N is an integer, even though in practice it would generally have a fractional part. A non-integer N doesn’t make the calculation significantly more difficult, but it is messier and longer, and the differences are not terribly interesting, so we’ll assume it’s an integer.
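For concreteness, the N = 32 used to produce Figure 7 could arise from, say, a display 480 mm wide, shot at 24 frames per second with a pan speed of 360 mm per second (all three figures invented for illustration):

$$N = \frac{wf}{s} = \frac{480\ \text{mm} \times 24\ \text{s}^{-1}}{360\ \text{mm/s}} = 32 .$$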
Divide the view to the right of the monitor’s display area, as shown in Figure 6 above, into N vertical strips of equal width. Number those strips 1 to N from left to right. We will call these ‘strip-images‘, as each is a tall, thin picture. Next, number the positions the camera has when each frame is shot as follows:
- Frame 0 is when the left-hand edge of the image captured by the camera coincides with the right-hand edge of the monitor display area, so that the camera captures exactly the image of Figure 6, ie strip-images 1 to N. At that time the monitor will be showing what the camera captured one frame earlier, which will be an image made up of strip-images 2 to N+1 (strip-image N+1 is what we can see in an area the same size as the other strip-images, immediately to the right of Figure 6)
- The frames shot after that are labelled 1, 2, etc.
Label the times when the camera is in position 0, 1, 2 etc as ‘time 0’, ‘time 1’, ‘time 2’ etc.
With this scheme, Frame N will be the one that is shot when the camera is in the final position, when it exactly captures what is on the monitor, so that that image remains on the monitor indefinitely thereafter.
The following table depicts what is shown by the monitor and what is captured by the camera at each position, in the situation we used to produce Figure 7, which has N=32.
The rows are labelled by the camera positions/times. The first 32 columns (the ‘left panel’) correspond to the 32 vertical, rectangular strips of the monitor display area, numbered from left to right. The next 32 columns (the ‘right panel’) correspond to the strips of what can be seen to the right of the monitor. The number in each cell shows which strip-image can be seen by looking in that direction. The yellow shading in each row shows what images are captured by the camera at that time, to be shown on the monitor in the next row. (Figure 8):
A few key points of interest are:
- The numbers in the right panel do not change from one row to the next, because rotating the camera does not change what can be seen to the right of the monitor.
- The numbers in the left panel change with each row, to reflect that what was captured by the camera at the previous camera position was different from what was captured at the one before that, because of the camera’s movement.
- The yellow area denoting what the camera captures moves to the left as we go down the table, reflecting the camera’s panning to the left.
There are lots of lovely number patterns in the left panel, which I will leave the reader to explore.
Here is a zoomed-in image of just the left panel for those whose eyes, like mine, have trouble making out small numbers (Figure 9):
Referring back to Figure 7 we see that, as we move from the right side to the left, it has a series of eight vertical images of increasing width. The first one is just the monitor’s right-hand frame – a black plastic strip. The next is twice as wide and has the monitor frame plus the strip-image to its right. The one after that is three times the width, and so on. This corresponds to the last row of Figure 9:
5 6 7 8, 1 2 3 4 5 6 7, 1 2 3 4 5 6, 1 2 3 4 5, 1 2 3 4, 1 2 3, 1 2, 1
I have put commas between each contiguous set of strip-images. I call each such contiguous set a ‘sub-image‘. The first sub-image is incomplete – being 5 6 7 8 instead of 1 2 3 4 5 6 7 8 – because N is not a triangular number.
The time tunnel applies here too, with a slightly different flavour. The newest sub-image is the one on the far right, composed solely of strip-image 1. This was captured from the world outside the monitor one frame period ago. The next, the '1 2', was captured from the monitor one frame earlier, and came from the real world outside the monitor two frame periods ago. The oldest sub-image is the one on the left, which has been through the camera-monitor loop seven times, having first been captured from the real world eight frame periods ago.
It’s quite fun to trace the path of these images as they repeatedly traverse the camera-monitor loop, by following the numbers in the above table. Here’s the lower part of the table showing how the first, third and sixth sub-images from the left (using blue, grey and pink shading respectively) make their way from the real world (to the right of the vertical dividing line), into the camera-monitor loop (to the left of that line), around that loop as many times as needed, and finally to the ultimate static image (Figure 10):
My example with N=32 involves a high panning speed. Shooting 24 frames per second, the pan would need to be completed in 32/24 = 1.33 seconds. One would need extremely good equipment to accomplish that without getting a bounce or wobble when the camera stops at the end of the pan – and avoiding wobble is critical to getting the indefinite static picture we have discussed.
It may be that in order to avoid camera bounce one would need a slower pan, giving us a (perhaps much) higher N. What would be the outcome of that? Well, the right-most sub-image contains only one strip-image and, as we move left, each sub-image contains one more strip-image than the one to its right. So, if the number of sub-images is r, then N will be greater than the sum of the numbers from 1 to r−1 (the (r−1)th triangular number) and no more than the sum from 1 to r (the rth triangular number), ie between (r−1)r/2 and r(r+1)/2. For large N this means that there will be approximately √(2N) sub-images, and the largest will be comprised of about √(2N) strip-images. Since for a display width of w the width of a strip-image is w/N, the widest sub-image will have width about w√(2N)/N = w√(2/N), which gets smaller as N increases. If N=2048, corresponding to a very slow pan time of about 85 seconds, the widest sub-image would be narrower than the frame of my monitor, so all we would see in the final static image would be black plastic monitor frame, something like this (Figure 11):
The bars of light are from the screen reflecting on the shiny monitor frame. Because I couldn’t hold the camera straight, they are bigger at the bottom than at the top.
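Here is that arithmetic as a sketch (the 480 mm display width is the same invented figure as in the worked example above):

```python
import math

def pan_outcome(N, w_mm, fps=24):
    """Sub-image count, widest sub-image width (mm) and pan time (s)
    for a pan that shoots N frames across a display w_mm wide."""
    r = math.ceil((math.sqrt(8 * N + 1) - 1) / 2)  # smallest r with r(r+1)/2 >= N
    widest = r * w_mm / N                          # r strip-images, each w/N wide
    return r, round(widest, 1), round(N / fps, 1)

print(pan_outcome(32, 480))    # -> (8, 120.0, 1.3): Figure 7's eight sub-images
print(pan_outcome(2048, 480))  # -> (64, 15.0, 85.3): only monitor frame visible
```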
I will leave you with my synthesis of the sequence of 32 images that would be seen, given near-perfect equipment, in the 32-step pan that ends with Figure 7 above. They are simply a realisation in pictures of the patterns shown in Figure 9, when applied to the image in Figure 6. The sequence is in a pdf at this address. If you go to page 1 and then repeatedly hit Page Down rapidly, you will see a slow-motion video representation of what it would look like as the camera panned. You may need to download it first in order to be able to view it in single-page view, which is necessary in order to achieve a video-like effect. Alternatively, it is represented below, albeit somewhat more crudely, as a pretend film strip.
Bondi Junction, February 2017
I was listening to a talk by Alan Watts about some aspect of Eastern mysticism. I can’t remember the exact context. I think he was describing the impossibility of truly understanding the nature of one’s own mind. He said that trying to use one’s mind to understand one’s own mind was ‘like pointing the camera at the monitor’.
I was immediately struck by this. Partly I was surprised at his using such a simile, which involves common enough concepts in 2017, in a talk that he gave in the sixties, when computers only existed in large research establishments and occupied enormous rooms. There was certainly no such thing as a webcam back then. I realised later that he probably had in mind a closed-circuit television arrangement, which they did have in the sixties.
But beyond that, I was struck by the fact that it’s actually a very interesting question – what does happen when one points the camera at the monitor? It’s a classically self-referential problem. But unlike some self-referential problems, like the question of the truth of the statement ‘This sentence is false’, it must have a precise answer, because we can point a camera at a monitor, and when we do that the monitor must show something. But what will it show?
There are a number of practical considerations that can lead us towards different types of answers. While each of those considerations leads to an interesting problem in its own right, I tried to remove as many of them as possible to make the problem as close to ‘ideal’ as I could. So here it is.
Imagine we have a computer connected to a monitor and a digital video camera. A webcam is a digital video camera but, since the camera we are imagining here needs to be extremely accurate, a high-quality professional video camera would be more suitable. The monitor uses a rectangular array of display pixels to display an image and the camera uses a sensor that is a rectangular array of light-sensitive pixels, and the dimensions of the display and the sensor, in pixels (not in millimetres), are identical. [i]
On the computer we run a program that shows the image recorded by the camera. The telecommunication program Skype is a well-known such program that can do that, amongst other things. There are also dedicated camera-only programs, which webcam manufacturers typically include on a CD bundled with the webcams they sell. Let’s call our program CamView (not a real program name). We start up CamView on the computer in a non-maximised window, which we’ll call the ‘CamView window’. Then we turn the camera on and point it at the monitor. We aim and focus the camera so precisely that an image of the display area of the monitor fills the image-display area of the CamView window. Ideally this would mean that each pixel on the camera’s sensor is recording an image of the corresponding pixel on the monitor screen. In practice there will be some distortion, but we’ll ignore that for now.
Question 1: what does the monitor show?
Question 2: Next we maximise the CamView window. What does the monitor show now?
Those questions are easy enough to answer, when we remember that the window for any computer program, in default mode, typically has an upper border with tool icons on it, a lower border with status info on it, and sometimes left or right borders as well.
These questions are fairly similar to the question of what one sees when one stands between two parallel, opposing mirrors, as is the case in some lifts (elevators).
Now comes the hard one. In most video-viewing computer programs there is an icon that, upon clicking, maximises the window and removes all borders so that the image-display area occupies the entire display area of the monitor. Call it the ‘full screen icon’ and say that we are in ‘full screen mode’ after it is clicked – until a command is given that terminates that mode and returns to the default mode – ie restores the borders etc. In full screen mode the display area of the monitor corresponds exactly to the images recorded by the camera’s sensor.
Question 3: We now click the full screen icon. Describe what appears on the monitor, and how it changes, from the instant before the icon is clicked, until ten minutes after clicking it – assuming the program remains in full screen mode for that entire time.
That is the difficult one. It took me a while to figure it out, and I was surprised by the answer. It is possible that what I worked out was wrong. If so, I hope that someone will point that out to me.
I have one more question, and it has an even more peculiar answer – one that I found quite charming.
Question 4: Assume the camera is mounted on a very stable tripod. Still in full-screen mode, we pan the camera to the right until it no longer shows any of the monitor. Then we pan the camera back at a constant speed until it again sees only the display area of the monitor, and we stop the panning at that point. What is visible on the monitor after the camera has panned back to the original position? Does that change subsequently? What does it look like ten minutes later? Does the monitor image depend on the panning speed, or on the number of frames per second the camera shoots? If so, how?
In order to avoid spoiling anybody’s fun in trying to work out the answers to these puzzles for themselves, I will not post the answers now; I will post them a little later on. It will also take me a little while to make some nice pictures to help explain what I am talking about.
Bondi Junction, February 2017
[i] Although most camera sensors have a 3:2 aspect ratio, which is different from the 16:9 aspect ratio of most modern computer monitors, it is possible on a sophisticated camera to alter the aspect ratio to 16:9. This is achieved by deactivating the sensor pixels in an upper and a lower band of the sensor, so that the area used to record an image has the required aspect ratio. We’ll assume that is done, and that the number of pixels in the active sensor area equals that on the monitor.