
Saturday, 12 February 2022

Completely Incomplete

If there's one thing above all others I've learned over decades of lay study of science, it's that the universe really doesn't do boxes. Chimps in couture love 'em, though. We can't help ourselves. We go blithely about sticking things in containers, impressing ourselves immensely with our ability to classify, constrain and label. 

To be fair, they can be incredibly useful. A good box with a well-defined boundary can... But hang on, maybe I need to talk about what I mean by 'boxes'.

A box is a boundary. A binary; in/out; yes/no; 1/0. It's a definition. In fact, 'definition' is pretty much the entirety of what I'm trying to convey here, and I could just as easily write the same post using that term, but picture a physical box anyway, as thinking of it as a constraint will have some value.

In the jargon of the mathematician, we can most closely relate it to 'set' while, in the jargon of the physicist, we'd more closely relate it to 'system'. In biology, the nearest term is 'clade' (any species and all its progeny). In reality, what they are is patterns we perceive and the stories we tell ourselves about them. Any time you see an obviously important noun in what follows, think 'box'. 

This suite of pattern-seeking and categorisation skills is, as with many of our abilities, traceable to our deep evolutionary history, and is a critical component of our survival strategy as a species. The ability to spot a pattern and stick it into a box deprived more than one predator of an entrée, and is the essential foundation of the modern world. They're key to developing good intuitions, allowing us to categorise variables and garner expertise*. We have to be a little careful though, because these boxes don't only constrain the things we put in them, they also massively constrain the way we think about the things we've put in them. 

Still, there's one box of particular interest, and I'm going to label it 'what I know'. I'm not referring here to my specific example, this is a Platonic 'what I know' box, the essential 'what I know', of which each of us has their own perception. This box is, for each of us, the wellspring from which our intuitions derive, and it's an incredible tool, but there are some problems. The first is a problem in specifying boxes. Let's start with a story:

Pierre-Simon de Laplace was a French polymath - some cite him as a prime candidate for 'the last man who knew everything' - whose contributions to the sum of human knowledge were vast and broad-reaching, spanning mathematics, physics, engineering, probability and statistics... he was the Dave Grohl of modern science.

He also famously said something we would recognise as our classical picture of how the world works. In 1814 after reading Newton's masterwork, he remarked if one could grasp the positions and momenta of all the particles in the universe, coupled with knowledge of all the forces and the wit to analyse them robustly, one could "embrace in a single formula the movements of the greatest bodies of the universe and those of the tiniest atom; for such an intellect nothing would be uncertain and the future just like the past would be present before its eyes."[1]

This was probably the first properly scientific expression of what we'd now recognise as 'causal determinism'. There's a problem lurking in there, though (there are many, but this is the most relevant for our purpose here), and it's all to do with specifying a box, this one labelled 'system'.

In classical mechanics, the configuration of a system is completely specified with two pieces of information - position and momentum (\(x\) and \(p\) respectively)*. If you know these two quantities, you can wind forward and backward in time and make accurate predictions about everywhere it's been and everywhere it will go and when. This is extremely intuitive, which is hardly surprising because, miraculous as it is, the macroscopic world behaves very differently to the real world, to such a degree as to have taken several colossal revolutions in thought merely to furnish us with the tools required to notice the real world was there!
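To make 'wind forward and backward in time' concrete, here's a minimal sketch in Python. It's a toy of my own rather than anything from a physics library: the names `State` and `evolve`, the unit mass and the constant force are all assumptions, chosen to keep the arithmetic simple.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    """Hypothetical container for the two numbers that specify the system."""
    x: float  # position
    p: float  # momentum


def evolve(state, t, mass=1.0, force=0.0):
    """Evolve a particle under a constant force for time t (t may be negative)."""
    x = state.x + (state.p / mass) * t + 0.5 * (force / mass) * t * t
    p = state.p + force * t
    return State(x, p)


start = State(x=0.0, p=2.0)
later = evolve(start, 3.0, force=-1.0)   # wind forward three time units
back = evolve(later, -3.0, force=-1.0)   # wind back again

print(later)
print(back == start)
```

Given nothing but \(x\) and \(p\) (and the force law), running the clock backwards lands us exactly where we started. That's the whole of Laplace's boast in miniature.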

In quantum mechanics, it isn't possible to completely specify a system, if we're operating on this definition of 'specification'. Why? Because we can't know both the position and momentum of a particle at the same time. Know, you say? In fact, it doesn't even make sense to talk about them existing at the same time. 
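For the record, the usual quantitative statement of this limit is Heisenberg's uncertainty relation. The symbols \(\Delta x\) and \(\Delta p\) (the spreads in position and momentum) and \(\hbar\) (the reduced Planck constant) are notation I'm introducing just for this aside:

\[ \Delta x \, \Delta p \ge \frac{\hbar}{2} \]

Squeeze one spread towards zero and the other has to grow without limit to keep the product above the bound.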

There's a wonderful analogy, originally formulated by Bertrand Russell for a completely unrelated purpose, which I decided to modify slightly here for reasons I hope will become apparent later, as I begin to tie the disparate threads of this post together.

You can liken the differences between classical mechanics and quantum mechanics to the difference between shoes and socks. We'll take an observable quantity - angular momentum about the \(x\) axis - and another - angular momentum about the \(y\) axis. We'll call these quantities \(l\) and \(r\) respectively.

Look at a pair of shoes. It's pretty simple to label them \(l\) and \(r\). They have intrinsic \(l\) and \(r\) -ness. Their 'chirality' (handedness) is an objective fact about them. One fits the left foot, the other fits the right foot.

What about socks? They're a little trickier. They have no intrinsic chirality. In fact, it doesn't make any sense to talk about their chirality until an observation is made. In physics, it means we need an interaction. Of course, we have very specific ways to interact with socks, and their chirality comes about only as a result of these interactions. The interaction changes them and, once the interaction is complete, they lose their chirality and go back to being non-quantities. This is directly analogous to measuring the position of a 'particle'. As soon as your 'measurement' is complete, you have no idea of where it is or how fast it's moving. Of course, if you have sweaty feet and don't change your socks often, they lose elasticity and fail to return to shape, so the analogy falls apart, much like your socks. 

Yes, yes. I know. I always stretch analogies too far. This is, however, how our \(x\) and \(p\) provably and measurably behave in the real world**. Quantum entities behave far more like socks than they do shoes. One could argue, in fact, shoeness is an emergent property arising from the behaviour of socks. I won't go further here, except to note the existence of better analogies, some of which I've employed in more complete posts about quantum theory. Links in the usual place.

Suffice it to say for succinctness quantum mechanics requires a different definition for 'specification'. We have to modify our expectations of what we can 'know'. Since we can't specify \(x\) and \(p\), we have to find something else to work with. I'm not going to discuss the wavefunction (\(\Psi\)) here, not least because I have plans to deal with it in some considerable detail in future posts. What matters is having a way to completely specify a system so it can be evolved in time and space in a way which allows us to make predictions.

"But this is all physics, what about mathematics?" say you. Damn, I was hoping you wouldn't notice. Ah, well. Here goes, then.

What the above should demonstrate reasonably well is this; assumptions are dangerous to the progress of knowledge. One might argue, in fact, the most formidable barrier to knowledge is the certainty we already possess it. This is what our assumptions really represent unless we learn to treat them with care. Not for nothing is the central aim of philosophy 'how to identify presupposition failure', as defined by one professor of philosophy of my acquaintance.

My friend Phil Scott, mathematician and computer scientist, wrote a wonderful piece here about the foundations of mathematics as they shifted from geometry to set theory. He tells us:

An exodus from geometry occurred in the 18th century, and was completed when the geometric foundation itself fell into total crisis. Remember the four axioms from Euclid I gave above? I missed out the fifth, not simply because it is an overly complex axiom, but because its status as axiom was in question from the early commentaries of Euclid. It is an axiom governing parallel lines, but perhaps the most familiar of its consequences is that the angles of a triangle add up to 180 degrees.
With the axiom’s status in doubt, several mathematicians tried to show its necessity by showing that its denial would be commitment to absurdity. In the late 18th century, they began exploring strange and speculative worlds in which the angles of a triangle do not sum to 180 degrees.

And, as a result...

The solid foundations of geometry were replaced with a fluid space of infinite geometries, a shifting sand that was no place to build the rest of mathematics.

The point is, absent Euclid's axioms, mathematics had no axiomatic foundation. This is a bit of a problem for an axiomatic-deductive system of logic, as it can't be difficult to appreciate. Enter set theory.

Of course, there's more to the story than a simple cape-enshrouded leap onto the stage to save the day, as this glib comment might imply. I recommend reading Phil's delightful piece, as it provides some critical context and is a thing of profound beauty.

In a nutshell, set theory is the idea things can be gathered into groupings called 'sets' based on shared characteristics. It's a source of some amusement to me the word 'set' has the largest set of discrete definitions of any word in the English language, at something like 430 entirely different meanings. You might say the set S of all statements defining the word 'set' has 430 members (and none of them pay the greens fees, the bastards...)

Set theory owes much of its existence to Georg Cantor. He was wrestling with the infinite and, in particular, the notion there could be different "sizes" of infinity. This might seem odd, but it's not incredibly difficult to see with an example. Take two sets of numbers, \(\mathbb{Z}\) and \(\mathbb{R}\). I can tell you all sorts of things about them but the most interesting is how large they are. I chose these particular terms for a reason; they're conventionally used to define two sets of numbers.

\(\mathbb{Z}\) is the integers. The integers are all the whole numbers from minus infinity to infinity (\(-\infty,\infty\)). It's conventional at this point in any explanation to show a plot on a line to make this intuitive, so let's do that. Here's a snippet of the number line for set \(\mathbb{Z}\):

No real surprises to be had there. Now let's look at an example of the set \(\mathbb{R}\):

As we should be able to see, the set \(\mathbb{R}\) contains not only all the integers, but also all the numbers between them. It contains all the rational numbers (numbers expressible as the ratio between two integers, a.k.a. fractions) and all the irrational numbers (numbers not expressible as such a ratio; pi (\(\pi\)) is an example, as is any number whose decimal expansion goes on forever without repeating). In fact, for any two numbers you choose at random, there is an infinite number of numbers between them. \(\mathbb{R}\) is the real numbers.

To a non-mathematician, the idea of different-sized infinities sounds insane, even after more than a century of trans-finite mathematics, but just these two core examples suggest it must be true. It seems intuitively obvious, even to a non-mathematician, the sets \(\mathbb{Z}\) and \(\mathbb{R}\) are not the same size, despite both being infinite. The former set is quite obviously a subset of the latter, with infinitely many numbers left over, so it's hard to see how they could possibly be the same size. (As we'll see, this intuition gets the right answer here, but it isn't a safe guide in general.)

This was a problem, and we needed a way to treat it mathematically. Cantor had a solution, though. He introduced the notion of 'cardinality of infinites'. As the name might suggest, what he introduced was a way of counting with infinities of different sizes. The cardinal numbers are the 'counting' numbers, non-negative integers, or 'natural numbers' (\(\mathbb{N}\)), to use the proper term. 

The cardinality of a set is simply the number of members. This is defined in terms of bijection - essentially direct comparison to other sets. In the jargon, two sets have the same cardinality if there is a direct one-to-one mapping, one to the other. For example, the little pigs in a popular children's story and the blind mice in a nursery rhyme have the same cardinality; 3. There's a direct bijection between the mice and the pigs in these sets.
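The nice thing about bijection is you never actually need to count. Here's a minimal sketch in Python; the function name `same_cardinality` and the character names are my own inventions for illustration.

```python
def same_cardinality(a, b):
    """Pair off members of a and b one at a time, without ever counting them."""
    a, b = list(a), list(b)
    pairs = []
    while a and b:
        pairs.append((a.pop(), b.pop()))
    # A bijection exists exactly when both collections run out together.
    return (not a and not b), pairs


pigs = ["straw pig", "stick pig", "brick pig"]
mice = ["first mouse", "second mouse", "third mouse"]

matched, pairs = same_cardinality(pigs, mice)
print(matched)
print(pairs)
```

If one collection runs dry while the other still has members, there's no bijection and the cardinalities differ. The same pairing-off idea is what lets us compare sets far too large to count.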

Another important term to introduce is 'countable'. This is a source of some confusion, because infinities are uncountable under any vernacular definition of 'countable'. The meaning here is very specific, though. An infinity is countable if and only if there is a direct one-to-one mapping with the natural numbers. It's an in-principle countability, rather than a practical one; a countability of orderliness, if you like.

The cardinality of the set \(\mathbb{N}\) of all natural numbers is denoted Aleph-nought (\(\aleph_0\)). This is the smallest infinite cardinality. The set \(\mathbb{Z}\) also has cardinality \(\aleph_0\), as it has a direct one-to-one mapping with \(\mathbb{N}\). This might seem counter-intuitive, given \(\mathbb{N}\) looks like only 'half' of \(\mathbb{Z}\), but infinite sets routinely behave this way; Cantor proved the cardinality of any unbounded subset of \(\mathbb{N}\) is the same as the cardinality of \(\mathbb{N}\) itself.
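One such mapping between \(\mathbb{N}\) and \(\mathbb{Z}\) is the standard 'zigzag', sending the even naturals to the non-negative integers and the odd naturals to the negatives. Here's a sketch in Python; the function names are mine, and I'm assuming \(\mathbb{N}\) starts at 0.

```python
def nat_to_int(n):
    """Zigzag: 0, 1, 2, 3, 4, ... maps to 0, -1, 1, -2, 2, ..."""
    return n // 2 if n % 2 == 0 else -((n + 1) // 2)


def int_to_nat(z):
    """Inverse of nat_to_int, so every integer is hit exactly once."""
    return 2 * z if z >= 0 else -2 * z - 1


first_few = [nat_to_int(n) for n in range(7)]
print(first_few)
```

Every integer gets exactly one natural number and vice versa, which is all 'same cardinality' means.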

What about \(\mathbb{R}\)? You might well ask...

Unsurprisingly, there is no one-to-one mapping between \(\mathbb{R}\) and \(\mathbb{N}\). \(\mathbb{R}\) has a strictly larger cardinality. In fact, the cardinality of the reals has a special place and a special designation, \(\mathfrak{c}\). Its calculated cardinality is \(2^{\aleph_0}\) and, if we accept a hypothesis forwarded by Cantor, this is the same as \(\aleph_1\), the next infinite cardinal after \(\aleph_0\). This 'continuum hypothesis' is as follows: there is no set whose cardinality lies strictly between that of the natural numbers and that of the reals. This hypothesis hasn't been proven, but it has some important consequences. It was the first of the problems forwarded by David Hilbert when laying out his vision of how mathematics should progress in the coming century. Hilbert's vision will have some import, so we'll come back to it shortly. 
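The standard way to see there's no such mapping is Cantor's diagonal argument: hand me any list of infinite sequences of digits and I can build a sequence which differs from the first entry in its first digit, from the second in its second digit, and so on, so it can't be anywhere on the list. The sketch below is a finite toy version in Python using 0s and 1s; the function name and the sample listing are my own assumptions, and the real argument applies the same trick to infinite lists.

```python
def diagonal_escape(rows):
    """Build a 0/1 sequence differing from rows[i] at position i, for every i."""
    return [1 - rows[i][i] for i in range(len(rows))]


listing = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]

missing = diagonal_escape(listing)
print(missing)
print(missing in listing)
```

However cleverly the listing is drawn up, the diagonal sequence disagrees with every row somewhere, so no list indexed by \(\mathbb{N}\) can ever be complete.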

So we have our arithmetic foundation, and all seems well with the world, but there's a problem lurking in there. It's not immediately obvious, but it's a biggie, and it has a lot to do with the fact, other than some bits of additional notation, this set theory is constructed in natural language. Regulars here will already be hearing cognitive klaxons sounding, because we've looked in excruciating detail at the pitfalls of natural language, where misunderstandings can arise, where ambiguity creeps in...

Enter the Mad Hatter. 

By Mad Hatter, of course, I mean the great Bertrand Russell, who has often been likened to Tenniel's illustration of the Mad Hatter in Carroll's early editions†.

What if, he reasoned, you constructed a set of all sets not members of themselves? Is this set a member of itself? If it isn't, it meets the entry requirement, so it would have to be a member, but then it would be a member of itself, so it couldn't be, which meant it had to be included... and so on. Is your head exploding yet? 
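You can watch the loop happen in a few lines of Python. This is only a loose analogy of my own devising: I'm modelling a 'set' as a function which answers yes or no to 'is this thing a member?', and the names `russell`, `empty` and `verdict` are hypothetical.

```python
def russell(s):
    """Membership test for the 'set' of all sets which are not members of themselves."""
    return not s(s)


def empty(s):
    """The empty set: nothing is a member, including itself."""
    return False


print(russell(empty))  # the empty set isn't a member of itself, so it's in

try:
    russell(russell)  # is Russell's set a member of itself?
    verdict = "settled"
except RecursionError:
    verdict = "never settles"

print(verdict)
```

Asking the question of ordinary sets is harmless. Asking it of Russell's set itself sends the interpreter chasing its own tail until it gives up.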

Well, this little paradox comprehensively exploded set theory as it stood, leaving arithmetic once again without a foundation. We needed something consistent - free of contradictions. We needed something complete. 

This was the meat of the second of Hilbert's problems, namely to prove the axioms of arithmetic consistent.

The obvious route to progress was to reconstruct set theory. It had many incredibly useful features. We just had to be careful about our definitions to remain free of paradox. One way to do this is to formalise it, to remove the potential ambiguities of natural language. Russell did some work on this himself, founding 'type' theory, still used in some settings today (such as Church's \(\lambda\) calculus). Ernst Zermelo constructed a system which, once some inconsistencies were ironed out with input from Abraham Fraenkel, is now pretty much universally recognised as the foundation of arithmetic. 

We're not entirely out of the woods yet, though. There are still some potential issues lurking, even with Zermelo-Fraenkel (ZF) set theory. First, there's an axiom associated with ZF historically seen as controversial. It's known as the 'axiom of choice' (AC), and it was formulated by Zermelo in his proof of his 'well-ordering' theorem, the idea every set can be well-ordered (in context, this is in fact logically equivalent to AC). In a nutshell, it says it's possible to select exactly one item from each of a collection of boxes each containing at least one item, even if the number of boxes is infinite. For the most part, this is fairly uncontroversial. 

We can talk again about shoes and socks here and, in fact, this is the original setting of this analogy. If your boxes contain shoes, the assertion some rule can be applied to selecting one from each box is trivial, to the extent one would think having an axiom saying you can make a selection might seem a bit silly. But what about socks?

Well, with socks, we don't actually have any means of distinguishing between them. Here's where the axiom of choice comes into play. In essence, the axiom of choice simply asserts the existence of some applicable criterion, regardless of whether we actually have access to any such criterion. 
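Stated a little more formally, and with the caveat the index set \(I\), the boxes \(X_i\) and the choice function \(f\) are symbols I'm introducing just for this aside, the axiom says:

\[ \left( \forall i \in I,\; X_i \neq \varnothing \right) \;\Longrightarrow\; \exists f : I \to \bigcup_{i \in I} X_i \ \text{ such that } \ f(i) \in X_i \ \text{ for all } i \in I \]

For shoes we can write \(f\) down explicitly ('always take the left one'). For socks, the axiom only promises some such \(f\) exists, without telling us what it is.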

Extending this a little to the notion of cardinality, think about how you'd go about ordering the set \(\mathbb{R}\). Bearing in mind it includes all decimal expansions, all fractions and every possible number on the number line, and between any two numbers there is an infinite number of other numbers, you begin to see why imposing the axiom of choice is useful, and also its relationship to the well-ordering theorem. It also gives a flavour of what it means for an infinite set like \(\mathbb{R}\) to be uncountable: however you try to line its members up against the natural numbers, some always escape the list.

You can see why this axiom might make some mathematicians uneasy. At this point, it's unclear whether AC is entirely free of paradox, but ZFC (Zermelo-Fraenkel set theory with the axiom of choice) is widely accepted as the foundation of arithmetic. We can breathe easy.

Or can we? Let's look again at what we needed...

We needed something consistent - free of contradictions. We needed something complete.

So, is ZFC consistent? It appears to be. Is it complete? What do we mean by complete anyway?

Here's a definition: A set of axioms is syntactically complete if any statement rendered in the language of the set can be proved or disproved with the axioms of the set alone.

Let's take an example we've met before. Euclidean geometry.

We can think of Euclid's postulates as a set of axioms. 

1. To draw a straight line from any point to any point.

2. To produce a finite straight line continuously in a straight line.

3. To describe a circle with any centre and distance.

4. That all right angles are equal to one another.

5. That, if a straight line falling on two straight lines makes the interior angles on the same side less than two right angles, the two straight lines, if produced indefinitely, meet on that side on which are the angles less than the two right angles.

We know the fifth postulate to be problematic. We can construct - and demonstrate the existence in the real world of - geometries in which this postulate simply doesn't hold; non-Euclidean geometries, in other words. General Relativity, for example, describes just such a geometry for spacetime. We can traverse a right-angled triangle whose internal angles do not sum to 180 degrees; whose internal angles are, in fact, all right. Start at the North Pole, walk South to the equator, turn 90 degrees and walk East a quarter of the way around the world, then turn North and walk back to the Pole. You've just traversed a right-angled triangle with three right angles and three identical hypotenuses††. Eat that, Pythagoras! 
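The arithmetic behind this is pleasingly short. On a sphere of radius \(R\), the angles \(\alpha\), \(\beta\) and \(\gamma\) of a triangle of area \(A\) obey Girard's theorem (the symbols here are my own labels, not used elsewhere in this post):

\[ \alpha + \beta + \gamma = \pi + \frac{A}{R^2} \]

A triangle with a corner at the pole and its other two corners on the equator, a quarter of the way around the world apart, covers one eighth of the sphere, so \(A = \frac{4\pi R^2}{8} = \frac{\pi R^2}{2}\) and

\[ \alpha + \beta + \gamma = \pi + \frac{\pi}{2} = \frac{3\pi}{2} = 270^\circ \]

which is three right angles. Shrink the triangle and \(A/R^2\) heads towards zero, recovering Euclid's 180 degrees, which is why nobody notices on the scale of a football pitch.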

Of course, if we remove the fifth postulate, Euclid's axioms are incomplete, because the fifth postulate is a statement expressed in the language of the set which can neither be proved nor disproved from the remaining axioms alone.

So what about ZFC? Are there any statements expressed in its language which can't be proved or disproved from the axioms of ZFC alone?

There are, and we've already met one; the continuum hypothesis. In fact, the proof the continuum hypothesis could be neither proven nor disproven with the axioms of ZFC came in two stages. The proof it couldn't be proven came in 1963 from Paul Cohen, who further proved both the continuum hypothesis and the axiom of choice were in fact independent of ZF theory. The proof it couldn't be disproven (again, along with the axiom of choice) came earlier in 1940 from somebody who I'm gently wending my way towards, because he's going to drop a bomb any minute.

So Russell cracked on with type theory, and set about resolving Hilbert's second problem. He constructed, along with Alfred North Whitehead, a three-volume work, the Principia Mathematica, a ground-up, complete and consistent axiomatisation and formalisation of mathematics, published between 1910 and 1913. At least, that's what it said on the tin (maybe; we'll come back to this). There was a villain lurking in the wings, and his name was uncertainty.

What we've encountered above are iterations of Gödel's First Incompleteness Theorem. 

Properly stated:

If a (logical or axiomatic formal) system is omega-consistent, it cannot be syntactically complete.

We have to be careful here, because Gödel's Incompleteness Theorems are the subject of an awful lot of bullshit. Indeed, my first foray into learning about this fascinating area of thought stemmed from Phil Scott's gentle correction of my misunderstanding of it. The most important factor to be kept front and centre is domain of applicability and, in Gödel's theorems, it is tightly defined. All else aside, omega-consistency is a property only applicable to statements about natural numbers.

Gödel's Second Incompleteness Theorem is an extension of the first, and it states: 

The consistency of axioms cannot be proved within their own system.

In other words, there is a fundamental barrier to the existence of a complete and consistent set of axioms for mathematics. We can now imagine exhausting the limits of human imagination, yet we're prohibited from ever standing in a place from which we can say our mathematics is complete and consistent, because there are limits to where we can stand - without even considering the notion of whether such a place might be one of a pair of conjugate variables, meaning to even speak of its existence makes no sense.

I'm reasonably confident there are things I've gotten wrong here. All else aside, the domain of applicability of Gödel's Incompleteness Theorems is very narrow, and it's important not to apply this too broadly. Moreover, some of what I said about Russell and Whitehead quite possibly doesn't withstand rigorous scrutiny. Some mathematicians argue compellingly at least part of the motivation in constructing Principia was to show the expectation of completeness and consistency unreasonable. I've certainly been very suggestive in my use of language, and led you to a way of thinking about some boxes, and it's motivated by what comes next, as we go back to the very beginning. I want to look more closely at the box we labelled 'what I know', and cast an eye over it from the perspective of the one category Rumsfeld didn't invoke in his exposition of the fundamental tautologies of epistemology in the header; unknown knowns.

OK, that's going to cause some mental backflips, so let me add some stability. We each have this box we label 'what I know'. It's the place from which all our intuitions derive. As we develop expertise in a given domain, every new variable we identify and place in this box, and every solution to those variables we derive, all fuel our intuition and drive us toward better, more considered conclusions. This is, in fact, what it means to have expertise; to understand the variables in a given domain to a high degree. This box drives our thinking enormously. 

But then we interface with the boxes of other people, people who... know more. This is what I'm referring to when I talk about unknown knowns. The things we know that I don't know. The higher degree of understanding the variables. And here's the problem, of course, because my 'what I know' box for any given domain undoubtedly contains inferences in it which, because I lack some understanding of critical variables, seem perfectly consistent to me, because I literally can't think outside the box.

To invoke a favourite example, our collective 'what I know' box contained conclusions and inferences which stood the test of time, right until they didn't. Newton's Law of Universal Gravitation was the final word on gravity - even knowing the precession of Mercury's perihelion didn't match predictions - right until we discovered it was wrong. The discrepancy was tiny, amounting to only 43 seconds of arc per century, and many hypotheses were erected to explain it (this is the source of the planet Vulcan, in a sense). Einstein eventually showed it was just wrong, a reasonable approximation in a limited domain, but what did he really do? He took Newton's 'what I know' box and picked it apart. He identified two things which, although absolute and immutable in Newton's schema and fundamental to the propagation of gravity, were in fact variables, and interdependent variables at that. He showed Newton's box was inconsistent, and the inconsistency arose from not properly treating critical variables, confounders with a demonstrable impact on observables. He brought new expertise to the table, along with better intuitions and a considerably larger collective box.

And there's a lesson in thought here for us all. Sometimes, in order to understand the inconsistency in the conclusions you draw, you have to be able to think outside the box. All our conclusions seem perfectly consistent to us at all times. It could hardly be otherwise since, if we could spot the inconsistency, we wouldn't draw the conclusions. Sometimes, we have to acknowledge the unknown knowns, and recognise others have greater understanding of the variables; greater expertise; a bigger box. It should also pose a question we should always be prepared to ask ourselves when our intuition tells us something the experts in a given field accept as true is wrong; what variables are they aware of that I'm not? What do they know that I don't?

But we know that, right? Completely.


[1] A Philosophical Essay on Probabilities - Laplace 1814


Further reading:

Gödel's Theorem: An Incomplete Guide to Its Use and Abuse - Torkel Franzen. This is a truly stupendous book, a reasonably comprehensive exposition of Gödel's Incompleteness Theorems that deconstructs a lot of the nonsense. I can't recommend this highly enough.

Definitions and Axioms - A wonderful potted history of mathematics by Phil Scott.

Very Able - Why expertise is fundamentally about recognising and understanding variables. A fun story from the history of astrophysics.

Where Do You Draw the Line - How expertise changes your perception and improves intuition. A thought experiment.

Well, Blow Me Down! - When even really well-developed intuitions fail.

The Certainty of Uncertainty - Heisenberg's Uncertainty Principle and its implications.

Give Us A Wave! - Waves, waves everywhere. Waves in quantum mechanics, light and sound.

Did You See That? - Consciousness and the observer effect - socks and shoes in the real world. Why observation is about interaction, not consciousness.

The Map is not the Terrain - Semantics and the pitfalls of natural language.


*Momentum is a vector quantity, which means it contains information about both magnitude and direction. Properly, it's a composite quantity, the product of mass and velocity (\(p=mv\)).

**And not just particles, either. Quantum effects have been measured in very large objects, including a 20 kilo mirror in a hypersensitive pendulum experiment.

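†Russell's likeness to the Hatter can only be a happy accident. Carroll, himself a mathematician and logician, published Alice's Adventures in Wonderland in 1865, seven years before Russell was born, and died in 1898, three years before Russell hit upon his paradox. It's tempting to read Carroll's absurdist fiction as a means of railing against absurdities arising in mathematical logic, but Russell's Paradox can't have been among its motivations.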

††In fact, this isn't quite true of Earth's surface because of a complication arising from physics. Because of the interplay of forces (pseudoforces, more accurately) the Earth is slightly oblate; it's wider around the equator than it is around the poles. Because of this, a quarter of the equator is slightly longer than the distance along a meridian from pole to equator, thus the two sides of such a right triangle running North and South would actually be slightly shorter than the East-West side. The right-angled equilateral described here only applies on a perfect sphere.
