Not too long ago I wrote about entropy, and what has come to be known as Maxwell’s demon – a hypothetical creature, invented in 1871 by James Clerk Maxwell. The creature was the product of a thought experiment meant to explore the possibility of violating the second law of thermodynamics using information to impede entropy (otherwise known as the gradual but inevitable decline of everything into disorder). The only reason the demon succeeds in stopping the gradual decline of order is that it has information that can be used to rearrange the behavior of molecules, information that we cannot acquire from our perspective. That post was concerned with the surprising reality of such a creature, as physicists have now demonstrated that it could be made physical, even mechanical, in the form of an information heat engine or in the action of light beams.
I read about the demon again today in a Quanta Magazine article, How Life (and Death) Spring From Disorder. Much of the focus of this article concerns understanding evolution from a computational point of view. But author Philip Ball describes Maxwell’s creature and how it impedes entropy, since it is this action against entropy that is the key to this new and interesting approach to biology, and to evolution in particular.
Once we regard living things as agents performing a computation — collecting and storing information about an unpredictable environment — capacities and considerations such as replication, adaptation, agency, purpose and meaning can be understood as arising not from evolutionary improvisation, but as inevitable corollaries of physical laws. In other words, there appears to be a kind of physics of things doing stuff, and evolving to do stuff. Meaning and intention — thought to be the defining characteristics of living systems — may then emerge naturally through the laws of thermodynamics and statistical mechanics.
In 1944, Erwin Schrödinger approached this idea by suggesting that living organisms feed on what he called negative entropy. And this is exactly what this new research is investigating – namely the possibility that organisms behave in a way that keeps them out of equilibrium, by extracting work from the environment with which they are correlated, and this is done by using information that they share with that environment (as the demon does). Without using this information, entropy, or the second law of thermodynamics, would govern the gradual decline of the organism into disorder and it would die. Schrödinger’s hunch went so far as to propose that organisms achieve this negative entropy by collecting and storing information. Although he didn’t know how, he imagined that they somehow encoded the information and passed it on to future generations. But converting information from one form to another is not cost-free. Memory storage is finite and erasing information to gather new information will cause the dissipation of energy. Managing the cost becomes one of the functions of evolution.
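The energy cost of erasure mentioned above has a well-known lower bound in physics, usually called Landauer's principle: erasing one bit of information dissipates at least k_B T ln 2 of energy. The sketch below only evaluates that bound. The function name is hypothetical, and the temperature of 300 K is an assumption chosen for illustration, not a figure from the article.

```python
import math

# Boltzmann constant in joules per kelvin (exact value in the 2019 SI).
BOLTZMANN_J_PER_K = 1.380649e-23

def landauer_bound_joules(bits_erased, temperature_kelvin):
    """Minimum energy dissipated when erasing the given number of bits."""
    return bits_erased * BOLTZMANN_J_PER_K * temperature_kelvin * math.log(2)

# Assumed temperature for illustration: roughly room temperature.
cost_per_bit = landauer_bound_joules(1, 300.0)
print(f"Erasing one bit at 300 K costs at least {cost_per_bit:.3e} J")
```

The number is tiny for a single bit, but it is never zero, which is why a finite memory that must keep overwriting itself always pays something for new information.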
According to David Wolpert, a mathematician and physicist at the Santa Fe Institute who convened the recent workshop, and his colleague Artemy Kolchinsky, the key point is that well-adapted organisms are correlated with that environment. If a bacterium swims dependably toward the left or the right when there is a food source in that direction, it is better adapted, and will flourish more, than one that swims in random directions and so only finds the food by chance. A correlation between the state of the organism and that of its environment implies that they have information in common. Wolpert and Kolchinsky say that it’s this information that helps the organism stay out of equilibrium — because, like Maxwell’s demon, it can then tailor its behavior to extract work from fluctuations in its surroundings. If it did not acquire this information, the organism would gradually revert to equilibrium: It would die.
Looked at this way, life can be considered as a computation that aims to optimize the storage and use of meaningful information. And life turns out to be extremely good at it.
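The claim that a correlation between organism and environment means they "have information in common" can be made concrete with mutual information. The sketch below is only an illustration: the probabilities are hypothetical numbers I have made up for the swimming bacterium, and the function is a generic textbook calculation, not anything taken from Wolpert and Kolchinsky's work.

```python
import math

def mutual_information(joint):
    """Mutual information in bits for a joint distribution given as
    {(environment_state, organism_state): probability}."""
    env_marginal = {}
    org_marginal = {}
    for (env, org), p in joint.items():
        env_marginal[env] = env_marginal.get(env, 0.0) + p
        org_marginal[org] = org_marginal.get(org, 0.0) + p
    total = 0.0
    for (env, org), p in joint.items():
        if p > 0:
            total += p * math.log2(p / (env_marginal[env] * org_marginal[org]))
    return total

# Hypothetical numbers: food is equally likely to be on the left or the right.
# The "tracking" bacterium swims toward the food 90% of the time.
tracking = {
    ("food_left", "swim_left"): 0.45,
    ("food_left", "swim_right"): 0.05,
    ("food_right", "swim_left"): 0.05,
    ("food_right", "swim_right"): 0.45,
}
# The "random" bacterium picks a direction regardless of where the food is.
random_swimmer = {
    ("food_left", "swim_left"): 0.25,
    ("food_left", "swim_right"): 0.25,
    ("food_right", "swim_left"): 0.25,
    ("food_right", "swim_right"): 0.25,
}

mi_tracking = mutual_information(tracking)
mi_random = mutual_information(random_swimmer)
print(f"tracking bacterium: {mi_tracking:.3f} bits shared with the environment")
print(f"random bacterium:   {mi_random:.3f} bits shared with the environment")
```

With these made-up numbers the tracking bacterium shares roughly half a bit with its environment, while the random swimmer shares none, which is the sense in which the well-adapted organism "knows" something about where the food is.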
This correlation between an organism and its environment is reminiscent of the structural coupling introduced by biologist H.R. Maturana which he characterizes in this way: “The relation between a living system and the medium in which it exists is a structural one in which living system and medium change together congruently as long as they remain in recurrent interactions.”
And these ideas do not dismiss the notion of natural selection. Natural selection is just seen as largely concerned with minimizing the cost of computation. The implications of this perspective are compelling. Jeremy England at the Massachusetts Institute of Technology has applied this notion of adaptation to complex, nonliving systems as well.
Complex systems tend to settle into these well-adapted states with surprising ease, said England: “Thermally fluctuating matter often gets spontaneously beaten into shapes that are good at absorbing work from the time-varying environment.”
Working from the perspective of a general physical principle –
If replication is present, then natural selection becomes the route by which systems acquire the ability to absorb work — Schrödinger’s negative entropy — from the environment. Self-replication is, in fact, an especially good mechanism for stabilizing complex systems, and so it’s no surprise that this is what biology uses. But in the nonliving world where replication doesn’t usually happen, the well-adapted dissipative structures tend to be ones that are highly organized, like sand ripples and dunes crystallizing from the random dance of windblown sand. Looked at this way, Darwinian evolution can be regarded as a specific instance of a more general physical principle governing nonequilibrium systems.
This is an interdisciplinary effort that brings to mind a paper by Virginia Chaitin which I discussed in another post. The kind of interdisciplinary work that Chaitin is describing involves the adoption of a new conceptual framework, borrowing the very way that understanding is defined within a particular discipline, as well as the way it is explored and the way it is expressed in that discipline. Here we have the confluence of thermodynamics and Darwinian evolution made possible with the mathematical study of information. And I would caution readers of these ideas not to assume that the direction taken by this research reduces life to the physical laws of interactions. It may look that way at first glance. But I would suggest that the direction these ideas are taking is more likely to lead to a broader definition of life. In fact there was a moment when I thought I heard the echo of Leibniz’s monads.
You’d expect natural selection to favor organisms that use energy efficiently. But even individual biomolecular devices like the pumps and motors in our cells should, in some important way, learn from the past to anticipate the future. To acquire their remarkable efficiency, Still said, these devices must “implicitly construct concise representations of the world they have encountered so far, enabling them to anticipate what’s to come.”
It’s not possible to do any justice to the nature of the fundamental, living, yet non-material substance that Leibniz called monads, but I can, at the very least, point to a few things about them. Monads exist as varying states of perception (though not necessarily conscious perceptions). And perceptions in this sense can be thought of as representations or expressions of the world or, perhaps, as information. He describes a hierarchy of functionality among them. One’s mind, for example, has clearer perceptions (or representations) than those contained in the monads that make up other parts of the body. But, being a more dominant monad, one’s mind contains ‘the reasons’ for what happens in the rest of the body. And here’s the idea that came to mind in the context of this article. An individual organ contains ‘the reasons’ for what happens in its cells, and a cell contains ‘the reasons’ for what happens in its organelles. The cell has its own perceptions or representations. I don’t have a way to precisely define ‘the reasons,’ but like the information-driven states of nonequilibrium being considered by physicists, biologists, and mathematicians, this view of things spreads life out.
I could go on for quite some time about the difference between dreaming and being awake. I could see myself picking carefully through every thought I have ever had about the significance of dreams, and I know I would end up with a proliferation of questions, rather than a clarification of anything. But I think that this is how it should be. We understand so little about our awareness, our consciousness, and what cognitive processes actually produce for us. All of this comes to mind, at the moment, because I just read an article in Plus Magazine based on a press conference given by Andrew Wiles at the Heidelberg Laureate Forum in September 2016. Wiles, of course, is famous for having proved Fermat’s Last Theorem in 1995. The article highlights some of Wiles’ thoughts about what it’s like to do mathematics, and what it feels like when he’s doing it. When asked about whether he could feel when a mathematical investigation was headed in the right direction, or when things were beginning to ‘harmonize,’ he said
Yes, absolutely. When you get it, it’s like the difference between dreaming and being awake.
If I had the opportunity, I would ask him to explain this a bit because the relationship between dream sensations and waking sensations has always been interesting to me. The relationship between language and brain imagery, for example, is intriguing. I once had a dream in which someone very close to me looked transparent. I could see through him, and I actually said those words in the dream. I would learn, in due time, that this person was not entirely who he appeared to be. But what Wiles seems to be addressing is the clarity and the certainty of being awake, of opening one’s eyes, in contrast to the sometimes enigmatic narrative of a dream. This is what it feels like when you begin to find an idea.
Wiles also had a refreshingly simple response to a question about whether mathematics is invented or discovered:
To tell you the truth, I don’t think I know a mathematician who doesn’t think that it’s discovered. So we’re all on one side, I think. In some sense perhaps the proofs are created because they’re more fallible and there are many options, but certainly in terms of the actual things we find we just think of it as discovered.
I’m not sure if the next question in the article was meant as a challenge to what Wiles believes about mathematical discovery, but it seems posed to suggest that the belief held by mathematicians that they are discovering things is a necessary illusion, something they need to believe in order to do the work they’re doing. And to this possibility Wiles says,
I wouldn’t like to say it’s modesty but somehow you find this thing and suddenly you see the beauty of this landscape and you just feel it’s been there all along. You don’t feel it wasn’t there before you saw it, it’s like your eyes are opened and you see it. (emphasis added)
And this is the key I think, “it’s like your eyes are opened and you see it.” Cognitive neuroscientists involved in understanding vision have described the physical things we see as ‘inventions’ of the visual brain. This is because what we see is pieced together from the visual attributes of objects we perceive (shape, color, movement, etc.), attributes processed by particular cells, together with what looks like the computation of probabilities based on previous visual experience. I believe that questions about how the brain organizes sensation, and questions about what it is that the mathematician explores, are undoubtedly related. Trying to describe the sensation of ‘looking’ in mathematics (as opposed to the formal reasoning that is finally written down) Wiles says this:
…it’s extremely creative. We’re coming up with some completely unexpected patterns, either in our reasoning or in the results. Yes, to communicate it to others we have to make it very formal and very logical. But we don’t create it that way, we don’t think that way. We’re not automatons. We have developed a kind of feel for how it should fit together and we’re trying to feel, “Well, this is important, I haven’t used this, I want to try and think of some new way of interpreting this so that I can put it into the equation,” and so on.
I think it’s important to note that Wiles is telling us that the research mathematician will come up with some completely unexpected patterns in either their reasoning or their results. The unexpected patterns in the results are what everyone gets to see. But that one would find unexpected patterns in one’s reasoning is particularly interesting. And clearly the reasoning and the results are intimately tied.
Like the sound that is produced from the numbers associated with the marks on a page of music, there is the perceived layer of mathematics about which mathematicians are passionate. And this is the thing about which it is very difficult to speak. Yet the power of what this perceived layer is may only be hinted at by the proliferation of applications of mathematical ideas in every area of our lives.
Best wishes for the New Year!
Roger Antonsen came to my attention with a TED talk recorded in 2015 that was posted in November. Characterized by the statement, “Math is the hidden secret to understanding the world,” it piqued my curiosity. Antonsen is an associate professor in the Department of Informatics at the University of Oslo. Informatics has been defined as the science of information and computer information systems but its broad reach appears to be related to the proliferation of ideas in computer science, physics, and biology that spring from information-based theories. The American Medical Informatics Association (AMIA) describes the science of informatics as:
…inherently interdisciplinary, drawing on (and contributing to) a large number of other component fields, including computer science, decision science, information science, management science, cognitive science, and organizational theory.
Antonsen describes himself as a logician, mathematician, and computer scientist, with research interests in proof theory, complexity theory, automata, combinatorics and the philosophy of mathematics. His talk, however, was focused on how mathematics reflects the essence of understanding, where mathematics is defined as the science of patterns, and the essence of understanding is defined as the ability to change one’s perspective. In this context, pattern is taken to be connected structure or observed regularity. But Antonsen highlights the important fact that mathematics assigns a language to these patterns. In mathematics, patterns are captured in a symbolic language and equivalences show us the relationship between two points of view. Equalities, Antonsen explains, show us ‘different perspectives’ on the same thing.
In his exploration of the many ways to represent concepts, Antonsen’s talk brought questions to mind that I think are important and intriguing. For example, what is our relationship to these patterns, some of which are ubiquitous? How is it that mathematics finds them? What causes them to emerge in the purely abstract, introspective world of the mathematician? Using numbers, graphics, codes, and animated computer graphics, he demonstrated, for example, the many representations of 4/3. And after one of those demonstrations, he received unexpected applause. The animation showed two circles with equal radii, and a point rotating clockwise on each of the circles but at different rates – one moved exactly 4/3 times as fast as the other. The circles lined up along a diagonal. The rotating point on each circumference was then connected to a line whose endpoint was another dot. The movement of this third dot looked like it was just dancing around until we were shown that it was tracing a pleasing pattern. The audience was clearly pleased with this visual surprise. (FYI, this particular demonstration happens about eight and a half minutes in.) Antonsen didn’t expect the applause, and added quickly that what he had shown was not new, that it was known. He explains in his footnotes:
This is called a Lissajous curve and can be created in many different ways, for example with a harmonograph.
These curves emerge from periodicity, like the curves for sine and cosine functions and their related unit circle expressions. In fact it is the difference in the periods described by the rotation of the point around each of Antonsen’s circles (a period being one full rotation) that produced the curve that so pleased his audience.
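For readers who want to see how a 4-to-3 ratio of rotation rates produces such a shape, here is a minimal sketch of one standard way to generate a Lissajous curve: take the horizontal position from one rotating point and the vertical position from the other. This is not necessarily the construction Antonsen animated; the function name and the choice of cosine and sine are my own assumptions for illustration.

```python
import math

def lissajous_points(freq_x=4, freq_y=3, samples=1000):
    """Sample a Lissajous curve over one full period.

    The x coordinate follows a point rotating freq_x times per period,
    the y coordinate a point rotating freq_y times per period.
    """
    points = []
    for i in range(samples + 1):
        t = 2 * math.pi * i / samples
        x = math.cos(freq_x * t)
        y = math.sin(freq_y * t)
        points.append((x, y))
    return points

points = lissajous_points()
first, last = points[0], points[-1]
print("start:", first)
print("end:  ", last)
```

Because the two rates are whole numbers, the curve closes on itself after one period, so the start and end points coincide (up to rounding). Feeding the points to any plotting tool shows the woven pattern that the ratio 4/3 produces.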
I believe that Antonsen wanted to make the point that because mathematics brings a language to all of the patterns that emerge from sensation, and because it is driven by the directive that there is always value in finding new points of view, mathematics is a kind of beacon for understanding everything. About this I wholeheartedly agree. I have always found comfort in the hopefulness associated with finding another point of view, and the powerful presence of this drive in mathematics may be the root of what captivates me about it. Mathematics makes very clear that there is no limit to the possibilities for creative and careful thought.
But I also think that the way Antonsen’s audience enjoyed a very mathematical thing deserves some comment. They didn’t see the mathematics, but they saw one of the things that the mathematics is about – a shape, a pattern, that emerges from relationship. And their impulse was to applaud. This tells us something about what we are not accomplishing in most of our math classes.
When I read the subheading in a recent Scientific American article, it brought me back to some 18th century thoughts which I recently reviewed. The subheading of a piece by Clara Moskowitz that describes a new effort in theoretical physics reads:
Hundreds of researchers in a collaborative project called “It from Qubit” say space and time may spring up from the quantum entanglement of tiny bits of information
This sounds like our physical world emerged from interactions among things that are not physical – namely tiny bits of information. And it reminded me of the logical and physical constraints that led Gottfried Wilhelm Leibniz to his view that the fundamental substance of the universe is more like a mathematical point than a tiny particle. Leibniz’s analysis of the physical world rested, not on measurement, but on mathematical thought. He rejected the widely accepted belief that all matter was an arrangement of indivisible, fundamental materials, like atoms. Atoms would be hard, Leibniz argued, and so collisions between atoms would be abrupt, resulting in discontinuous changes in nature. The absence of abrupt changes in nature indicated to him that all matter, regardless of how small, possessed some elasticity. Since elasticity required parts, Leibniz concluded that all material objects must be compounds, amalgams of some sort. Then the ultimate constituents of the world, in order to be simple and indivisible, must be without extension or without dimension, like a mathematical point. For Leibniz, the universe of extended matter is actually a consequence of these simple non-material substances.
This is not exactly the direction being taken by the physicists in Moskowitz’s article, but there is something that these views, separated by centuries, share. And while Moskowitz doesn’t do a lot to clarify the nature of quantum information, I believe the article addresses important shifts in the strategies of theoretical physicists.
The notion that spacetime has bits or is “made up” of anything is a departure from the traditional picture according to general relativity. According to the new view, spacetime, rather than being fundamental, might “emerge” via the interactions of such bits. What, exactly, are these bits made of and what kind of information do they contain? Scientists do not know. Yet intriguingly, “what matters are the relationships” between the bits more than the bits themselves, says IfQ collaborator Brian Swingle, a postdoc at Stanford University. “These collective relationships are the source of the richness. Here the crucial thing is not the constituents but the way they organize together.”
In discussions of his own work on Constructor Theory, David Deutsch often corrects the somewhat self-centered view, born of our experience with words and ideas, that information is not physical. In a piece I wrote about Deutsch’s work, the nature of information is underscored.
Information is “instantiated in radically different physical objects that obey different laws of physics.” In other words, information becomes represented by an instance, or an occurrence, like the attribute of a creature determined by the information in its DNA…Constructor theory is meant to get at what Deutsch calls this “substrate independence of information,” which necessarily involves a more fundamental level of physics than particles, waves and space-time. And he suspects that this ‘more fundamental level’ may be shared by all physical systems.
This move toward information-based physical theories will likely break some of our habits of thought and unveil the prejudices in our perspectives that have developed over the course of our scientific successes. New understanding requires some struggle with the very way that we think and organize our world. And wrestling with the nature of information, what it is and what it does, has the potential to be very useful in clearing new paths.
Because the project involves both the science of quantum computers and the study of spacetime and general relativity, it brings together two groups of researchers who do not usually tend to collaborate: quantum information scientists on one hand and high-energy physicists and string theorists on the other. “It marries together two traditionally different fields: how information is stored in quantum things and how information is stored in space and time,” says Vijay Balasubramanian, a physicist at the University of Pennsylvania who is an IfQ principal investigator.
In his 2008 Provisional Manifesto Giulio Tononi finds experience to be the mathematical shape taken on by integrated information. He proposes a way to characterize experience using a geometry that describes informational relationships. One could say he proposes, essentially, a model for describing conscious experience. But Tononi himself blurs this distinction between the model and the reality when he writes that these shapes are:
…often morphing smoothly into another shape as new informational relationships are specified through its mechanisms entering new states. Of course, we cannot dream of visualizing such shapes as qualia diagrams (we have a hard time with shapes generated by three elements). And yet, from a different perspective, we see and hear such shapes all the time, from the inside, as it were, since such shapes are actually the stuff our dreams are made of— indeed the stuff all experience is made of.
These things don’t make common sense, and there is some resistance to all of them. But it is that ‘common sense’ that contains all of our thinking and perceiving habits, all of our prejudices. Neuroscientist Christof Koch is a proponent of Tononi’s theory of consciousness which implies that there is some level of consciousness in everything. And here’s an example of the resistance from John Horgan’s blog Cross Check:
That brings me to arguably the most significant development of the last two decades of research on the mind-body problem: Koch, who in 1994 resisted the old Chalmers information conjecture, has embraced integrated information theory and its corollary, panpsychism. Koch has suggested that even a proton might possess a smidgeon of proto-consciousness. I equate the promotion of panpsychism by Koch, Tononi, Chalmers and other prominent mind-theorists to the promotion of multiverse theories by leading physicists. These are signs of desperation, not progress.
I couldn’t disagree more.
An article published in May in Quanta Magazine had the following remark as its lead:
A surprising new proof is helping to connect the mathematics of infinity to the physical world.
My first thought was that the mathematics of infinity is already connected to the physical world. But Natalie Wolchover’s opening few paragraphs were inviting:
With a surprising new proof, two young mathematicians have found a bridge across the finite-infinite divide, helping at the same time to map this strange boundary.
The boundary does not pass between some huge finite number and the next, infinitely large one. Rather, it separates two kinds of mathematical statements: “finitistic” ones, which can be proved without invoking the concept of infinity, and “infinitistic” ones, which rest on the assumption — not evident in nature — that infinite objects exist.
Mapping and understanding this division is “at the heart of mathematical logic,” said Theodore Slaman, a professor of mathematics at the University of California, Berkeley. This endeavor leads directly to questions of mathematical objectivity, the meaning of infinity and the relationship between mathematics and physical reality.
It is becoming increasingly clear to me that harmonizing the finite and the infinite has been an almost ever-present human enterprise, at least as old as the earliest mythical descriptions of the worlds we expected to find beyond the boundaries of the day-to-day, worlds that were below us or above us, but not confined, not finite. I have always been provoked by the fact that mathematics found greater precision with the use of the notion of infinity, particularly in the more concept-driven mathematics of the 19th century, in real analysis and complex analysis. Understanding infinities within these conceptual systems cleared productive paths in the imagination. These systems of thought are at the root of modern physical theories. Infinite dimensional spaces extend geometry and allow topology. And finding the infinite perimeters of fractals certainly provides some reconciliation of the infinite and the finite, with the added benefit of ushering in new science.
Within mathematics, the questionable divide between the infinite and the finite seems to be most significant to mathematical logic. Wolchover’s article addresses work related to Ramsey theory, a mathematical study of order in combinatorial mathematics, a branch of mathematics concerned with countable, discrete structures. It is the relationship of a Ramsey theorem to a system of logic whose starting assumptions may or may not include infinity that sets the stage for its bridging potential. While the theorem in question is a statement about infinite objects, it has been found to be reducible to the finite, being equivalent in strength to a system of logic that does not rely on infinity.
Wolchover published another piece about disputes among mathematicians about the nature of infinity that was reproduced in Scientific American in December 2013. The dispute reported on here has to do with a choice between two systems of axioms.
According to the researchers, choosing between the candidates boils down to a question about the purpose of logical axioms and the nature of mathematics itself. Are axioms supposed to be the grains of truth that yield the most pristine mathematical universe? … Or is the point to find the most fruitful seeds of mathematical discovery…
Grains of truth or seeds of discovery, this is a fairly interesting and, I would add, unexpected choice for mathematics to have to make. The dispute in its entirety says something intriguing about us, not just about mathematics. The complexity of the questions surrounding the value and integrity of infinity, together with the history of infinite notions, is well worth exploring, and I hope to do more.
Random is often the word chosen to describe something that has no order. But randomness has become an increasingly useful tool in mathematics, a discipline whose meaningfulness relies, primarily, on order.
In statistics, randomness, as a measure of uncertainty, makes possible the identification of events, whether sociopolitical or physical, with the use of probability distributions. We use random sampling to create tools that reduce our uncertainty about whether something has actually happened or not. In information theory, entropy quantifies uncertainty, and makes the analysis of information, in the broadest sense, possible. In algorithmic information theory randomness helps us quantify complexity. Randomness characterizes the emergence of certain kinds of fractals found in nature and even the action of organisms. It has been used to explore neural networks, both natural and artificial. Researchers, for example, have explored the use of a chaotic system in a machine that might then have properties important to brain-like learning, adaptability and flexibility. Gregory Chaitin’s metabiology, outlined in his book Proving Darwin: Making Biology Mathematical, investigates the random evolution of artificial software that might provide insight into the random evolution of natural software (DNA).
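To give a sense of what it means for entropy to quantify uncertainty, here is a small sketch of Shannon's entropy formula applied to a few coin-like distributions. The function name and the example probabilities are my own choices for illustration.

```python
import math

def shannon_entropy_bits(probabilities):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

fair_coin = shannon_entropy_bits([0.5, 0.5])      # maximal uncertainty for two outcomes
biased_coin = shannon_entropy_bits([0.9, 0.1])    # mostly predictable
certain = shannon_entropy_bits([1.0, 0.0])        # no uncertainty at all

print(f"fair coin:   {fair_coin:.3f} bits")
print(f"biased coin: {biased_coin:.3f} bits")
print(f"certain:     {certain:.3f} bits")
```

The more evenly the probability is spread, the higher the entropy; a fair coin carries a full bit of uncertainty per toss, and a coin that always lands the same way carries none.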
Quanta Magazine recently published a piece with the title, A Unified Theory of Randomness, in which Kevin Hartnett describes the work of MIT professor of mathematics Scott Sheffield, who investigates the properties of shapes that are created by random processes. These are shapes that occur naturally in the world but, until now, appeared to have only their randomness in common.
Yet in work over the past few years, Sheffield and his frequent collaborator, Jason Miller, a professor at the University of Cambridge, have shown that these random shapes can be categorized into various classes, that these classes have distinct properties of their own, and that some kinds of random objects have surprisingly clear connections with other kinds of random objects. Their work forms the beginning of a unified theory of geometric randomness.
“You take the most natural objects — trees, paths, surfaces — and you show they’re all related to each other,” Sheffield said. “And once you have these relationships, you can prove all sorts of new theorems you couldn’t prove before.”
In the coming months, Sheffield and Miller will publish the final part of a three-paper series that for the first time provides a comprehensive view of random two-dimensional surfaces — an achievement not unlike the Euclidean mapping of the plane.
The article is fairly thorough in making the meaning of these advances accessible. But with limited time and space, I’ll just highlight a few things:
In this ‘random geometry,’ if the location of some of the points of a randomly generated object are known, probabilities are assigned to subsequent points. As it turns out, certain probability measures arise in many different contexts. This contributes to the identification of classes and properties, critical to growth in mathematics.
We can all imagine random motion, or random paths, but here the random surface is explored. As Hartnett tells us,
Brownian motion is the “scaling limit” of random walks — if you consider a random walk where each step size is very small, and the amount of time between steps is also very small, these random paths look more and more like Brownian motion. It’s the shape that almost all random walks converge to over time.
Two-dimensional random spaces, in contrast, first preoccupied physicists as they tried to understand the structure of the universe.
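The scaling-limit idea in Hartnett's description can be tried out numerically. The sketch below is only an illustration under my own assumptions: it builds random walks over one unit of time with n steps of size 1/√n (the usual rescaling) and checks that the spread of their endpoints is close to 1, as it would be for standard Brownian motion at time 1. The function name, seed, and sample sizes are hypothetical choices.

```python
import random
import statistics

def scaled_random_walk(n_steps, rng):
    """Random walk over one unit of time: n_steps steps, each of size 1/sqrt(n_steps)."""
    step = 1.0 / n_steps ** 0.5
    position = 0.0
    path = [position]
    for _ in range(n_steps):
        position += step if rng.random() < 0.5 else -step
        path.append(position)
    return path

rng = random.Random(0)  # fixed seed so the run is repeatable
endpoints = [scaled_random_walk(400, rng)[-1] for _ in range(2000)]
endpoint_variance = statistics.pvariance(endpoints)
print(f"variance of endpoints: {endpoint_variance:.3f}")
```

Making the steps smaller and more frequent in this coordinated way leaves the overall spread unchanged while the path itself grows ever more finely jagged, which is the sense in which the walks "look more and more like Brownian motion."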
Sheffield was interested in finding a Brownian motion for surfaces. And two ideas that already existed would help lead him. Physicists have a way of describing a random surface, whose surface area could be determined (related to quantum gravity). There is also something called a Brownian map, whose structure allows the calculation of distance between points. But the two could not be shown to be related. If there were a way to measure distance on the former structure, it could be compared to distances measured on the latter. Sheffield and Miller’s hunch was that these two surfaces were different perspectives on the same object. To overcome the difficulty of distance measurement on the former, they used growth over time as a distance metric.
…as Sheffield and Miller were soon to learn, “[random growth] becomes easier to understand on a random surface than on a smooth surface,” said Sheffield. The randomness in the growth model speaks, in a sense, the same language as the randomness on the surface on which the growth model proceeds. “You add a crazy growth model on a crazy surface, but somehow in some ways it actually makes your life better,” he said.
But they needed another trick to model growth on very random surfaces in order to establish a distance structure equivalent to the one on the (very random) Brownian map. They found it in a curve.
Sheffield and Miller’s clever trick is based on a special type of random one-dimensional curve that is similar to the random walk except that it never crosses itself. Physicists had encountered these kinds of curves for a long time in situations where, for instance, they were studying the boundary between clusters of particles with positive and negative spin (the boundary line between the clusters of particles is a one-dimensional path that never crosses itself and takes shape randomly). They knew these kinds of random, noncrossing paths occurred in nature, just as Robert Brown had observed that random crossing paths occurred in nature, but they didn’t know how to think about them in any kind of precise way. In 1999 Oded Schramm, who at the time was at Microsoft Research in Redmond, Washington, introduced the SLE curve (for Schramm-Loewner evolution) as the canonical noncrossing random curve.
Popular opinion often finds fault in attempts to quantify everything, as if quantification necessarily diminishes things. What strikes me today is that quantification is more the means to finding structure. But it is the integrity of those structures that consistently unearths surprises. The work described here is a beautiful blend of ideas that bring new depth to the value of geometric perspectives.
“It’s like you’re in a mountain with three different caves. One has iron, one has gold, one has copper — suddenly you find a way to link all three of these caves together,” said Sheffield. “Now you have all these different elements you can build things with and can combine them to produce all sorts of things you couldn’t build before.”
First, I would like to apologize for posting so infrequently these past few months. I have been working hard to flesh out a book proposal closely related to the perspective of this blog, and I will be focused on this project for a bit longer.
However, a TED talk filmed in Paris in May came to my attention today. The talk was given by Blaise Agüera y Arcas who works on machine learning at Google. It was centered on illustrating the intimate connection between perception and creativity. Agüera y Arcas surveyed the history of neuroscience a bit, as well as the birth of machine learning, and the significance of neural networks. This was the message that caught my attention.
In this captivating demo, he shows how neural nets trained to recognize images can be run in reverse, to generate them.
There must be a significant insight here. The images produced when neural nets are run in reverse are very interesting. They are full of unexpected yet familiar abstractions. One of the things I found particularly interesting, however, was how Agüera y Arcas described the reversal of the recognition process. He first drew us a picture of a neural network involved in recognizing or naming an image, specifically, a first layer of neurons (pixels in an image or neurons in the retina) that feed forward to subsequent layers, connected by synapses with varying strengths, that govern the computations that end in the identification or the word for the image. He then suggested representing those things – the input pixels, the synapses, and the final identification – with three variables x, w, and y respectively. He reminded us, there could be a million x values, billions or trillions of w values and a small number of y values. But put in relationship, they resemble an equation with one unknown (namely the y) – the name of the object to be found. If x and y are known, finding w is a learning process:
So this process of learning, of solving for w, if we were doing this with the simple equation in which we think about these as numbers, we know exactly how to do that: 6 = 2 x w, well, we divide by two and we’re done. The problem is with this operator. So, division — we’ve used division because it’s the inverse to multiplication, but as I’ve just said, the multiplication is a bit of a lie here. This is a very, very complicated, very non-linear operation; it has no inverse. So we have to figure out a way to solve the equation without a division operator. And the way to do that is fairly straightforward. You just say, let’s play a little algebra trick, and move the six over to the right-hand side of the equation. Now, we’re still using multiplication. And that zero — let’s think about it as an error. In other words, if we’ve solved for w the right way, then the error will be zero. And if we haven’t gotten it quite right, the error will be greater than zero.
So now we can just take guesses to minimize the error, and that’s the sort of thing computers are very good at. So you’ve taken an initial guess: what if w = 0? Well, then the error is 6. What if w = 1? The error is 4. And then the computer can sort of play Marco Polo, and drive down the error close to zero. As it does that, it’s getting successive approximations to w. Typically, it never quite gets there, but after about a dozen steps, we’re up to w = 2.999, which is close enough. And this is the learning process.
…It’s exactly the same way that we do our own learning. We have many, many images as babies and we get told, “This is a bird; this is not a bird.” And over time, through iteration, we solve for w, we solve for those neural connections.
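The guess-and-correct procedure described in the talk can be sketched in a few lines. The talk does not say which method the computer uses to drive the error down, so the gradient descent below, along with the learning rate, the step count, and the function names, is my own assumption. It starts from the guess w = 0 for the equation 6 = 2 x w and repeatedly nudges w in the direction that shrinks the error.

```python
def error(w, x=2.0, y=6.0):
    """Absolute mismatch between the prediction x * w and the target y."""
    return abs(x * w - y)

def solve_for_w(x=2.0, y=6.0, w=0.0, learning_rate=0.1, steps=12):
    """Gradient descent on the squared error (x * w - y) ** 2, starting from w."""
    history = [w]
    for _ in range(steps):
        gradient = 2 * x * (x * w - y)  # derivative of the squared error w.r.t. w
        w -= learning_rate * gradient
        history.append(w)
    return w, history

if __name__ == "__main__":
    w, history = solve_for_w()
    for step, guess in enumerate(history):
        print(step, guess, error(guess))
```

The first two errors match the ones in the quotation (6 at w = 0, 4 at w = 1 if that guess is tried), and the successive guesses close in on the solution of 6 = 2 x w without ever dividing.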
The interesting thing happens when you solve for x.
And about a year ago, Alex Mordvintsev, on our team, decided to experiment with what happens if we try solving for x, given a known w and a known y. In other words, you know that it’s a bird, and you already have your neural network that you’ve trained on birds, but what is the picture of a bird? It turns out that by using exactly the same error-minimization procedure, one can do that with the network trained to recognize birds, and the result turns out to be … a picture of birds. So this is a picture of birds generated entirely by a neural network that was trained to recognize birds, just by solving for x rather than solving for y, and doing that iteratively.
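Solving for x instead of w uses the same error-minimization loop, with the weights held fixed and the input adjusted. The sketch below is a toy stand-in, not the network used in the talk: a single sigmoid neuron plays the role of the trained recognizer, and the weights, the target score, and the function names are all hypothetical.

```python
import math

def network(x, w):
    """Toy 'recognizer': a single sigmoid neuron that scores the input x."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

def solve_for_x(w, y, steps=500, learning_rate=0.5):
    """Hold w and y fixed; adjust the input x to shrink (network(x, w) - y) ** 2."""
    x = [0.0] * len(w)  # start from a blank input
    for _ in range(steps):
        out = network(x, w)
        # Chain rule: d/dx_i (out - y)^2 = 2 * (out - y) * out * (1 - out) * w_i
        common = 2.0 * (out - y) * out * (1.0 - out)
        x = [xi - learning_rate * common * wi for xi, wi in zip(x, w)]
    return x

if __name__ == "__main__":
    weights = [1.0, -2.0, 0.5]  # assumed "already trained" weights
    x = solve_for_x(weights, y=0.9)
    print(x, network(x, weights))
```

Many different inputs produce the same score, so the x that comes out depends on the starting point, which is one way to think about why the generated images are so varied.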
All of the images displayed are really interesting. And there are multilayered references to mathematics here: the design of neural networks, conceptualizing and illustrating what it means to ‘run in reverse,’ and even the form the abstractions take in the images produced (which are shown in the talk). Many of the images are Escher-like. It’s definitely worth a look.
In the end, Agüera y Arcas makes the point that computing, fundamentally, has always involved modeling our minds in some way. And the extraordinary progress that has been made in computing power and machine intelligence “gives us both the ability to understand our own minds better and to extend them.” In this effort we get a fairly specific view of what seems to be one of the elements of creativity. This will continue to highlight the significance of mathematics in our ongoing quest to understand ourselves.
In 2011 John Horgan posted a piece on his blog, Cross Check (part of the Scientific American blog network), with the title, Why Information can’t be the basis of reality. There Horgan makes the observation that the “everything-is-information meme violates common sense.” As of last December (at least) he hadn’t changed his mind. He referred back to the information piece in a subsequent post that was, essentially, a critique of Giulio Tononi’s Integrated Information Theory of Consciousness, which had been the focus of a workshop that Horgan attended at New York University last November. In that December post Horgan quotes himself making the following argument:
The concept of information makes no sense in the absence of something to be informed—that is, a conscious observer capable of choice, or free will (sorry, I can’t help it, free will is an obsession). If all the humans in the world vanished tomorrow, all the information would vanish, too. Lacking minds to surprise and change, books and televisions and computers would be as dumb as stumps and stones. This fact may seem crushingly obvious, but it seems to be overlooked by many information enthusiasts. The idea that mind is as fundamental as matter—which Wheeler’s “participatory universe” notion implies–also flies in the face of everyday experience. Matter can clearly exist without mind, but where do we see mind existing without matter? Shoot a man through the heart, and his mind vanishes while his matter persists.
What is being overlooked here, however, are the subtleties in a growing and consistently shifting perspective on information itself. More precisely, what is being overlooked is what information enthusiasts understand information to be and how it can be seen acting in the world around us. Information is no longer defined only through the lens of human-centered learning. But it is promising, as I see it, that information, as it is currently understood, includes human-centered learning and perception. The slow and steady movement toward a reappraisal of what we mean by information inevitably begins with Claude Shannon, who in 1948 published A Mathematical Theory of Communication in the Bell System Technical Journal. Shannon saw that transmitted messages could be encoded with just two bursts of voltage – an on burst and an off burst, or 0 and 1 – which immediately improved the integrity of transmissions. But, of even greater significance, this binary code made possible the mathematical framework that could measure the information in a message. This measure is known as Shannon’s entropy, as it mirrors the definition of entropy in statistical mechanics, which is a statistical measure of thermodynamic entropy. Alok Jha does a nice job of describing the significance of Shannon’s work in a piece he wrote for The Guardian.
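Shannon's measure is simple to compute for a short message. The sketch below is a minimal illustration, assuming the symbol probabilities are estimated from the frequencies in the message itself; the function name is my own. It returns the average information per symbol in bits.

```python
from collections import Counter
from math import log2

def shannon_entropy(message):
    """Average information per symbol, in bits: H = -sum(p * log2(p))."""
    if not message:
        raise ValueError("message must contain at least one symbol")
    counts = Counter(message)
    total = len(message)
    return -sum((c / total) * log2(c / total) for c in counts.values())

if __name__ == "__main__":
    for text in ("0000", "0101", "abcd"):
        print(text, shannon_entropy(text))
```

A message that never varies carries no information per symbol, an even mix of two symbols carries one bit per symbol, and four equally likely symbols carry two.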
In a Physics Today article physicists Eric Lutz and Sergio Ciliberto begin a discussion of a quirk in the second law of thermodynamics (known as Maxwell’s demon) in this way:
Almost 25 years ago, Rolf Landauer argued in the pages of this magazine that information is physical (see PHYSICS TODAY, May 1991, page 23). It is stored in physical systems such as books and memory sticks, transmitted by physical means – for instance, via electrical or optical signals – and processed in physical devices. Therefore, he concluded, it must obey the laws of physics, in particular the laws of thermodynamics.
But Maxwell’s demon messes with the second law of thermodynamics. It’s the product of a thought experiment involving a hypothetical, intelligent creature imagined by physicist James Clerk Maxwell in 1867. The creature introduces the possibility that the Second Law of Thermodynamics could be violated because of what it ‘knows.’ Lisa Zyga describes Maxwell’s thought experiment nicely in a phys.org piece that reports on related findings:
In the original thought experiment, a demon stands between two boxes of gas particles. At first, the average energy (or speed) of gas molecules in each box is the same. But the demon can open a tiny door in the wall between the boxes, measure the energy of each gas particle that floats toward the door, and only allow high-energy particles to pass through one way and low-energy particles to pass through the other way. Over time, one box gains a higher average energy than the other, which creates a pressure difference. The resulting pushing force can then be used to do work. It appears as if the demon has extracted work from the system, even though the system was initially in equilibrium at a single temperature, in violation of the second law of thermodynamics.
I’m guessing that Horgan would find this consideration foolish. But Maxwell didn’t.
And I would like to suggest that this is because a physical law is not something that is expected to hold true only from our perspective. Rather, it should be impossible to violate a physical law. But it has now become possible to test Maxwell’s concern in the lab. And recent experiments shed light, not only on the law, but also on how one can understand the nature of information. While all of the articles or papers referenced in this post are concerned with Maxwell’s demon, what they inevitably address is a more precise and deeper understanding of the nature and physicality of what we call information.
On 30 December 2015, Physical Review Letters published a paper that presents an experimental realization of “an autonomous Maxwell’s demon.” Theoretical physicist Sebastian Deffner, who wrote a companion piece for that paper, gives a nice history of the problem.
Maxwell’s demon was an instant source of fascination and led to many important results, including the development of a thermodynamic theory of information. But a particularly important insight came in the 1960s from the IBM researcher Rolf Landauer. He realized that the extra work that can be extracted from the demon’s action has a cost that has to be “paid” outside the gas-plus-demon system. Specifically, if the demon’s memory is finite, it will eventually overflow because of the accumulated information that has to be collected about each particle’s speed. At this point, the demon’s memory has to be erased for the demon to continue operating—an action that requires work. This work is exactly equal to the work that can be extracted by the demon’s sorting of hot and cold particles. Properly accounting for this work recovers the validity of the second law. In essence, Landauer’s principle means that “information is physical.” But it doesn’t remove all metaphysical entities nor does it provide a recipe for building a demon. For instance, it is fair to ask: Who or what erases the demon’s memory? Do we need to consider an über-demon acting on the demon?
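The cost of erasure that Deffner describes is usually stated as a lower bound: erasing one bit at temperature T dissipates at least k_B T ln 2 of energy. That formula is standard physics but does not appear in the quoted passage, so the sketch below should be read as my own illustration of the scale involved, with a hypothetical function name.

```python
from math import log

BOLTZMANN_CONSTANT = 1.380649e-23  # joules per kelvin (exact in the 2019 SI)

def landauer_limit(temperature_kelvin, bits=1):
    """Minimum energy in joules dissipated when erasing `bits` bits."""
    return bits * BOLTZMANN_CONSTANT * temperature_kelvin * log(2)

if __name__ == "__main__":
    # Roughly room temperature; the result is a few zeptojoules per bit.
    print(landauer_limit(300.0))
    # Erasing a gigabyte (8e9 bits) at the same temperature.
    print(landauer_limit(300.0, bits=8e9))
```

The numbers are tiny compared with what real memory devices dissipate, which is part of why the bound went untested in the lab for so long.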
About eighty years ago, physicist Leo Szilard proposed that it was possible to replace the human-like intelligence that Maxwell had described with autonomous, possibly mechanical, systems that would act like the demon but fully obey the laws of physics. A team of physicists in Finland led by Jukka Pekola did that last year.
According to Deffner,
The researchers showed that the demon’s actions make the system’s temperature drop and the demon’s temperature rise, in agreement with the predictions of a simple theoretical model. The temperature change is determined by the so-called mutual information between the system and demon. This quantity characterizes the degree of correlation between the system and demon; or, in simple terms, how much the demon “knows” about the system.
We now have an experimental system that fully agrees with our simple intuition—namely that information can be used to extract more work than seemingly permitted by the original formulations of the second law. This doesn’t mean that the second law is breakable, but rather that physicists need to find a way to carefully formulate it to describe specific situations. In the case of Maxwell’s demon, for example, some of the entropy production has to be identified with the information gained by the demon. The Aalto University team’s experiment also opens a new avenue of research by explicitly showing that autonomous demons can exist and are not just theoretical exercises.
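The mutual information that Deffner mentions can be computed directly from a table of joint probabilities. The sketch below is a generic textbook calculation, not the model used by the Aalto team; the two-state system, the two-state demon, and the probability tables are hypothetical examples.

```python
from math import log2

def mutual_information(joint):
    """I(S;D) in bits, from a joint probability table joint[s][d]."""
    p_s = [sum(row) for row in joint]        # marginal over system states
    p_d = [sum(col) for col in zip(*joint)]  # marginal over demon states
    total = 0.0
    for s, row in enumerate(joint):
        for d, p in enumerate(row):
            if p > 0:
                total += p * log2(p / (p_s[s] * p_d[d]))
    return total

if __name__ == "__main__":
    perfect = [[0.5, 0.0], [0.0, 0.5]]      # demon's record always matches the system
    ignorant = [[0.25, 0.25], [0.25, 0.25]]  # demon's record is unrelated to the system
    print(mutual_information(perfect), mutual_information(ignorant))
```

A demon whose record always matches the system "knows" one full bit about it, while a demon whose record is unrelated to the system knows nothing.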
Earlier results (April 2015) were published by Takahiro Sagawa and colleagues. They created a realization of what they called an information heat engine – their version of the demon.
Due to the advancements in the theories of statistical physics and information science, it is now understood that the demon is indeed consistent with the second law if we take into account the role of information in thermodynamics. Moreover, it has been recognized that the demon plays the key role to construct a unified theory of information and thermodynamics. From the modern point of view, the demon is regarded as a feedback controller that can use the obtained information as a resource of the work or the free energy. Such an engine controlled by the demon can be called an information heat engine.
From Lisa Zyga again:
Now in a new paper, physicists have reported what they believe is the first photonic implementation of Maxwell’s demon, by showing that measurements made on two light beams can be used to create an energy imbalance between the beams, from which work can be extracted. One of the interesting things about this experiment is that the extracted work can then be used to charge a battery, providing direct evidence of the “demon’s” activity.
Physicist Mihai D. Vidrighin and colleagues carried out the experiment at the University of Oxford. Published results are found in a recent issue of Physical Review Letters.
All of these efforts are illustrations of the details needed to demonstrate that information can act on a system, that information needs to be understood in physical terms, and that this refreshed view of information inevitably addresses our view of ourselves.
These things and more were brought to bear in a lecture given by Max Tegmark in November 2013 (which took place before some of the results cited here) with the title Thermodynamics, Information and Consciousness in a Quantum Multiverse. The talk was very encouraging. One of his first slides said:
I think that consciousness is the way information feels when being processed in certain complex ways.
“What’s the fuss about entropy?” he asks. And the answer is that entropy is one of the things crucial to a useful interpretation of quantum mechanical facts. This lecture is fairly broad in scope. From Shannon’s entropy to decoherence, Tegmark explores meaning mathematically. ‘Observation’ is redefined. An observer can be a human observer or a particle of light (like the demons designed in the experiments thus far described). Clearly Maxwell had some intuition about the significance of information and observation when he first described his demon.
Tegmark’s lecture makes the nuances of meaning in physics and mathematics clear. And this is what is overlooked in Horgan’s criticism of the information-is-everything meme. And Tegmark is clearly invested in understanding all of nature, as he says – the stuff we’re looking at, the stuff we’re not looking at, and our own mind. The role that mathematics is playing in the definition of information is certainly mediating this unity. And Tegmark rightly argues that only if we rid ourselves of the duality that separates the mind from everything else can we find a deeper understanding of quantum mechanics, the emergence of the classical world, or even what measurement actually is.
I just listened to a talk given by Virginia Chaitin that can be found on academia.edu. The title of the talk is A philosophical perspective on a metatheory of biological evolution. In it she outlines Gregory Chaitin’s work on metabiology, which has been the subject of some of my previous posts – here, here, and here. But, since the emphasis of the talk is on the philosophical implications of the theory, I became particularly aware of what metabiology may be saying about mathematics. It is, after all, the mathematics that effects the paradigm shifts that bring about alternative philosophies.
Metabiology develops as a way to answer the question of whether or not one can prove mathematically that evolution, through random mutations and natural selection, is capable of producing the diversity of life forms that exist today. This, in itself, shines the light on mathematics. Proof is a mathematical idea. And in the Preface of Gregory Chaitin’s Proving Darwin he articulates the bigger idea:
The purpose of this book is to lay bare the deep inner mathematical structure of biology, to show life’s hidden mathematical core. (emphasis my own)
And so it would seem clear that metabiology is as much about mathematics as it is about biology, perhaps more so.
Almost every discussion of metabiology addresses some of its philosophical implications, but there are points made in this talk that speak more directly to mathematics’ role in metabiology’s paradigm shifts. For example, Chaitin (Virginia) begins by stressing that metabiology makes different use of mathematics. By this she means that mathematics is not being used to model evolution, but to explore it, crack it open. Results in mathematics suggest a view that removes some of the habitual thoughts associated with what we think we see. She explains that this is possible because metabiology takes advantage of aspects of mathematics that are not widely known or taught – like its logical irreducibility and quasi-experimental nature. She also explains that exploratory strategies can combine or interweave computable and uncomputable steps. And so mathematics here is not being used ‘instrumentally,’ but as a way to express the creativity of evolution by way of its own creative nature. These strategies are some of the consequences of Gödel’s and Turing’s insights.
The fact that metabiology relies so heavily on a post-Gödel and post-Turing understanding of mathematics and computability puts a spotlight on the depth and significance of these insights, and perhaps points to some yet to be discovered implications of incompleteness. I continue to find it particularly interesting that while both Gödel’s incompleteness theorems and Turing’s identification of the halting problem look like they are pointing to limitations within their respective disciplines, in metabiology they each clearly inspire new biological paradigms that could very well lead to new science. Metabiology affirms that our ideas concerning incompleteness and uncomputability provide insights into nature as well as mathematics and computation. And so these important results from Gödel and Turing describe, not the limitations of mathematics or computers, but the limitations of a perspective, the limitations of a mechanistic point of view. Proofs, Chaitin tells us, are used in metabiology to generate and express novelty. And this is what nature does.
The talk draws the necessary parallels between biology and metabiology and, for the sake of thoroughness, I’ll repeat them here:
- Biology deals in natural software (DNA and RNA) while metabiological software is a computer program.
- In biology, organisms result from processes involving DNA, RNA and the environment while in metabiology the organism is the software itself.
- In biology evolution increases the sophistication of biological lifeforms, while in metabiology evolution is the increase in the information content of algorithmic life forms.
- The challenge to an organism in nature is to survive and adapt, while the challenge in metabiology is to solve a mathematical problem that requires creative uncomputable steps.
- In metabiology evolution is defined by an increase in the information content of an algorithmic life form, and fitness is understood as the growth of conceptual complexity. These are mathematical ideas, and together they create a lens through which we can view evolution.
In a recent paper on conceptual complexity and algorithmic information theory, Gregory Chaitin defines conceptual complexity in this way:
In this essay we define the conceptual complexity of an object X to be the size in bits of the most compact program for calculating X, presupposing that we have picked as our complexity standard a particular fixed, maximally compact, concise universal programming language U. This is technically known as the algorithmic information content of the object X, denoted Hu(X) or simply H(X) since U is assumed fixed. In medieval terms, H(X) is the minimum number of yes/no decisions that God would have to make to create X.
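H(X) itself cannot be computed in general, but the flavor of the definition can be conveyed with an ordinary compressor. The sketch below rests on a loudly stated assumption: zlib is not Chaitin's universal language U, and the compressed size it reports is only a crude stand-in for an upper bound on the size of the most compact description. The function name and the sample data are my own.

```python
import os
import zlib

def compressed_size_bits(data: bytes) -> int:
    """Size in bits of a zlib-compressed copy of `data` (a rough proxy, not H(X))."""
    return 8 * len(zlib.compress(data, 9))

if __name__ == "__main__":
    regular = b"01" * 5000         # highly patterned: a short description exists
    irregular = os.urandom(10000)  # random bytes: almost certainly incompressible
    print(compressed_size_bits(regular), compressed_size_bits(irregular))
```

Both inputs are 10,000 bytes long, yet the patterned one shrinks to a small fraction of its size while the random one does not shrink at all, which is the intuition behind calling the first conceptually simple and the second complex.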
Biological creativity, here, becomes associated with mathematical creativity and is understood as the generation of novelty, which is further understood as the generation of new information content. Virginia Chaitin also tells us that metabiology proposes a hybrid theory that relies on computability (something that can be understood mechanically) and uncomputability (something that cannot).
It seems to me that the application of incompleteness, uncomputability, and undecidability, in any context, serves to prune the mechanistic habits that have grown over the centuries in the sciences, as well as the habits of logic that are thought to lead to true things.
The need to address these issues can be seen even in economics, as explored in a 2008 paper I happened upon by K. Vela Velupillai presented at an International Conference on Unconventional Computation. The abstract of the paper says this:
Economic theory, game theory and mathematical statistics have all increasingly become algorithmic sciences. Computable Economics, Algorithmic Game Theory [Noam Nisan, Tim Roughgarden, Éva Tardos, Vijay V. Vazirani (Eds.), Algorithmic Game Theory, Cambridge University Press, Cambridge, 2007] and Algorithmic Statistics [Péter Gács, John T. Tromp, Paul M.B. Vitányi, Algorithmic statistics, IEEE Transactions on Information Theory 47 (6) (2001) 2443–2463] are frontier research subjects. All of them, each in its own way, are underpinned by (classical) recursion theory – and its applied branches, say computational complexity theory or algorithmic information theory – and, occasionally, proof theory. These research paradigms have posed new mathematical and metamathematical questions and, inadvertently, undermined the traditional mathematical foundations of economic theory. A concise, but partial, pathway into these new frontiers is the subject matter of this paper.
Mathematical physicist John Baez writes about computability, uncomputability, logic, probability, and truth in a series of posts found here. They’re worth a look.
A recent issue of New Scientist featured an article by Kate Douglas with the provocative title Nature’s brain: A radical new view of evolution. The limits of our current understanding of evolution, and the alternative view discussed in the article, are summarized in this excerpt:
Any process built purely on random changes has a lot of potential changes to try. So how does natural selection come up with such good solutions to the problem of survival so quickly, given population sizes and the number of generations available?…It seems that, added together, evolution’s simple processes form an intricate learning machine that draws lessons from past successes to improve future performance.
Evolution, as it has been understood, relies on the ideas of variation, selection, and inheritance. But learning uses the past to anticipate the future. Random mutations, on the other hand, are selected by current circumstances. Yet the proposal is that natural selection somehow reuses successful variants from the past. This idea has been given the room to develop, in large part, with the increasing use and broadened development of iterative learning algorithms.
Leslie Valiant, a computational theorist at Harvard University, approached the possibility in his 2013 book, Probably Approximately Correct. There he equated evolution’s action to the learning algorithm known as Bayesian updating.
Richard Watson of the University of Southampton, UK has added a new observation and this is the subject of the New Scientist article. It is that genes do not work independently, they work in concert. They create networks of connections. And a network’s organization is a product of past evolution, since natural selection will reward gene associations that increase fitness. What Watson realized is that the making of connections among genes in evolution, forged in order to produce a fit phenotype, parallels the making of neural networks, or networks of associations built, in the human brain, for problem solving. Watson and his colleagues have been able to go as far as creating a learning model demonstrating that a gene network could make use of generalization when grappling with a problem under the pressure of natural selection.
I can’t help but think of Gregory Chaitin’s random walk through software space, his metabiology, where life is considered evolving software (Proving Darwin). Chiara Marletto’s application of David Deutsch’s constructor theory to biology also comes to mind. Chaitin’s idea is characterized by an algorithmic evolution, Marletto’s by digitally coded information that can act as a constructor, which has what she calls causal power and resiliency.
What I find striking about all of these ideas is the jumping around that mathematics seems to be doing – it’s here, there and everywhere. And, it should be pointed out that these efforts are not just the application of mathematics to a difficult problem. Rather, mathematics is providing a new conceptualization of the problem. It’s reframing the questions as well as the answers. For Chaitin, mathematical creativity is equated with biological creativity. For Deutsch, information is the only independent substrate of everything and, for Marletto, this information-based theory brings biology into fundamental physics.
A NY Times review of Valiant’s book, by Edward Frenkel, says this:
The importance of these algorithms in the modern world is common knowledge, of course. But in his insightful new book “Probably Approximately Correct,” the Harvard computer scientist Leslie Valiant goes much further: computation, he says, is and has always been “the dominating force on earth within all its life forms.” Nature speaks in algorithms.
…This is an ambitious proposal, sure to ignite controversy. But what I find so appealing about this discussion, and the book in general, is that Dr. Valiant fearlessly goes to the heart of the “BIG” questions.
That’s what’s going on here. Mathematics is providing the way to precisely explore conceptual analogies to get to the heart of big questions.
I’ll wrap this up with an excerpt from an article by Arturo Carsetti that appeared in the November 2014 issue of Cognitive Processing with the title Life, cognition and metabiology.
Chaitin (2013) is perfectly right to bring the phenomenon of evolution in its natural place which is a place characterized in a mathematical sense: Nature ‘‘speaks’’ by means of mathematical forms. Life is born from a compromise between creativity and meaning, on the one hand, and, on the other hand, is carried out along the ridges of a specific canalization process that develops in accordance with computational schemes…Hence, the emergence of those particular forms…that are instantiated, for example, by the Fibonacci numbers, by the fractal-like structures etc. that are ubiquitous in Nature. As observers we see these forms but they are at the same time inside us, they pave of themselves our very organs of cognition. (emphasis added)