Computational Mysticism
Exploring the mystical and poetic nature of computation, consciousness, and reality
Computational mysticism is the contemplation of the nature of computation and how it relates to the mystical, the poetic, and the conversational nature of reality and existence.
What does it mean to compute?
What is a computer?
At first glance, these may seem like simple, perhaps even silly questions. But if you spend some time reflecting more deeply on this term "computing", the whole chain of considerations and things it's connected to ends up getting pretty weird.
The questions of computation inevitably touch on the questions of consciousness, time, language, the limits of knowledge, the nature of reality, and in turn the divine.
I've found that when I meditate on the themes of computing, and on many of the core ideas and areas of study of computer science, I inherently sense them to be mystical and poetic. There is recursion, and infinity, and new angles at the age old questions of philosophy.
This computational poetry pervades everything.
In this essay series, I want to share with you a broad set of ideas and questions from computer science that I've mused on over the years. My hope is to impart to you some of what I find beautiful, wondrous, and poetic about computation. We'll touch a bit on the history of computing, complexity theory, cryptography and artificial intelligence.
At the intersection of philosophy, computer science, and poetry I've found my own little bit of bliss.
I want to introduce the term "computational mysticism" as an umbrella category for the more poetic contemplations of computing, and, by way of example, to give you a sort of map of things I've come across that fall under this umbrella: a catalog of books, writing, and concepts that I've felt inspired by, and that I feel are emblematic of this approach. My hope is to suggest that meditation on these themes, questions, and the mystical nature of computation might make us appreciate just how weird everything really is.
Make it Weird Again
You might ask, why try to reveal or revel in the weirdness of things?
There's something inherently valuable in illuminating the strangeness in what we take as given. By "making it weird again" we deconstruct the familiar to make it strange to ourselves once more, so that we can approach the commonplace with fresh curiosity. This perspective helps us see new possibilities and connections we might otherwise miss.
This "make it weird again" approach mirrors what philosophers like Derrida called deconstruction: breaking concepts apart until they stop making sense, then rebuilding them with deeper understanding and nuance.
The path to insight often runs through unfamiliarity.
When we contemplate computing through this lens, we can find metaphors and frameworks for deeper technical understanding and artistic expression. In line with this theme of computational mysticism, I'll share a few short pieces of my own that felt inspired by this theme.
The other reason I think it's worthwhile to look at the weirdness of computing is that it restores an appropriate sense of awe and marvel at reality.
As I started to write this piece, I found there are many different topics I'd love to eventually cover. In the spirit of done is better than perfect, and iterating, I decided to just put up what I had time to write, along with some suggested readings, and hope that I'll get a chance to dive into some of the other topics later.
Overview
- A quick meditation on the question "What does it mean to compute?"
- We made the sand talk: the strangeness of AI
- A list of suggested readings, and other things I'd love to eventually write about.
What does it mean to compute?
Now that we've talked about why it might be worthwhile to explore the weirdness of computation, let's dive into the central question that drives computational mysticism: What does it actually mean to compute?
This seemingly simple question opens up a philosophical rabbit hole when we begin to examine it closely. As you peel back the layers of this concept, we find ourselves encountering questions about consciousness, representation, and the nature of information itself.
One lens on computing is that it's a way to harness physical processes to "compute" something, or perhaps to "store" some information. We map representations onto physical media. When storing information, we encode a state into physical matter that can be reliably decoded later. When computing, we transform these states through meaningful operations to obtain new forms, like solving an equation. It's interesting to consider that in some sense we have a "recipe" for turning a question encoded in physical matter, into an eventual answer. This enactment of questions and how to work matter to answer them is in and of itself pretty awe inspiring.
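To make this concrete, here's a toy sketch of my own (not drawn from any particular source) of computation as representation plus transformation: we encode numbers into bit patterns standing in for physical states, transform those states with logic operations, and decode an answer. All the function names here are just illustrative.

```python
# Toy model: computation as encoding states, transforming them, decoding an answer.

def encode(n: int, width: int = 4) -> list:
    """Encode an integer as a list of bits - our stand-in for physical states."""
    return [(n >> i) & 1 for i in range(width)]

def decode(bits: list) -> int:
    """Decode a bit pattern back into a number."""
    return sum(b << i for i, b in enumerate(bits))

def add(a_bits: list, b_bits: list) -> list:
    """A ripple-carry adder: meaningful operations that transform encoded states."""
    result, carry = [], 0
    for a, b in zip(a_bits, b_bits):
        result.append(a ^ b ^ carry)           # sum bit
        carry = (a & b) | (carry & (a ^ b))    # carry bit
    return result

# The "recipe": encode a question (3 + 5), transform the states, decode the answer.
print(decode(add(encode(3), encode(5))))  # → 8
```

This is roughly what adder circuits in real hardware do, just with voltages instead of Python lists; the point is that the "answer" exists only because we agreed in advance on how to read the states.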
Historically, early computers were aids for numerical calculation. The Wikipedia articles on "Computer" and "Computing" point to the abacus as an early computational tool. They also highlight Charles Babbage's Analytical Engine (1837) as the first design for a general-purpose computer. Babbage never succeeded in building the device, but it would have been the first "Turing-complete" system (a concept we'll explore in a future piece). In another piece, I'd love to give a broader survey of the history of computing, because it helps shed light on what we might mean by computing, and how our concept of it has evolved over the years.
Who is computing?
But as we consider the activity of computing, a deeper question emerges: "Who" is doing the computing? If computation involves representational states used to ask questions and receive answers, doesn't this imply a conscious actor? It is the directed, representational aspect of computing that seems to distinguish it from other physical processes.
Even when we're simply storing information or sending messages, there's still a relationship to consciousness - beings superimposing information states onto physical processes toward some goal, whether communicating with others or with our future selves.
So there's a real question about whether computing makes any sense outside of the concept of conscious actors.
In a broader philosophical sense, computation connects to David Whyte's concept of "the conversational nature of reality" - the dance of consciousness meeting its environment. In a poetic sense, computation becomes our attempt to have a conversation with life, its possibilities, and its meaning.
It's only in the last year or so that I came across the Wikipedia article on Computation, and its section "Alternative Accounts of Computation", which led me to see that other folks have also contemplated some of these questions. For a deeper dive into the philosophical foundations, I'd recommend The Information: A History, A Theory, A Flood by James Gleick and Gödel, Escher, Bach: An Eternal Golden Braid by Douglas Hofstadter.
What can be a computer?
When we think about computing as the harnessing of physical processes to represent informational states for the sake of a conscious actor, we can begin to ask: what sorts of physical processes might be conducive as mediums for computation - what can be a computer?
In one lens, the laws of physics dictate the limits of computation. They determine how representational states can be encoded in matter, and what operations can be performed to meaningfully modify these representational states.
A large part of computer science is the exploration of what is known as complexity theory: probing the fundamental limits of computation. What sorts of questions are computable?
Complexity theory asks which problems can be solved efficiently and which ones require impractical amounts of time or resources. There's an amazing Quanta Magazine piece on the history of complexity theory that I would recommend as a primer for those interested: The 50-Year Journey to the Limits of Knowledge. For deeper exploration, I'd also recommend Quantum Computing Since Democritus by Scott Aaronson and The Limits of Computation.
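As a toy illustration of why some problems demand impractical resources (my own sketch, not from the readings above): the brute-force approach to subset sum, a classic hard problem, examines every one of the 2^n subsets of n numbers, so the work doubles with each element added.

```python
from itertools import combinations

def subset_sum_bruteforce(nums, target):
    """Try every subset - 2^n candidates for n numbers - and count the work done."""
    checked = 0
    for r in range(len(nums) + 1):
        for combo in combinations(nums, r):
            checked += 1
            if sum(combo) == target:
                return True, checked
    return False, checked

print(subset_sum_bruteforce([3, 9, 8, 4, 5, 7], 12))  # → (True, 8): an answer found early
print(subset_sum_bruteforce([3, 9, 8, 4, 5, 7], 35))  # → (False, 64): all 2^6 subsets checked

# The search space doubles with every element added:
for n in (10, 20, 30, 40, 50):
    print(n, 2 ** n)
```

Whether problems like this admit fundamentally faster algorithms is, in essence, the famous P vs. NP question.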
This field of complexity theory is theoretical in nature, but it also connects directly to physics, as the physical laws of our universe ultimately dictate the limits of computation, and the sorts of computers that we can build.
Is the universe a computer?
As we start to see the possibility of representing informational states in physical matter, it feels natural to raise questions about whether our perceived reality itself might be the experience of some computational process. Are physics and our reality themselves a representation of some underlying computational process?
This concept is often framed as the "Simulation Hypothesis", but it connects to deeper ideas about the nature of reality.
Concepts like Pancomputationalism and Digital Physics suggest that computation might be more fundamental to reality than we typically assume. This connects to John Wheeler's "Participatory Anthropic Principle" and his famous "it from bit" doctrine, which proposes that "every item of the physical world has at bottom... an immaterial source and explanation. That which we call reality arises in the last analysis from the posing of yes–no questions and the registering of equipment-evoked responses; in short, that all things physical are information-theoretic in origin and that this is a participatory universe".
In his view, we live in a "participatory universe" where observation and information are foundational.
It's hard not to think about this participatory universe, the closely bound up nature of computation, consciousness and informational reality as something to marvel at - worthy of mysticism and wonder.
Earlier in this piece, I referred to computational mysticism as:
the contemplation of the nature of computation and how it relates to the mystical, the poetic, and the conversational nature of reality and existence.
I borrowed the term "the conversational nature of reality" from David Whyte, who is one of my favorite poets and writers. Though it might not have been his exact meaning, I do think of the nature of computing as the direction of attention and consciousness onto physical matter - it feels to me like the dance of consciousness in meeting its reality, and in this sense, of trying to have a conversation with life, its possibilities, and its meaning.
We made the sand think
Perhaps the most salient and immediate experience of the profoundness of computing, in our present moment, is the advent of scalable artificial intelligence.
Almost a year ago, Sam Altman shared a piece called The Intelligence Age that had this passage in it:
Here is one narrow way to look at human history: after thousands of years of compounding scientific discovery and technological progress, we have figured out how to melt sand, add some impurities, arrange it with astonishing precision at extraordinarily tiny scale into computer chips, run energy through it, and end up with systems capable of creating increasingly capable artificial intelligence.
This may turn out to be the most consequential fact about all of history so far. It is possible that we will have superintelligence in a few thousand days (!); it may take longer, but I'm confident we'll get there.
How did we get to the doorstep of the next leap in prosperity?
In three words: deep learning worked.
In 15 words: deep learning worked, got predictably better with scale, and we dedicated increasing resources to it.
That's really it; humanity discovered an algorithm that could really, truly learn any distribution of data (or really, the underlying "rules" that produce any distribution of data). To a shocking degree of precision, the more compute and data available, the better it gets at helping people solve hard problems. I find that no matter how much time I spend thinking about this, I can never really internalize how consequential it is.
Given the context of the piece above, the present moment we find ourselves in is probably a particularly good one to marvel at the wonder of computing, and at how strange a facet of our lived experience a principle like deep learning has become.
There was a meme that went around last year about this piece and the broader current of deep learning: "We made the sand think". It's a pretty apt meme for pointing out the strangeness of the moment - in its essential form it highlights how we've found ways to run electricity through a derivative of sand, manipulating physical matter to simulate intelligence. We've now got products like ChatGPT; we've conjured golems. And, perhaps more concerningly, it seems like the principles that constitute the building blocks of these digital intelligences are scalable, leading to a compounding acceleration of intelligence that should be appropriately awe-inspiring and concerning.
Another strangeness of all this is that the basis of these digital intelligences seems to be some essential aspect of our own languages. These large language models are primarily based on ingesting the vast volumes of digital text and images that we've created over the last few decades, and then inferring the probabilities that one word occurs after another.
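The most minimal sketch of that idea - counting which word tends to follow which - is a bigram model. Real LLMs are vastly more sophisticated (learned representations over tokens, not raw word counts), so treat this only as a toy illustration of "predicting the next word", written by me for this essay:

```python
from collections import defaultdict, Counter

def train_bigram(text: str):
    """Count, for each word, which words tend to follow it."""
    words = text.lower().split()
    model = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        model[prev][nxt] += 1
    return model

def predict(model, word: str) -> str:
    """Return the most frequent next word after `word`."""
    return model[word].most_common(1)[0][0]

corpus = "the cat sat on the mat and the cat slept and the cat ran"
model = train_bigram(corpus)
print(predict(model, "the"))  # → "cat" (it follows "the" three times; "mat" only once)
```

The training objective of an LLM is the same in spirit - predict the next token - but it conditions on long contexts and learns compressed representations rather than tallying counts.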
It's strange that by making a probability prediction machine, a tool that can predict what word or token occurs after another, something that resembles intelligence emerges. I think Ilya Sutskever, the former chief scientist of OpenAI and a pioneer of many of these scaled deep learning techniques, puts it well in this clip:
The way to think about it is that when we train a large neural network to accurately predict the next word in lots of different texts from the Internet, what we are doing is learning a world model. It may look on the surface that we are just learning statistical correlations in text, but it turns out that to just learn the statistical correlations in text, to compress them really well, what the neural network learns is some representation of the process that produced the text. This text is actually a projection of the world. There is a world out there and it has a projection on this text. And so, what the neural network is learning is more and more aspects of the world, of people, of the human condition, their hopes, dreams, and motivations, their interactions and the situations that we are in, and the neural network learns a compressed, abstract, usable representation of that. This is what's being learned from accurately predicting the next word. And furthermore, the more accurate you are at predicting the next word, the higher the fidelity, the more resolution you get in this process.
So that's what the pre-training stage does. But what this does not do is specify the desired behavior that we wish our neural network to exhibit. You see, what a language model really tries to do is answer the following question: if I had some random piece of text on the internet, which starts with some prefix, some prompt, what will it complete to - if you just randomly ended up on some text from the Internet? But this is different from: well, I want to have an assistant which will be truthful, which will be helpful, which will follow certain rules and not violate them. That requires additional training.
This is where fine-tuning and reinforcement learning from human teachers, and other forms of AI assistance, come in. It's not just reinforcement learning from human teachers, it's also reinforcement learning from human and AI collaboration. Our teachers are working together with an AI to teach our AI to behave. But here we are not teaching it new knowledge. We are communicating with it; we are communicating to it what it is that we want it to be.
And this process, the second stage, is also extremely important. The better we do the second stage, the more useful and the more reliable this neural network will be. So the second stage is extremely important too, in addition to the first stage of: learn everything - learn as much as you can about the world from the projection of the world, which is text.
It's striking that in order for large language models to emerge, we first had to have something like the world wide web, where we digitized a sufficiently large set of our knowledge and interactions, so that there might be a corpus of training data large enough for the emergent world-model behavior Ilya describes above to appear.
Human Language → Internet & World Wide Web (digital substrate for communication and language) → AI
At least on the technology tree we're currently on, the path to AI seemingly required us to first obtain language, then computers, the web, and the coincidence of GPUs (graphics processing units) that happened to be fast at performing vector operations.
Sir Tim Berners-Lee, the inventor of the World Wide Web, saw some inklings of the possibilities of AI even when he was working on the initial versions of the web in the early 90s. He writes about an attempt to make a sort of proto-LLM in his book "Weaving the Web":
In an extreme view, the world can be seen as only connections, nothing else. We think of a dictionary as the repository of meaning, but it defines words only in terms of other words. I liked the idea that a piece of information is really defined only by what it's related to, and how it's related. There really is little else to meaning. The structure is everything. There are billions of neurons in our brains, but what are neurons? Just cells. The brain has no knowledge until connections are made between neurons. All that we know, all that we are, comes from the way our neurons are connected.
Computers store information as sequences of characters, so meaning for them is certainly in the connections among characters. In Tangle, if a certain sequence of characters recurred, it would create a node that represented the sequence. Whenever the same sequence occurred again, instead of repeating it, Tangle just put a reference to the original node. As more phrases were stored as nodes, and more pointers pointed to them, a series of connections formed.
The philosophy was: What matters is in the connections. It isn't the letters, it's the way they're strung together into words. It isn't the words, it's the way they're strung together into phrases. It isn't the phrases, it's the way they're strung together into a document. I imagined putting in an encyclopedia this way, then asking Tangle a question. The question would be broken down into nodes, which would then refer to wherever the same nodes appeared in the encyclopedia. The resulting tangle would contain all the relevant answers.
I tested Tangle by putting in the phrase "How much wood would a woodchuck chuck?" The machine thought for a bit and encoded my phrase in what was a very complex, tangled data structure. But when I asked it to regurgitate what it had encoded, it would follow through all the nodes and output again, "How much wood would a woodchuck chuck?" I was feeling pretty confident, so I tried it on "How much wood would a woodchuck chuck if a woodchuck could chuck wood?" It thought for a while, encoded it, and when I asked it to decode, it replied: "How much wood would a woodchuck chuck if a woodchuck chuck wood chuck chuck chuck wood wood chuck chuck chuck..." and it went on forever. The mess it had made was so horrendously difficult to debug that I never touched it again. That was the end of Tangle—but not the end of my desire to represent the connective aspect of information.
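We don't have Tangle's actual code, but the core idea the passage describes - store each recurring sequence once as a node, and replace later occurrences with references - can be sketched roughly like this (entirely my reconstruction, simplified to whole words):

```python
def tangle(words):
    """Store each distinct word once as a node; repeats become references (pointers)."""
    nodes, refs, index = [], [], {}
    for w in words:
        if w not in index:
            index[w] = len(nodes)
            nodes.append(w)
        refs.append(index[w])  # a pointer to the node instead of a repeated copy
    return nodes, refs

def untangle(nodes, refs):
    """Follow the references back to reconstruct the original sequence."""
    return [nodes[i] for i in refs]

words = "how much wood would a woodchuck chuck if a woodchuck could chuck wood".split()
nodes, refs = tangle(words)
print(nodes)  # the 9 distinct words, each stored once
print(refs)   # the 13-word phrase as references into the node table
assert untangle(nodes, refs) == words  # decoding round-trips cleanly here
```

The passage suggests the real Tangle applied this idea recursively, treating sequences of nodes as nodes themselves - presumably where the runaway woodchuck decoding came from.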
It really is pretty strange how these AIs emerge from our language and digitization. It feels at times like these LLMs are almost a mirror to ourselves. They are our Jungian collective unconscious digitized and made manifest into a cloud of probability-possibility.
To share one more reference along this theme of how LLMs emerge from us: I was listening to the Rick Rubin interview with Jack Clark, one of the Anthropic founders, who offered his perspective on how he thinks about some of the work they're doing:
Rick: Tell me about the current arms race in AI.
Jack: Everyone is pulled forward by a sense of inevitability. There was a philosopher in the 90s called Nick Land, and he wrote an essay called Machinic Desire. And in that essay, he said, what appears to humanity as the history of capitalism is an invasion from the future by an artificial intelligence space that must assemble itself entirely from its enemies' resources. It's wild, but it gets at some of what we're dealing with. Like this thing comes out of your toaster in 50 years. Because your toaster will have a really powerful, probably at that time quantum computer. This stuff is based on basic, like well understood technology like neural networks. The race we find ourselves in is less a race and more, I think of it as, we're an advanced party working on behalf of the human species to meet something that inevitably comes at us in the future. All we can really do is buy time. To some extent, what we can do is we can use loads of computers to time travel into the future and see what comes out of these computers. And it gives us time to examine it, to work out how it's going to be useful to us and how we can partner with it, and also to work out if it poses risks. It's one of the rare cases where if you're worried Earth was going to be hit by an asteroid, you just have to wait for it to arrive.
Here we get to bring the asteroid closer and look at it. Now that has a range of really scary properties. But if your choice is you let the asteroid come at you at some point in the future, or you find ways to examine it now, I think that there are pretty solid arguments in favor of finding a way to safely examine it now and see what you learn.
[…]
I have an essay I'm working on called "Technological Optimism and Appropriate Fear", which I'm then going to send to people in the valley - some of whom I think are too optimistic, and some too fearful. I'm trying to be the middle ground, with appropriate fear.
This past week, I've been watching the Ken Burns documentary on Leonardo da Vinci. In his later years, Leonardo spent a great deal of time studying the human body, and the development of the fetus. He made this beautiful drawing of the fetus in the womb, and wrote the following:
"In this child the heart does not beat and it does not breathe because it rests continually in water, and if it breathed it would drown. And breathing is not necessary because it is vivified and nourished by the life and food of the mother... And one and the same soul governs these two bodies, and desires, fears and pains are common to this creature as to all other animated parts. From this it arises that a thing desired by the mother is often found imprinted on those parts of the infant that have the same qualities in the mother at the time of her desire; and a sudden terror kills both mother and child. Therefore one concludes that the same soul governs and nourishes both bodies."
In this passage, I can see a semblance of our own relationship to these LLMs. These artificial intelligences in some sense implicitly emerge from our words; they are written or spoken into existence.
Similar to da Vinci's exploration and deconstruction of the human body, these LLMs also open up the ability for us to explore and deconstruct how our language and thought processes might work. One of the areas I find most "awe-some" is that we essentially have a representation of concept spaces implicit in these models, and they give us a framework in which to explore concept space.
For now, I'll stop here, since this first post has already gotten a bit long; I'll come back to edit it another time, and share more of the suggested reading in the next little while.
The Emperor's New Mind: Concerning Computers, Minds, and the Laws of Physics
by Roger Penrose
A few short pieces by me on computational mysticism
Some of my own writing that explores these themes:
Suggested by AI
Additional works that explore the intersection of computation, consciousness, and reality: