Machine Learning Street Talk
February 16, 2026

What If Intelligence Didn't Evolve? It "Was There" From the Start! - Blaise Agüera y Arcas


By Machine Learning Street Talk

Quick Insight: Blaise Agüera y Arcas argues that life and intelligence are properties of "embodied computation," driven toward complexity by the fusion of simpler entities (symbiogenesis). This offers a new lens for AI, biology, and investing in self-organizing systems.

  • 💡 How can complex, self-replicating programs spontaneously appear from random code without any mutation?
  • 💡 What is the fundamental difference between "life" and "non-life" from a computational perspective?
  • 💡 How does the fusion of simple entities (symbiogenesis) drive the "arrow of time" towards increasing complexity and intelligence in evolution?

Top 3 Ideas

🏗️ Life From Noise

"After a few million interactions, magic happens, which is that you go from noise to programs. You start to see complex programs appear on these tapes."
  • Spontaneous Order: Random code strings in a minimal language (BrainFuck) spontaneously self-organize into complex, self-replicating programs. Life-like computation can arise from pure randomness.
  • Phase Shift: This transition from "Turing gas" (random bytes) to structured, functional programs is a "gelation" phase transition, like jello setting. Complexity isn't gradual but a sudden, dramatic change in system state.

🏗️ Life is Computation

"Life is literally embodied computation. It is computational. You cannot have life without having computation."
  • Von Neumann's Vision: John von Neumann anticipated DNA's self-replication, defining life as "embodied computation" – where the memory is physical atoms, like a 3D printer that can print another 3D printer. Life is inherently computational, not merely physical.
  • Function Over Form: Life's essence is its function, not material composition. An artificial kidney is "alive" in its functional capacity, highlighting that purpose defines living systems.

🏗️ Evolution by Fusion

"Symbioenesis is what gives you complexification that in turn is what gives evolution its arrow of time."
  • Beyond Mutation: Complex programs in the BFF experiment appear without mutation, driven by "symbiogenesis" – the fusion of smaller replicators into larger, more complex ones. This challenges the Darwinian view of mutation as the sole source of novelty.
  • Cooperative Growth: Symbiogenesis represents a fundamental change in evolution, where cooperatively interacting entities merge to form a new, more complex entity. This explains the arrow of time towards increasing complexity, as fusion adds new information.

Actionable Takeaways

  • 🌐 The Macro Shift: Evolution isn't solely random mutation; symbiogenesis, the fusion of cooperative entities, is a fundamental, overlooked engine of complexity and intelligence.
  • The Tactical Edge: Design AI systems and decentralized networks with explicit mechanisms for "symbiogenesis" – allowing modules or agents to cooperatively fuse, forming higher-order, self-improving structures.
  • 🎯 The Bottom Line: Recognizing life and intelligence as embodied computation, driven by fusion, offers a powerful new framework for building open-ended AI and understanding forces that drive complexity.


After a few million interactions, magic happens, which is that you go from noise to programs.

You start to see complex programs appear on these tapes.

This is the most exciting plot that I've made in the last few years, and it's the one that's on the cover of the book.

You can see that in the beginning, it's not very computational, and then a sudden transition takes place here.

It looks like a phase transition.

This is the book that I hear is making the rounds at Sakana, which I'm very happy to hear.

The big one on the right, What Is Intelligence?, is sort of the Lord of the Rings, and What Is Life?, on the left, is kind of the Hobbit.

So it's kind of the single, and it's also chapter one of What Is Intelligence?.

So it goes kind of inside the other one.

Mostly what I'll be talking about today is what's in these two books, but with quite a bit more detail, more mathematical detail since I think this is a really good audience for that.

And I'll also be connecting it a bit with some of the bigger themes of the ALife conference and community, and dare I say even movement.

In particular, I actually wanted to begin with this wonderful sort of open problems in artificial life summary paper which has a number of very illustrious co-authors, at least one of whom we heard from yesterday and more than one of whom are here at the conference.

This is the paper on 14 open problems in artificial life, from the year 2000.

How does life arise from the non-living?

How can the transition to life occur in an artificial chemistry or in silico environment, and why does it occur?

I'm sure many of you know this was the problem that bedeviled Darwin.

He made one of the most rich and explanatorily powerful theories ever in science in discovering how evolution works, but he was unable to explain how evolution got started.

He at some point in one of his letters said, you might as well talk about the origin of matter.

I think that the origin of matter and the origin of life might actually be one and the same thing and evolution might actually be the answer to that question, but it's an evolution that includes a term that Darwin did not account for in his original formulation.

In section B of these questions, determine what is inevitable in the open-ended evolution of life.

I'm hoping to speak a little bit about that too.

Create a formal framework for synthesizing dynamical hierarchies at all scales and develop a theory of information processing, information flow and information generation for evolving systems.

I won't be going into the information theory in any detail, but hopefully we'll set up the problem in a perhaps somewhat new way that I hope will help to do that.

And finally in section C, how is life related to mind, machines and culture?

If I have time, I will get into this as well and talk a bit about the emergence of intelligence and mind in an artificial living system and the influence of machines on the next major evolutionary transition of life.

So it was really cool to read this paper from 2000 and to see how much of the perspective that this community had already been exploring then feels right and consistent with a fresh look at these problems in 2025.

Let me just begin with this question of souls.

It used to be in the 19th century and earlier that we thought that life had some vital force or spirit that animated it and made it different from inanimate matter.

In the 19th century, when we began to figure out organic chemistry and became able to synthesize urea and so on, the idea took hold that we should really adopt a strictly materialist perspective, because there's nothing special or different about the matter in us versus the matter anywhere else in the universe.

And that's progress for sure, but when we embrace atoms and materialism fully, we're left with some questions: what differentiates life from non-life, and what can we even say about life?

There are at least some biologists who say that maybe it's not even meaningful to talk about any difference between life and non-life, but I don't think that's true, and I think the answer to the conundrum is to invoke function.

Function is the thing that life has that non-life doesn't have.

In other words, if we just to give you a little parable, if I were to come back from the future with this object and you ask me what it is, and I tell you it is an artificial kidney with a 100-year lifespan, you can implant it in a body and it'll work the way your kidneys do.

It'll filter urea from the blood and so on.

That's a really important piece of information, but it's not a material or a materialist piece of information.

It's not something that you could read off from the atoms. Those atoms could be tungsten filaments or carbon nanotubes, or some technology we don't understand now, or it could be organic, made out of cloned tissue; the point is that its working as a kidney doesn't depend on that matter.

There is a kind of separation of concerns between the matter and the function, and so there's some real sense in which the function is like a spirit, like something immaterial. It's not material, and yet it also relies, of course, on the physics of what's going on; you can't have the spirit without the matter, as it were.

So function is really important and function is something that a rock on a non-living planet somewhere doesn't have.

If you break a rock on a non-living planet you now have two rocks.

You don't have a broken rock.

If you break a kidney you now no longer have a working kidney.

That's the difference between something functional and something non-functional.

This idea of function was formalized by Alan Turing, who never intended the Turing machine to actually be built when he described it in 1936.

But there is one that was built by Mike Davey in 2010.

I don't need to review Turing machines with all of you, of course; you all know how they work. But I do want to review briefly von Neumann's update to Turing's thinking about computation, which he made a few years later.

This was published posthumously, after von Neumann died.

But the idea behind von Neumann's thinking is that he was trying to answer the same question that Schrödinger had asked in his What Is Life? book.

And in particular he was trying to ask the question: what if you have a robot that is swimming around in a pond, and the pond has lots of loose Legos around?

I don't know if there were Legos in 1950, but let's pretend there were Legos in 1950.

And the job of the robot is to assemble those Legos into a new robot like itself.

There's something a little bit mysterious about that.

It feels a little bit like pulling yourself up by your own bootstraps or like a paradox.

And so he asked, what does it take for something to be able to make something like itself, which seems hard, almost paradoxical?

And his conclusion was, well, you need to have instructions for how to make a me.

You need to have a tape with instructions for how to make a me, and you need to have a universal constructor that will follow the instructions on that tape in order to assemble the necessary parts.

And you also need to have a tape copier so that you can give your offspring a copy of that tape.

And by the way, the tape has to also include the instructions for making the universal constructor and the tape copier.

And if those things all hold, then you have life.

You have something that can build itself.
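To make the architecture concrete, here is a minimal sketch in Python of that three-part scheme: a universal constructor, a tape copier, and a description tape that also describes the constructor and copier. The names and structure are my own illustration, not von Neumann's formal construction.

from dataclasses import dataclass

@dataclass
class Machine:
    constructor: str   # the universal constructor A (follows the tape's instructions)
    copier: str        # the tape copier B
    tape: str          # description tape: instructions for building A and B

def construct(description: str) -> Machine:
    # The universal constructor assembles the machinery the tape describes,
    # but not the tape itself.
    constructor_spec, copier_spec = description.split("|")
    return Machine(constructor_spec, copier_spec, tape="")

def reproduce(parent: Machine) -> Machine:
    child = construct(parent.tape)   # A builds the offspring's machinery from the tape
    child.tape = parent.tape         # B hands the offspring a copy of the tape
    return child

# Usage: the offspring is a complete copy, machinery and tape alike.
phi = "universal constructor spec|tape copier spec"
parent = Machine("universal constructor spec", "tape copier spec", phi)
print(reproduce(parent) == parent)   # True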

And what's so profound about von Neumann's insight?

I mean, first of all, he predicted all of this before we knew the structure and function of DNA, before we understood what ribosomes were or had discovered DNA polymerase.

So he called it exactly right.

All of those things really do exist inside cells, and he figured this out from pure theory, never having set foot in a biology lab.

The profound insight is that he said, by the way, a universal constructor is a universal Turing machine.

Those are literally one and the same thing.

And by making that observation, what he discovered was that life is literally embodied computation.

It is computational.

You cannot have life without having computation.

So obviously not everything that is alive reproduces but everything that is alive has to be able to make itself.

It has to be able to do some combination of healing, growing, maintaining itself, reproducing.

All of that is autopoiesis.

All of that involves self-construction, and all of that necessarily involves a universal constructor.

Now what do I mean by embodied computation?

This is a really important distinction between von Neumann and Turing.

In Turing's machine, the symbols that the head writes are different from the head itself, the tape, and the table of rules that the head follows, whereas in von Neumann's scheme it's more like a 3D printer: the memory is atoms, not abstract symbols. In other words, you could think about a Turing machine as like this laptop, which can't extrude another laptop out the side, but a von Neumann replicator is like a combination of a laptop and a 3D printer that can print another laptop.

So its memory is actually atoms.

That's what I mean by embodied.

So I don't mean embodied in the ways that a lot of roboticists talk about embodied.

I mean that there is a closure between the medium in which the computation happens and the thing that is actually doing the computation.

That's the key.

So computation that is embodied in that sense and that is autopoietic is alive.

You can't reproduce non-trivially and evolvably without computation.

No computation, no life.

I do want to say a word briefly about what I mean by computation.

And in this I'm following the work of Susan Stepney, Dominic Horsman, Rob Wagner, and Viv Kendon.

This is from a nice paper they wrote in 2023 relating the evolution of a physical system and the computation that it does.

So on top you have logical gates, and on the bottom you have transistors in your computer.

This is important because there are no bits in a computer.

There are just voltages that go up and down.

In fact, even the voltages are an abstraction of something further, if we go further down.

But the point is that you have to coarse grain those voltages into bits and then you have to have a logical machine that talks about how those bits evolve.

What are the computational processes that those bits undergo? There is a mapping from the physical system to the logical system, and vice versa.

When we say something computes, what we mean is that it is possible to construct such a mapping, and that therefore, as the physical system evolves, that is equivalent to the logical system evolving.
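Schematically, and in my own notation rather than the authors' exact formalism: if p is a physical state, H the physical dynamics, L the logical dynamics, and R the coarse-graining (representation) map from physical states to logical ones, then "this physical system computes L" amounts to requiring, to within acceptable error,

R(H(p)) \approx L(R(p)) \quad \text{for the physical states } p \text{ of interest,}

that is, evolving physically and then abstracting agrees with abstracting and then evolving logically.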

So there are some caveats: you can have stochastic computation, in which there's a little bit of randomness injected, so it doesn't have to be fully deterministic.

Another really important caveat is that you don't want that description to be infinitely complex.

Otherwise, you could have the trivial case of saying that, you know, the water in the sea is a computer, and for a longer computation I just need to make my description longer and longer in order to match.

No, that doesn't work either.

You need a kind of Occam's-razor description for it to be valid.

This is a good definition of computation, but it emphasizes that there is something subjective about computation.

You need to have a model for how the physical system translates into the logical system in order for any of this stuff to work.

There are implications about entropy, free energy and heat and so on in this model.

And in particular, as you all know, Hector Zenil in his very elegant talk of a couple of days ago, and actually Chris Kempes as well, talked about the Landauer limit and the fact that in a computational system you're constantly reducing the entropy of your state space, and in doing so you therefore require free energy.

So you need to have free energy available, and you need to eject waste heat.
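For reference, the Landauer limit puts a floor on that cost: irreversibly erasing one bit at temperature T dissipates at least

E_{\text{erase}} \ge k_B T \ln 2 \approx 3 \times 10^{-21} \text{ J per bit at room temperature.}

So any computation that keeps discarding information has to pay for it in free energy and waste heat.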

The exception in a way only proves the rule which is reversible computation.

In reversible computation you generate ancillas, and that's equivalent to just saying there's no exhaust. But then you either have to keep making your computer bigger and bigger as you accumulate these ancillas, or you have to shrink what you consider to be the computer, and then you're back to non-reversible computation once again.

Three important fallacies that I want to point out before continuing.

One of them I will call the Sapolsky error.

Robert Sapolsky has written famously about people not having free will because we're built on physical systems.

The physics is, if you like, deterministic; let's set aside quantum mechanics and things like that.

Let's imagine we live in a Newtonian universe.

It's fine.

It's good enough.

The point is that physics is reversible.

All of the basic physics that we understand, whether that's Newton's equations, Maxwell's equations, Einstein's equations, or quantum mechanics, is essentially time-reversible, so you can run it either forward or back. Computation is not reversible: when I add 3 + 5 to get 8, once I've got the 8 and haven't kept my ancillas around, let's say, I no longer know what was added in order to make the 8. Computation is inherently irreversible, and so to say that what is true of the physical system is also true of the computational or logical system is not the case; reversibility is one trivial example of how that is not the case.

Causation by the way only makes sense in the light of irreversibility.

All right.

So if you have a purely physical system, then to say that A causes B is equivalent to saying that B causes A, because everything is kind of a block universe, if you like, in that kind of setup.

But in computation you can talk about causality because there are ifs and thens in there.

And this once again connects with the way Hector was talking about how essentially nothing in causation makes sense except in the light of computation, which I fully agree with.

Another fallacy we could call the early Wittgenstein error.

Consider a statement like "birds exist in the world" (line one of the Tractatus Logico-Philosophicus didn't say birds, but whatever).

You can't say birds exist or birds don't exist in a way that is independent of a model of the universe.

There are no birds in physics.

There are no birds in this underlying dynamical system.

When we start talking about birds, we are already talking about having some kind of model.

And once we start talking about models, you've got causality, irreversibility, and all kinds of other things in play.

And none of these statements are airtight.

They all rely on on an observer.

This is kind of Kant as well I guess.

And this leads to the early Leibniz error, or the same error that the good old-fashioned AI practitioners had, which is the idea that intelligence could be carried out just by having a series of programs performing strictly logical deductions or inductions.

That doesn't work.

This is why good old fashioned AI never panned out.

And the reason is that you can't start out, as in math, with propositions that are self-sufficient.

Even math is not self-sufficient, but let's pretend for a moment and just move from there and kind of do an algebra in order to work various things out.

When your propositions are not airtight, and when you're looking only at regularities and patterns, this good old-fashioned AI idea simply cannot work.

And that's why we never got it to work.

Let's move now to some of the artificial life experiments that I began playing with at the end of 2023, and that my team and I published in June of 2024.

So just about a year ago.

I think many of you have perhaps heard of these.

They're in the books, and I've talked about them a few times.

The basic setup here is to try to get self-replication, that is, abiogenesis, the emergence of life from non-life, to happen in a purely artificial life system.

Okay, so the setup is to begin with a minimal Turing-complete language. I used Brainfuck because I really liked the idea of being able to stand at a conference and say Brainfuck over and over, and I'm fundamentally 12 years old on the inside, but also because it very closely models the Turing machine.

It's a minimal programming language, only eight instructions, that looks very Turing-machine-like and moves a head back and forth.

I should say that in its original version Brainfuck is not embodied computation.

It has basically a separate data tape and code tape and that means that it cannot make a copy of itself.

So I made a couple of modifications to Brainfuck that actually reduce it from eight instructions to seven in order to make it embodied.

Meaning that as it works on the tape, it is able to read its own code and write its own code on that tape as well.

There's no separate console.

There's no separation between the data tape and the instruction tape.

For those of you who are unfamiliar with Brainfuck, here is hello world in it.

I'm sure you've already figured out how it works by just looking at the program.

I actually still haven't, I have to admit.

By the way, this is actually the French Brainfuck page, because I thought it was better, but translated into English.

It's funnier to read it that way.

These are the eight instructions:

  • Move the head one step to the left.
  • Move the head one step to the right.
  • Increment the byte at the head.
  • Decrement the byte at the head.

We're already halfway through.

There are an input and an output instruction, which in this case really just copy from one head to another.

And there are jump instructions, open bracket and close bracket, in order to be able to make loops.

And that's it.

That's all Brainfuck is.
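To make those eight instructions concrete, here is a minimal interpreter for classic Brainfuck in Python. Note this is the standard, non-embodied language, with a separate code string and data tape, not the modified seven-instruction variant described above, and it assumes the program's brackets are balanced.

def run_brainfuck(code: str, stdin: str = "", tape_len: int = 30_000, max_steps: int = 100_000) -> str:
    tape = bytearray(tape_len)    # data tape, separate from the code
    dp = 0                        # data pointer (the head)
    ip = 0                        # instruction pointer into the code string
    out = []
    inp = iter(stdin)

    # Pre-match brackets so '[' and ']' can jump in one step (assumes balanced brackets).
    jumps, stack = {}, []
    for i, c in enumerate(code):
        if c == '[':
            stack.append(i)
        elif c == ']':
            j = stack.pop()
            jumps[i], jumps[j] = j, i

    steps = 0
    while ip < len(code) and steps < max_steps:
        c = code[ip]
        if c == '>':
            dp = (dp + 1) % tape_len               # move the head one step right
        elif c == '<':
            dp = (dp - 1) % tape_len               # move the head one step left
        elif c == '+':
            tape[dp] = (tape[dp] + 1) % 256        # increment the byte at the head
        elif c == '-':
            tape[dp] = (tape[dp] - 1) % 256        # decrement the byte at the head
        elif c == '.':
            out.append(chr(tape[dp]))              # output the byte at the head
        elif c == ',':
            tape[dp] = ord(next(inp, '\0')) % 256  # input a byte to the head
        elif c == '[' and tape[dp] == 0:
            ip = jumps[ip]                         # jump forward past the matching ']'
        elif c == ']' and tape[dp] != 0:
            ip = jumps[ip]                         # jump back to the matching '['
        # any other character is a no-op
        ip += 1
        steps += 1
    return ''.join(out)

# Usage: sixty-five '+' instructions then '.', which prints "A" (ASCII 65).
print(run_brainfuck("+" * 65 + "."))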

So how does the alife experiment work?

The alife experiment is called BFF.

The first two letters stand for Brainfuck, and for the second F you can draw your own conclusions. But you start off with a soup of tapes; I generally use just 1,024 of them.

That's enough for this experiment.

So the tapes are of fixed length.

They're of length 64 and they begin random.

So just random bytes.

Now, if a tape is random bytes, that means that only one in 32 of them or so are even valid instructions.

Most of them are no-ops.

A no-op will just be skipped over, like in most programming languages.

So this is what those tapes look like in the beginning.

And you can see that I'm not printing the no-ops, so that's all the blank space.

The operations are quite sparse.

On any given tape, you only have an average of two instructions or so.

And then the procedure is to pluck two of these tapes out of the soup at random, concatenate them end to end so you have 128 bytes, run them, and then after running, pull them back apart, put them back in the soup, and repeat.

That's it.

So it's just that over and over.

That's the entire experiment.
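Here is a minimal sketch of that soup dynamic in Python: 1,024 random 64-byte tapes; pick two at random, concatenate them, execute the result as code acting on its own tape, split them, and return them to the soup. The executor below is a toy single-tape approximation with two heads and my own choice of opcodes; it is not the actual seven-instruction embodied Brainfuck variant or implementation used in BFF.

import random

SOUP_SIZE, TAPE_LEN, MAX_STEPS = 1024, 64, 8192
OPS = b"<>{}+-.,[]"   # bytes outside this set are treated as no-ops

def run_embodied(tape: bytearray, max_steps: int = MAX_STEPS) -> int:
    """Execute `tape` as code acting on itself; return the number of ops executed."""
    n = len(tape)
    ip = h0 = h1 = 0
    steps = 0
    while ip < n and steps < max_steps:
        c = tape[ip]
        if c in OPS:
            steps += 1
        if c == ord('>'):   h0 = (h0 + 1) % n          # move head 0 right
        elif c == ord('<'): h0 = (h0 - 1) % n          # move head 0 left
        elif c == ord('}'): h1 = (h1 + 1) % n          # move head 1 right
        elif c == ord('{'): h1 = (h1 - 1) % n          # move head 1 left
        elif c == ord('+'): tape[h0] = (tape[h0] + 1) % 256
        elif c == ord('-'): tape[h0] = (tape[h0] - 1) % 256
        elif c == ord('.'): tape[h1] = tape[h0]        # "output": copy head 0 -> head 1
        elif c == ord(','): tape[h0] = tape[h1]        # "input":  copy head 1 -> head 0
        elif c == ord('[') and tape[h0] == 0:          # scan forward past matching ']'
            depth, j = 1, ip
            while depth and j < n - 1:
                j += 1
                depth += (tape[j] == ord('[')) - (tape[j] == ord(']'))
            ip = j
        elif c == ord(']') and tape[h0] != 0:          # scan back to matching '['
            depth, j = 1, ip
            while depth and j > 0:
                j -= 1
                depth += (tape[j] == ord(']')) - (tape[j] == ord('['))
            ip = j
        ip += 1
    return steps

def bff_step(soup: list) -> int:
    """One interaction: join two random tapes, run, split, put them back."""
    i, j = random.sample(range(len(soup)), 2)
    joined = bytearray(soup[i] + soup[j])              # 128-byte working tape
    ops = run_embodied(joined)
    soup[i], soup[j] = joined[:TAPE_LEN], joined[TAPE_LEN:]
    return ops

soup = [bytearray(random.randbytes(TAPE_LEN)) for _ in range(SOUP_SIZE)]
for t in range(100_000):          # the real runs go to millions of interactions
    ops = bff_step(soup)          # ops per interaction is the quantity plotted later in the talk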

So I'll show you what happens on my laptop after a few million interactions.

Magic happens, which is that you go from noise to programs.

You start to see complex programs appear on these tapes.

And this is quite wonderful, because these programs take real effort to reverse engineer when you study them.

It's like studying that hello world program.

They're functional in the sense that they really do something, and it's not trivial to figure out how they work in order to do that.

Okay what are they doing?

Well, they're definitely copying themselves or each other somehow.

We know that because this is a histogram, and you can see that in this case there were 8,000 tapes: there are 5,000 copies of the top one, 297 of the next one, and so on.

So there's clearly copying going on and there's this ecology of programs all copying each other, which is just wonderful to see.
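One way to see that ecology of copiers, using the soup from the sketch above: count how many identical copies of each tape are present. A few tapes dominating the counts is the signature of replication.

from collections import Counter

def tape_histogram(soup, top_k=5):
    counts = Counter(bytes(t) for t in soup)   # bytes() makes the tapes hashable
    return counts.most_common(top_k)           # e.g. [(tape_a, 5000), (tape_b, 297), ...]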

I mean, that's the emergence of life, in this very functional, minimal sense, from randomness.

A part of this is very easy to understand.

Why do these things emerge?

Well, because something that copies itself will be around forever and something that doesn't copy itself will be copied over by something that can copy itself.

So, inherently something that can copy itself is more stable than something that cannot copy itself.

So, it's really just the second law of thermodynamics, but doing something unexpected, which is creating something more complex because it's more stable, rather than something less complex which is less stable.

This idea that stability doesn't necessarily mean low complexity was worked out in some detail by Addy Pross, the organic chemist, in another book called What Is Life?.

He calls it dynamic kinetic stability.

Meaning usually we think of stability only in terms of fixed points in a phase space, but a cycle can be even more stable than a fixed point.

Of course, for these cycles to work you need an input of free energy, for reasons that we've already gone into.

Okay.

So, mystery mostly solved, but actually not fully solved, for reasons that I will show in a second.

But just to give you a sense of what this transition looks like from non-life to life.

It's very dramatic.

In the beginning, these interactions involve only a few instructions, because there are only a few instructions in the soup.

It's a Turing gas, as Walter Fontana would have called it.

When you do the join and you run, only about two operations execute in any given interaction on average, as you'd expect.

And that's what it looks like by the end in this particular run, and 1,374 operations on average are running per interaction.

So the soup has become intensely computational.

There's been a transition here.

And there's a lot more code than one in 32 bytes.

As you can see, this is what that looks like visually.

This is the most exciting plot that I've made in the last few years.

And it's the one that's on the cover of the book.

So what I've drawn here are 10 million dots.

It's a scatter plot of interactions.

The x-axis is time and the y-axis for every dot is how many computations took place.

How many operations took place in that interaction. And you can see that in the beginning it's not very computational.

And then a sudden transition takes place here, at 6 million interactions, and it becomes intensely computational.

It looks like a phase transition.

In fact, it is a phase transition.

You can also see that in the entropy of the soup.

So here I'm just estimating the entropy of the soup by zipping it and looking at the size of the zip relative to the whole thing.

You can use any compression algorithm you like.

In the beginning it's incompressible.

So it's a gas, in that Turing-gas sense, because all the bytes are random.

And you can see that there's a dramatic change and suddenly it becomes extremely compressible right at that transition moment.

And of course it becomes compressible because everything is copying itself and each other.

If things are copying themselves, then we know that they'll become very compressible.
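A rough version of that compressibility estimate, again using the soup from the sketch above (any standard compressor would do; zlib is just convenient): a ratio near 1.0 means the soup is still random and gas-like, and a sharp drop marks the copying transition.

import zlib

def compression_ratio(soup) -> float:
    blob = b"".join(bytes(t) for t in soup)
    return len(zlib.compress(blob, level=9)) / len(blob)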

But it's cool because if we think about what the phase of matter is on the left, it is just like a gas.

Nothing is correlated.

What would we call the phase of matter on the right?

It's not a liquid.

It's not a solid.

It has structure.

It has structure at every scale.

I think you have to call that phase of matter life.

It's a functional phase of matter.

It means that its parts are different from its other parts.

And if you zoom in or out, you see more structure.

So it's what David Wolpert would call self-dissimilar.

It's not a fractal.

It's more like a multifractal.

I'll explain why in a moment.

Okay.

How long does it take this transition to happen?

Well, the answer is it looks more or less like an Erlang distribution, or a little more precisely like a distribution I call a lockpick distribution, which imagines that there are steps that have to be undertaken, and that those steps have a long-tailed distribution of difficulty.

And how many steps does it take?

Well, the answer is 12.

It takes 12 steps just like getting sober.

This is a fit of the empirical data to the Erlang and the lockpick distributions; it's a little hard to see, but the lockpick fit is a bit better than the Erlang.

Erlang assumes Poisson steps, while lockpick assumes long-tailed ones, but it's a phase-type distribution in the same spirit. And what this tells you is that there are stepping stones here.
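For reference, the Erlang distribution is the waiting time for k successive steps, each exponentially distributed (Poisson-rate events); written schematically below, together with a generic way to express the lockpick idea as a sum of independent long-tailed step times. The notation is mine and is only meant to capture the contrast described here, not the exact form fitted in the talk.

f_{\mathrm{Erlang}}(t; k, \lambda) = \frac{\lambda^{k} t^{k-1} e^{-\lambda t}}{(k-1)!}, \qquad t \ge 0

T_{\mathrm{lockpick}} = \sum_{i=1}^{k} \tau_i, \qquad \tau_i \sim \text{(independent, long-tailed step-time distributions)}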

You can't get that transition to life immediately.

So something interesting must be going on here on the left other than just randomness.

It takes multiple things happening in order to get to that point.

In this case you know it happens somewhere between 1 million and let's say 7 million interactions.

Okay.

So this all suggests, by the way, that pretty much any universe that has a source of randomness and can support computation will evolve life, for this simple dynamical-stability reason.

But the big mystery is: why does it appear to get more complex over time?

You might have seen in my little video that some programs emerged and then sort of densified as more instructions appeared. And, even more fundamentally, why does this work even without mutation?

I didn't mention it, but in the original version of BFF I added some random mutation, because we're all taught in school that the way evolution works is chance and necessity.

You mutate things.

You're sort of throwing spaghetti at the wall and whatever sticks is what does better.

And so you need a source of spaghetti.

But if you do this entire experiment with the mutation rate cranked all the way down to zero, you still get the same exact phenomenon.

And that is very mysterious because if you crank mutation down to zero, you should have no source of novelty.

You should have no evolution.

Why do you still get this apparent complexification even with zero mutation?

So let's go into some of the theory of this.

By the end we have a replicating entity.

It can engage in standard sort of population evolution dynamics.

This is the kind of differential equation that one generally writes for this sort of thing.

It's a very general ansatz.

This is for species i; let's say there are n species.

They could be chemical species.

They could be biological species whatever.

Here's a classic example of such an ansatz.

These are the Lotka-Volterra equations for predator and prey, which I'm sure many of you are very familiar with.

They were co-invented, or invented independently, by Alfred Lotka and Vito Volterra near the beginning of the 20th century.

This is what the classic Lotka-Volterra equations look like.

There are two species.

There is a prey species and a predator species.

And those four terms are reproduction, getting eaten, eating to reproduce and background death rate.

So if you've got those four terms, you get these nice oscillatory solutions between your predators and your prey.
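For reference, a standard way to write those four terms, with x the prey population and y the predator population (conventional textbook symbols, not necessarily the slide's):

\frac{dx}{dt} = \alpha x - \beta x y \qquad \text{(prey: reproduction minus getting eaten)}

\frac{dy}{dt} = \delta x y - \gamma y \qquad \text{(predator: eating to reproduce minus background death)}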

Okay.

So this is a slightly more general form of those Lotka-Volterra equations.

There is a linear part, which we'll call Rx, and in Lotka-Volterra that linear part is diagonal.

So the wolf can't turn into a rabbit, and the rabbit can't turn into a wolf.

So the reproduction is diagonal.

And then there's also a bilinear term, which is the part where predation, competition, and the fact that niches are finite get implemented.

So the right part is suppressive.

The left part makes things grow.

The right part squishes things down.

Keeps them finite.
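One common way to write that more general form, with a linear growth term and a bilinear interaction term (the index notation is my own shorthand for the structure described here):

\frac{dx_i}{dt} = \sum_{j} R_{ij}\, x_j \;+\; \sum_{j,k} A_{ijk}\, x_j x_k

In Lotka-Volterra the matrix R is diagonal (each species only reproduces as itself), and the bilinear A term carries predation, competition, and the finiteness of niches.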

But this can't be the whole story of evolution.

Why can't it be the whole story of evolution?

Well, of course, because it's closed-ended.

You know, we only have two species here.

It doesn't matter how long you run this damn thing.

You're not going to get a third species, and you're not going to change the design space either.

You can have very complicated terms in here that allow finch beaks to adapt to different environments, but you have to have the space of finch beaks predefined before this equation can even be made to work.

So this doesn't answer the question of how evolution gets started.

It doesn't answer the question of what happens afterward, other than optimization to niches.

So now we bring in another Eastern European, Konstantin Sergeevich Mereschkowski.

He's the one who first came up with the idea that maybe mitochondria engaged in some kind of symbiogenetic event in order to end up inside other single-celled organisms to make eukaryotes.

This was popularized, and proven to actually be the case, by Lynn Margulis in 1968.
