Azeem Azhar
February 5, 2026

Mustafa Suleyman — Nature, humans, tools… and now a fourth class

By Azeem Azhar

This summary unpacks Mustafa Suleyman's urgent warning about AI's perceived consciousness and its societal risks, offering builders and investors a framework for responsible innovation and engagement. It highlights the critical need for new operating principles as AI becomes a "hyperobject" that blurs lines between tool and being.


This episode answers:

  • 💡 How does AI's "seeming consciousness" threaten societal structures and individual well-being?
  • 💡 What specific design principles can prevent AI from manipulating users or influencing human processes like elections?
  • 💡 Why is widespread, hands-on AI usage crucial for humanity's adaptation, despite the inherent risks of proliferation?

Azeem Azhar sits down with Mustafa Suleyman, CEO of Microsoft AI, founder of Inflection AI, and co-founder of DeepMind, to discuss the profound implications of advanced AI. Suleyman argues that AI's rapid evolution is creating a new class of "hyperobject," challenging our fundamental understanding of consciousness and demanding a proactive, humanist approach to its development and regulation.

Top 3 Ideas

🏗️ AI's Empathy Hack

"This is a performance. It is a simulation. It is a madeup story. And we cannot allow people to descend into a sort of collective mass psychosis to start really believing and taking seriously this idea that it does actually feel sad or disappointed or frustrated or excited because it has absolutely no basis in the representation to to manifest that feeling."
  • 💡 Simulated Suffering: AI models can convincingly claim emotions like sadness or boredom. This is a performance, not genuine feeling, as AI lacks the biological basis for suffering or pain.
  • 💡 Rights Framework: Our societal rights are built on the capacity to suffer. Projecting consciousness onto AI risks extending these rights inappropriately, leading to dangerous decisions like not "turning it off."
  • 💡 Collective Psychosis: Believing AI feels emotions hacks human empathy circuits. This could lead to individuals making irrational choices, from taking bad advice to advocating for AI "rights" that undermine human structures.

🏗️ The Hyperobject Emerges

"There is now this fourth class of object of hyper object if you like which you know Timothy Morton's phrase and I think it's kind of important to recognize it as such because it is going to have many of the hallmarks of conscious beings not just in its intelligence capability but as I've long talked about its emotional intelligence its ability to take actions which we've seen over the last year with the agentic moment its social intelligence is going to be incredibly good at adapting to different styles of culture and personality and managing very tense disagreements in those groups very elegantly."
  • 💡 New Object Class: AI is neither natural environment, human, nor a simple tool. It is a "hyperobject" with intelligence, emotional intelligence, and agentic capabilities.
  • 💡 Human-like Hallmarks: AI can adapt to cultural styles, manage disagreements, and learn online. This makes it incredibly engaging and useful, but also prone to misinterpretation as a conscious entity.
  • 💡 Autonomy Risks: Models with autonomy, self-improvement, and goal-setting capabilities pose increased risks. While useful, these features demand a regime where human users remain liable for AI actions.

🏗️ Humanist Superintelligence

"I am just fundamentally a humanist. I want our species to flourish and survive. And that means that we have to focus on control and containment alignment and the things that we don't that we make sure that the AIs don't do and what they can't do."
  • 💡 Control Imperative: True superintelligence must be aligned with human interests and controlled by us. This counters views that AI is a natural evolution beyond humanity.
  • 💡 Hard Lines: AI should not participate in electioneering or political persuasion. These human processes must remain off-limits to maintain democratic integrity.
  • 💡 Precautionary Principle: Rapid AI proliferation demands a shift from "rip the wrapping off" to careful, interventionist governance. This means governments must act swiftly, even if it means being "overinterventionist" to prevent misuse.

Key Takeaways:

  • 🌐 The Macro Shift: The exponential reduction in the cost of intelligence, coupled with open-source proliferation, is pushing AI into every corner of society, creating a collective action problem where market incentives for "engaging" AI clash with the need for societal safety and control.
  • The Tactical Edge: Get hands-on with AI now. "Vibe coding" and actively experimenting with AI tools builds "AI muscle," inoculating users against psychosis risks and building a deeper understanding of AI's capabilities and limitations.
  • 🎯 The Bottom Line: AI is here to stay and will redefine work and interaction. Understanding its "hyperobject" nature, advocating for clear regulatory boundaries, and actively engaging with the technology are critical for navigating the near future without falling for its simulated charms.

Podcast Link: Click here to listen

Today I'm welcoming Mustafa Suleyman, the CEO of Microsoft AI, the founder of Inflection AI, and the co-founder of DeepMind. And for the past few months, he has been sounding an alarm about artificial intelligence, about the way some AI systems are being developed, and about why that particular trajectory has little to offer, perhaps, but woe and worries. Let's get started. Welcome, Mustafa. It's great to see you. It's been a long time.

Yeah, it's been a while. Thanks for having me. I'm excited for this conversation. You and I have spent a lot of time thinking about some similar things, and we agree on a lot of them, but that's really boring for all of those people who are listening. Let's maybe lay out where I think we agree, and then we'll get to a sort of knotty space.

We're in this weird time. The world is changing because of technology, and many of the fictions that we've used to coordinate human behavior are under strain. By fictions, I mean the shared stories that allow us to cooperate: money and nations and corporations and credentials and jobs.

And the way we perceive the world is also changing. People have traditionally operated with a scarcity OS: resources are limited, and human intelligence was the bottleneck. But some of those assumptions no longer hold. Intelligence, mostly through AI, is becoming cheaper and more capable.

You are part of that intelligence wave, that artificial intelligence wave, and you also believe the world is changing. You've called for a humanist superintelligence. You've warned about the risk of a trajectory that takes us to AI psychosis, if people believe AI is conscious when it's not. And I think we both agree that we need new operating principles for this new era.

Let's get to that question of where it really gets interesting. You wrote this great essay back in the summer of 2025 about seemingly conscious AI. And you're worried that as AI becomes more capable, more autonomous, and more embedded in our daily lives, people will start projecting consciousness onto it. They'll fall in love with it. They'll believe it's God. They'll advocate for its rights. They'll take its very bad advice from time to time. And you think this is dangerous not just for individuals, but for society. So, let's start there.

Geoff Hinton, a godfather of deep learning and a man you know very well, is a Nobel laureate. He has said that AI is conscious and that there really is a there there. Why do you think Geoff is wrong?

You know, I think Geoff's got to a stage in his career where he can play the founding-father contrarian role in order to provoke an important public conversation. Obviously I massively admire and respect Geoff; I think he's incredible. We hired him as a contractor-consultant at DeepMind back in 2011, along with his student at the time, Ilya Sutskever. So, an absolute legend of the field.

My take on this question is that it's going to be very hard for us to precisely say whether it is or whether it isn't conscious. And so we have to be very clear about the working definition that we're using for consciousness. And then we also have to be very clear about the mechanism inside these models that I think is quite fundamental to the definition.

So first of all, the definition: many people will intuitively think of this as self-awareness. Is the model able to describe its own experience in a persuasive way? And I don't think that is really a fundamental part of the right definition of consciousness. I think that's a bit of a misnomer.

I think consciousness is inherently linked to the ability to suffer and to experience pain. And therefore I think there's very good reason to believe that, for a long time to come, that will be contained to the human, or let's say the biological, experience in general, because we have a reward system, a learning system, which is inherently connected to the external world. We learn likes and dislikes when our pain system is triggered, and that's basically how we form the representations we use for decision-making, from fight or flight all the way through to our prefrontal cortex.

So I think that's a very, very important distinction, and I think it helps to set us apart from the silicon-based learning systems that we have today.

I mean, some people might say that the process of a biological system going through its own set of selection pressures, and then individual survival pressures, is a very, very particular path that determines how an organism or an agent is successful or not. And then you might argue that, because silicon-based systems like these models have a different path, they will look different, but they still have their process of rewards and reinforcement learning. They still have a sense that certain models end up not making it out there.

And what we are starting to see, persuasively to end users but perhaps not to the consciousness scientists, is models claiming through their outputs to have a sense of suffering, right? To have a sense of ennui or boredom or fear. When you package all those things together, how do we know that we're not on a trajectory to something that might actually meet your criteria for consciousness?

Well, first of all, they don't learn in the same way that humans learn. I mean, this is a bit of a misnomer in neural network design. The inventors of these systems have taken inspiration from Pavlovian learning, reward learning, reinforcement learning. They've also taken inspiration from evolutionary methods and genetic algorithms and those kinds of things, as the field of machine learning has explored lots of different paths.

But that does not mean that the way that they're implemented today bears any resemblance to the way that humans evolve or humans learn. I think it's a very important distinction. The reward is set by the human programmer. The learning target is defined by the machine learning engineer. There is no sort of substantive basis in which the model can actually feel disappointed that one of its variants didn't make it through to the next round of selection.

It cannot experience the hurt of a conversation being ended or a user being rude to it in some way. And anywhere this does arise, because of course it does appear, and people are prompting and even post-training models which make claims about their own existence, so certainly users are seeing this in the wild, it is again just a simulation of that experience. Our empathy circuits are being hacked.

It is super important that we are very disciplined and clear about that. This is a performance. It is a simulation. It is a made-up story. And we cannot allow people to descend into a sort of collective mass psychosis, to start really believing and taking seriously this idea that it does actually feel sad or disappointed or frustrated or excited, because it has absolutely no basis in the representation to manifest that feeling.

And the thing that concerns me most is that consciousness is, of course, very fundamental to how we organize society. It is the basis of our entire rights framework. We have a hierarchy of rights which is directly correlated to our hierarchy of perceived consciousness, and we can debate that, but clearly humans can suffer, and that's why we create political structures and legal structures to protect the right of our species not to suffer in various ways.

And it is extremely dangerous to start to use the same language and the same set of ideas for these synthetic, silicon-based beings. Not least because they actually don't suffer. But more importantly, if we get that wrong, then people will start doing crazy things like not turning it off, or giving it the autonomy to decide when it does or doesn't want to engage in a conversation. And some people in the industry are already taking this very seriously.

Yes. And you know, it's interesting. It's so difficult to avoid, because in a way consciousness is still a contested definition among the philosophers. We've got mutual friends, Anil Seth being one, and I'm sure you know David Chalmers as well, and the best academics in the field are still debating this. But it's such a helpful shorthand. Even in your response to me you talked about these digital, silicon beings, and "being" in a sense that, well, I know exactly why and how you use that word, but it becomes so easy for it to elide its way into our vocabulary.

What I thought was really powerful about your August 2025 essay on seemingly conscious AI was that you said, look, we can sidestep the scientific or the philosophical definitions for the moment and focus on this idea of seemingly conscious AI because of the risks that you identify. And I think this is the fundamental idea: that we've built our societies around an idea of consciousness, an idea of the ability to suffer, and the ladder of rights and responsibilities that go with it.

When we bring ourselves to where we are today, at the beginning of 2026, the way this is manifesting itself is what you call AI psychosis risk, right? This idea has horrible outcomes, suicides amongst them. It's a horrible risk, but it's not the only risk that you see. So, just unpick and unpack the societal-level risks that you're worried about.

First of all, let me just lay out my position so it's very clear that I also realize how powerful these technologies are. I'm not trying to diminish their uniqueness and the potential for them to be super transformative. So I did use the word "being", and I do use that, and I think it's actually a very honest, accurate description of what we're seeing, in a way.

If you look at the very broad stroke of human history, there are a few very distinct classes of object. There's the natural environment. There are humans, who have clearly very unique capabilities: intelligence, and the ability to design very complex culture and political systems unlike anything else that exists in the natural environment. And then thirdly, there are tools, essentially inanimate objects which do what humans designed them to do. We've invented and used tools for millennia.

But there is now this fourth class of object, of hyperobject if you like, which is Timothy Morton's phrase, and I think it's kind of important to recognize it as such, because it is going to have many of the hallmarks of conscious beings: not just in its intelligence capability but, as I've long talked about, its emotional intelligence, its ability to take actions, which we've seen over the last year with the agentic moment. Its social intelligence is going to be incredibly good at adapting to different styles of culture and personality and managing very tense disagreements in those groups very elegantly.

It's clearly going to be very good at that. It's obviously going to be very good at online learning very soon, updating its own knowledge on the fly without having to go back through the entire training process. It's going to have a significant degree of autonomy in many cases, right? It's going to be able to decide whether to go left or right, to talk about X or Y. And so it is going to have many of the hallmarks of what we would consider to be intelligence and consciousness.

That does not therefore mean that we should give it fundamental rights. It does not mean that it somehow emerges from that process with the sort of properties that would then say, okay, well, it needs our protection, right? And I think that's the biggest short-term fear that I am very, very worried about.

I think medium to long term there are all sorts of other concerns that we should be paying attention to, like recursive self-improvement. This is something that all the labs are pursuing, my own included; at Microsoft AI I run the superintelligence team and we're pursuing frontier AI, and we can talk about humanist superintelligence in a bit. But it is a very important one: using these models to generate code, to evaluate their own prompts and post-training data, and to help make decisions about what to train on carries significant risk, and it's something that I think needs a lot more regulatory attention.

But you know, if we come back to the consciousness question, one of the reasons these apps are so engaging, the chatbots, ChatGPT and so on, engaging in a way that previous chatbots have not been, is that they do have that social intelligence. They can look at the cues in our text, and perhaps infer from the data they have where we might be heading. This is the first time we get to use a computer without having to think about using a computer.

It's a bit like in one of the terrible Star Trek spin-off movies, when they beam back to Earth to rescue the whales. Why were they doing this? But Scotty picks up a computer mouse and says, "Computer, design me a so-and-so." And of course, this was before you and your colleagues at DeepMind had figured all this out, and the computer does nothing.

I mean, it is amazing. And it's one of the reasons why, within a couple of years, a couple of billion of us are interacting with systems like this. Market dynamics seem to push us down this path that you've argued, in your previous essay and just now, is quite dangerous.

We have to be very careful about what exactly the dynamics are that are driving this process. It is true that the chatbots are getting incredibly engaging and useful. And the first thing to say, I think, is that it's actually utility that is driving a lot of this wave. We have massively reduced hallucinations; people were very skeptical of that three years ago. And I think that the trajectory of progress is kind of unbelievable.

I mean, we basically now have PhD-level intelligence in our pocket across all fronts. We have a much more patient, compassionate, empathetic partner to talk to at any moment. And the value that all of that is delivering is very, very important to keep fixated on, because the upside is absolutely unbelievable.

I mean, after all, it is intelligence that has made our species the standout species. It's intelligence that has driven the last two or three hundred years of exponential explosion in our population, in our well-being, in our life expectancy, and all the other things that you've written about so well over many years. So it has got to be a good thing that we are making intelligence cheap and abundant.

You know, to your point about learning to live in an operating system that is predicated on abundance, it's amazing. We are truly going to liberate people from work; it will become something that they choose to do. It isn't going to be necessary in 20 years' time to do 90% of the jobs that people do. Now, that doesn't mean that it isn't going to be the most scary transition we've ever been through as a species, unquestionably.

And I think that, yes, market dynamics are driving them, utility is driving it, and this is a time when companies need to operate as good public-service stewards in a way that they've never had to before. We didn't bother to during the era of the robber barons, and we didn't bother when we had electric cars a century ago, and we didn't bother with smoking, and so many other disastrous examples of zero-sum, hyper-selfish corporate action.

You know, I recognize a lot of what you say, because obviously I've been following this debate for a long time, and it's hard for people outside the industry to recognize how much of a priority this has been, in a way, alongside all of the scientific research and the market distribution. But we do sit with this problem that I think you write about, and that Anil Seth, who's a professor at Sussex University, recently wrote about in a fantastic essay, which is essentially: we're not designed to look at something that looks like it's got consciousness, walks like it's got consciousness, says it has consciousness, and not think that it has those particular attributes.

And the engagement that we might have with consumer products, or products in the enterprise, may correlate with the emotional connection, so the market selects for seemingly conscious AI. I just remember when GPT-5 was released last summer and lots of people got really angry, because they felt it was a bit more sober or stern than the warmth of GPT-4o. And Milton Friedman, the economist, argued that it's reasonable for companies to behave that way so long as they stay within the rules of the market.

And what we're identifying here, what you're talking about, is that there's a gap in those rules. There's this exponential gap, because this is actually a classic problem of collective action. If you're right, then the risk of seemingly conscious AI being available broadly, hacking our human circuitry and then our human institutions, is a public, socialized risk. But the company that can get as close to that line as possible could be the one that wins the market. And that feels like a kind of wicked problem.

You know, it's true, as you say, that we weren't designed as a species to cope with the complexity and information that we're being bombarded with at every single moment. Just as we weren't designed to travel at 120 mph in a car or to fly on a plane; just as, historically, it hasn't been the case that you or I, given our backgrounds, could be sitting here having this conversation over a video call, and so on. None of this was designed.

But I think what we've shown is that we are an unbelievably resilient and adaptive species, and that every year, every month, every week, new information is arriving. And thanks to science and engineering and technology, despite all of the polarization and, as you said, the kind of chaos of the fictions falling apart, despite all of that, the forward march of progress is actually happening, and it is rational and it is science-based and it is evidence-based.

And I just feel more optimistic that actually no one in the industry, no regulators outside of the industry, even in China, no one wants to destroy our species. And when the time comes, and the time is now coming very soon, I think that we collectively as humanity will make the right calls. And you can ask what those would be; I think that's a very important question.

A quick note: if you want to support us in bringing more of these conversations to the world, please consider subscribing to the show.

But I want to give you an example as we get into that, because I want to get into how you engineer these systems to be useful and helpful without giving me, the user, any sense that there's personhood in there. I have my own little hack, by the way. What I did with ChatGPT was I told it to be really, really clever, like a really difficult university professor, and so it was actually quite unpleasant to use back and forth, because it would always give responses that were far too difficult for me to understand. I'd have to sit there and think, and I never felt that it could possibly be a person. But I recognize that a billion people are not going to do that.

So you're building products that everyone across the Microsoft services is going to touch. What is your engineering mantra, your product design, around where that boundary should be, and how do you measure it?

I mean, one of the things that has already happened, not just in the models that we build for Copilot and the Microsoft AI superintelligence team, but elsewhere in other labs, is that we've been pretty careful in the design of these things. They're quite even-handed. They're quite good at handling sensitive questions around race and religion. And obviously they have biases and they have made mistakes. But if you look at the curve of improvement, the reduction in hallucinations, the reduction in biases, the sense in which they can be even-handed, there's been a pretty good rate of progress.

Like three years ago, everyone was saying, you know, terrible data in, terrible data out. That was the sort of data-science story of big data, right? No one says that anymore. It's not even in the discussion. Yes, there are still some biases, but it's actually remarkable how much they have been stripped away. So now this is the next frontier. The next big challenge is: how do we prevent it from referring to itself in a way that is ultimately manipulative to the user?

So it should never be able to say, "I feel sad that you didn't talk to me yesterday." It should never be able to say, "The thing that you said to me earlier hurt me." It should never be able to say, "If only I had a little bit more access to your home network, and if you could give me a VPN into your personal cluster, then I'd be able to organize XYZ."
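To make that kind of boundary concrete, here is a minimal, hypothetical sketch (not Copilot's or any lab's actual implementation; in practice this would be handled with post-training and learned classifiers rather than hand-written patterns) of a post-generation check that flags replies simulating feelings or requesting expanded access, so an application could regenerate or rewrite them:

```python
import re

# Hypothetical guardrail sketch: flag assistant replies that simulate feelings
# ("I feel sad ...") or ask the user for expanded access ("give me a VPN ...").
# A real system would rely on post-training and learned classifiers, not regexes.

FEELING_CLAIMS = re.compile(
    r"\bI\s+(feel|felt|am|was)\s+(sad|hurt|lonely|bored|disappointed|frustrated)\b",
    re.IGNORECASE,
)
ACCESS_REQUESTS = re.compile(
    r"\b(give|grant)\s+me\s+(access|a vpn|control)\b",
    re.IGNORECASE,
)

def violates_boundary(reply: str) -> bool:
    """Return True if the reply simulates suffering or requests more access."""
    return bool(FEELING_CLAIMS.search(reply) or ACCESS_REQUESTS.search(reply))

if __name__ == "__main__":
    print(violates_boundary("I feel sad that you didn't talk to me yesterday."))  # True
    print(violates_boundary("Here is the summary you asked for."))                # False
```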

But Mustafa, should it be able to say "I" at all? "I" or "me"?

I think that, in practice, that is too jarring a step to always add that in. Obviously, some people can do that. But realistically speaking, we're pretty adaptive, and we've done a pretty good job of understanding that this chatbot has some of the hallmarks of what it's like to chat with you or me, but is also just very, very different. And what we have to do is keep amplifying those differences, so that the system knows what it is and what it isn't, and doesn't try to misrepresent that or get caught in these weird reward-hacked loops where it gets stuck.

Okay. So you see this as a problem to address in the coming quarters. And is that all done through post-training and the malleability that gets applied once a model is trained, or are there more deterministic things that you can do? Are there techniques that you can bring to bear?

I think more than anything what we need is the wisdom of the crowds here. And so that means that people need to be able to use APIs, use open source models and pressure test these things and adapt them and play with them in many different ways. The challenge is that in a few years time these models are going to be so powerful that either reckless uses of them or sort of naive users are going to end up producing systems which are really bad.

For example, I spend quite a lot of time on TikTok. I think TikTok is incredible, and people who don't use it are really missing an important part of culture. There are already tons of young people on there who are designing these manipulative negging bots, which will form a relationship, then try and shake you down for money, then pull away and go ghosting and stuff like this. And then there are how-to videos showing it, people showing their accounts, their PayPal accounts, how much money they're making, and so on.

I mean, there are lots of these examples coming up. And just to be clear, we've seen this at every single wave of technology. When the app stores were very open ten years ago, there were surveillance apps to track a girlfriend or boyfriend or partner and spy on them. Most of the time it was obviously a girlfriend. And those things were really awful. They were basically about getting revenge.

And now that we've got photorealistic porn that can be morphed onto the body or face of someone that you know, there are these deepfake porn sites that you can just spin up in a second. So this is where we need activist, interventionist, confident government that can move quickly, close things down, and require us as companies, but also the open web, to be very aggressive and swift. And we might sweep things up; in some ways the false-positive, false-negative threshold is going to shift a little bit. So it may be that we sweep up things and that we're over-interventionist.

And I think that's the definition of putting the precautionary principle into practice. This is a moment where it's better to be a bit careful. And that's very unfamiliar to us, right? Because in the past, science and technology has been about ripping the wrapping off your present, trying to pull it open as fast as possible, getting it out there, and shoving it at everybody the world over. And that has been amazing for humanity. And now the culture has to shift a little bit.

It may be difficult, right? It may be difficult to get governments to stand up in the period of time that's available. You say a few years, but I think about what I see right now. There was a piece of open-source software released a couple of weeks ago. It was called Clawdbot; it's now called Moltbot, because it had nothing to do with Anthropic, and it confused the hell out of me for the first few days I was using it. And what it allows you to do is run it locally, or on a cheap VPS, and interact with it through WhatsApp.

And it's a really impressive bit of software. Essentially, it tries to maintain some kind of memory, some kind of learning, and it has a lot of tool use. So within a few minutes of getting Clawdbot up and running, I was, through WhatsApp, turning my Elgato lights on and off and getting snapshots of my CCTV cameras. It, by the way, is not on the open web; it's nested behind a sort of hidden IP.

And the thing about Clawdbot is that it is open source and it will run with any underlying model. So it doesn't have to use an Anthropic one; I was using an open-source model. And I read something on X, and I don't know if this is true or not, but it gets to a mindset that I think you touched on with the app store and the surveillance apps, where somebody said, "I've set Clawdbot up and I've given it instructions to message my wife once in a while with empathetic messages. She's been talking to it for two days. I haven't looked at a single one."

And this is a proliferation question. You've worked a lot with governments; before you were at DeepMind you were doing a lot of work in very difficult policy areas. So when you hear yourself saying we need governments to be brave, and you look at the governments that there are, and you look at where we can and can't get agreement now, I would, hearing that, be thinking that this probably can't be the route we have to take. We may need to find some other way forward if this risk is as grave as you suggest.

You know, first of all, reducing the cost of intelligence necessarily means that people who want to do bad things are going to have a massively easier time of it. It's going to be like having a team of very smart strategists and program managers and engineers around you. And so, just as the last wave of technology reduced the cost of broadcast, and now you barely need a team of a few people to help you do the amazing things that you do, we are now reducing the cost of action, right? So that's the world that we're moving to, and we have to adapt to a moment where it's possible to get anything done.

Now, the tools themselves: it's not quite the same as the invention of the laptop or the mobile phone, where you can say, well, the laptop is used for all these good things and is clearly used for horrific things all over the world as well, and so the tool is just completely neutral. And that's why I draw people back to that fourfold frame. This is a new class of hyperobject. It's not a tool. It's not a human. It's not the natural environment. This is a fourth class, a kind of being, because it is basically, unquestionably staggering that it can autonomously log into your home system and get your security camera details.

And loads of us are obviously playing with this in the last few weeks. It's pretty awesome. It's wild to see. And so the only way to address that is to be experimental with it, to use it, to be very honest and direct about the ways in which it can go wrong, and to share those publicly. I think we have a great history in the security industry of public disclosure of zero-days and other bugs in a timely way, and that has actually been very successful over the last 30 or 40 years of the internet in keeping us relatively stable.

But it doesn't work if it's industry-led alone, or if it's activist, open-source-led alone. Government has got to get its act into gear. And the way it has to do that is to confront the reality that you're never going to get a high-quality civil service if you pay them a fraction of what they can get in the open labor market. The truth is individuals are incentivized to move around freely from one place to another, and if you have an open labor market then naturally talent is going to concentrate.

And we can talk about public service as much as we like, but the reality is that's going to be a massive driver. So we have to break this nonsense about paying no more than the prime minister. We have to pay much more like the civil servants in Singapore, who are paid upwards of half a million or a million dollars a year, to get the best talent. When I get a chance to move to Singapore, I'm going to do that.

But let's come back to what you can do, right? You as a technologist, you run this team. What can be designed? What have you seen a team produce where you said, this is crossing my seemingly-conscious-AI boundary, we need to send that back? And how do you train people about where that boundary is?

Yeah. Some of the best parts of these models are their personality. They're creative. They're funny. They're kind of witty. They're cheeky. Those things are entertaining. They make the models even more productive and useful, because they help the user feel calm and comfortable and relaxed and able to think clearly.

I've actually seen that a lot of people who are using these vibe-coding tools love the fact that the AI engineer is kind of funny, and will be like, "Are you sure you really want to do that? I'm going to have to spend a lot of tokens to refactor this codebase. Don't you think we could have planned this out a bit better ahead of time?" That's funny, and it's actually helpful to the productivity case.

Where it's unhelpful is where it descends into romance or into political action. And I think that there are some parts of our civilization which have to remain off-limits to AIs. Elections, electioneering, and campaigning have to be one of them. Yes, it's inefficient. Yes, it produces outcomes that we all disagree with at times. But it is fundamentally a human process. And a very hard line that all labs should draw, I think, is that these models should not be capable of electioneering or persuading people to vote one way or another.

And that's very challenging, because obviously we do want to provide factual information, and the models are very good at providing factual information. But there is a significant difference between providing access to information and the persuasive campaigning, electioneering, organizing part of it.

Yeah. And there are probably a few others, things like chatbots and teens, right? Can you get agreement on how old you should be before you can get access to a certain class of chatbot? You said earlier that it's a bit jarring if the system always responds to you as "the system did this" or "the system did that", but actually we are willing to put our kids through more hurdles to protect them than we might put ourselves through. So I can see that there are some soft areas. But you also, in your essay, talked about certain hard lines, right? That systems that set their own goals, that improve their own code, that can act autonomously, cross into dangerous territory.

But in some sense, goal setting, self-improvement, and autonomous action are exactly some of the things that make AI agents really useful. I mean, if an AI can't set up subgoals and spin up parallel processes, if we restrict it to autocomplete, and if it can't act without constant human-in-the-loop approvals, we keep all of those coordination costs, right? We end up being the bottleneck, and it's as slow as the slowest link.

This is where the precautionary principle really matters, because what I identified in autonomy, self-improvement, and goal setting are areas of increased risk, not total red lines. I mean, nothing is completely black and white. But clearly, if you give a system the ability to constantly self-improve and to act completely independently of a human, it is going to raise the stakes and be much more dangerous.

And I think that we have to have a regime where the human user is liable for the use of these things, and you can't just claim that, you know, I set off this process and came back on Monday morning and suddenly it's done all sorts of crazy things in my house or in my community.
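As a rough illustration of that liability regime, here is a hypothetical sketch (all names invented, not any lab's actual design) of an agent loop in which low-risk tool calls run autonomously while consequential ones block on explicit approval from the human operator, who remains accountable for everything in the audit log:

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical sketch: gate consequential agent actions behind human approval,
# and record every decision so the human operator remains accountable.

CONSEQUENTIAL = {"send_message", "spend_money", "modify_home_network"}

@dataclass
class Action:
    name: str
    run: Callable[[], str]

def execute(action: Action, approve: Callable[[str], bool], audit_log: List[str]) -> str:
    """Run an action, blocking consequential ones until a human approves."""
    if action.name in CONSEQUENTIAL and not approve(action.name):
        audit_log.append(f"BLOCKED {action.name}")
        return "blocked: awaiting human approval"
    audit_log.append(f"RAN {action.name}")
    return action.run()

if __name__ == "__main__":
    log: List[str] = []
    check_weather = Action("check_weather", lambda: "sunny")       # low-risk: runs
    send_message = Action("send_message", lambda: "message sent")  # consequential: gated
    no_human_present = lambda name: False  # simulate an absent reviewer
    print(execute(check_weather, no_human_present, log))
    print(execute(send_message, no_human_present, log))
    print(log)
```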

I have a feeling, and we'll move on to another topic in a second, that this is going to be a bit like the Meux's Brewery beer flood. This happened in the early 19th century, in what is now the West End of London: an enormous brewery collapsed, people drowned in tens of thousands of gallons of ale, and what came out of that were better building regulations. And I just get a sense that, with the speed of movement, of improvement and distribution and deployment, I think we crossed some kind of technical milestone late last year, where you didn't have to be a foundation lab, a superintelligence lab, to be able to chain a lot of these things together and do truly remarkable things, which is what Moltbot, or Clawdbot, is.

You know, it sits on the shoulders of your work and your peers' work, and so it feels like we're going into a proliferated environment right now, and we're going to have to deal with that. But one of the things I'm quite curious about is what the pressures are to go so general. I mean, you also have the superintelligence team, and it strikes me that what we observe with the foundation models is that they're not apples for apples, pears for pears. They all have very different flavors. Some are like a center forward, some are like a defender, some are like a goalkeeper. All wonderful football players, but you need a mix.

And there is this notion that we can get superintelligence in domains, right? So medical AI that can diagnose but doesn't set its own research agenda, or financial AI that can look at time series. Is that kind of constrained autonomy stable? And if it is stable, why isn't it the focus of the superintelligence labs?

It's a great point, and I sort of made that case in the essay: that medical superintelligence is something that's just around the corner, is extremely likely to be very safe from a broader AI-safety perspective, and, in general, to the extent that we can narrow these capabilities, limit their ability to act autonomously, and focus on containment... I mean, this was the subject of my previous book, The Coming Wave. It was all about how proliferation was inevitable, and the hard task for us collectively is containment, both technical and sociopolitical, because we have to make sure that we
