
by Azeem Azhar
Date: October 2023
Quick Insight: Frontier AI models like OpenAI's GPT-5 show decent gross margins but struggle with overall profitability due to immense R&D costs and short model lifespans. This dynamic forces a strategic pivot towards infrastructure and enterprise solutions, rather than immediate consumer monetization, to sustain growth and investor confidence.
The AI industry is valued in the hundreds of billions, yet a critical question looms: do the economics work? Azeem Azhar, founder of Exponential View, with Epoch AI's Jaime Sevilla and Hannah Petravich, dissects the financial realities of frontier AI models, revealing a complex picture of high costs, rapid innovation, and strategic bets.
"If you look at how much they spent in R&D in the four months before they released GPT5, that quantity was likely larger than what they made in gross profits during the whole tenure of GPT5 and GPT5.2."
"The game that they're playing is not so much about becoming profitable right away rather what they are trying to do is convince investors that they have a business... that's worth scaling as much as possible."
"The unit of persistence of these agents is not going to be so much the model that runs behind but the memory and the history of the models themselves."
Podcast Link: Click here to listen

Today, artificial intelligence companies are being valued in the hundreds of billions of dollars.
It's OpenAI, it's Anthropic, it's all the value that DeepMind has added to Google over the past few years.
But that forces a really important question, one being asked by the mainstream as well as by specialists.
Do the economics actually work? When you look at what it costs to train and run a frontier model and what you earn from it before the next model comes along and replaces it, is that a profitable business?
Are we looking at something a bit like Uber, which lost money for 14 years before turning a profit and is now handsomely valued, or something that doesn't have an end in sight?
Now, these questions really matter to the stock markets.
Well, big tech had a lurching week, and at one point more than a trillion dollars was wiped off valuations.
Wall Street's very linear investors were trying to digest the $650 billion of capital expenditure commitments being made by big tech for 2026.
Some of that $650 billion is going towards AI infrastructure.
Does any of this make sense? Are there actually going to be operating margins to defend and is the revenue growth going to support this?
Now, as a reader of Exponential View, you'll know that we've been asking these questions for months, if not longer.
But most recently, we partnered with Epoch AI. I'm sure everybody knows Epoch, but if you don't, they are a preeminent independent research organization tracking some of the trends behind AI.
You've probably seen their work on scaling laws and compute trends.
So, we worked with their team to dig into the actual margins of Frontier AI, and the results are really, really interesting.
So whether or not you've had a chance to read our research yet (and you really should), this conversation will give you a really clear picture of where things stand and where they're heading.
I've asked financial journalist Matt Robinson from AI Street, another Substack newsletter, to moderate this discussion and really put us on the spot.
Jaime Sevilla, the founder of Epoch AI, is here, along with Hannah Petravich from my team.
She led the research on the Exponential View side. She's also no stranger to large numbers: she has a doctorate in astrophysics.
So when I ask Hannah, "Is that roughly right? Is it within a few orders of magnitude?", the answer is often yes.
Matt, I'm handing the stage over to you. The floor is yours.
Maybe you guys could start with, for someone who's just getting into the research, what's the big takeaway here, and how did you even think about building a framework to analyze a business like this?
Absolutely. So Matt, a little bit of context on why we were doing this before I get into the takeaway.
To our understanding, no one had really taken on this humongous task of piecing together all the public information there is about the finances of OpenAI, or any large AI company really, and trying to paint a picture of what their margins look like and whether they are making enough money to recoup the large cost of developing new products.
So we did this hermeneutic exercise of just hunting for all the information we could find and trying to make sense of it.
Now, I won't pretend that we have arrived at a definitive answer. In fact, our views are constantly evolving as we learn more about the companies and their finances.
But I'm pretty happy with the overall framework that we have established for even trying to think about this question in the first place.
So, if I were to communicate, in summary, what did we learn and what did we find?
For me, the two most important takeaways are these. One: it seems likely that OpenAI, during the last year and especially while operating **GPT-5**, was making more money than the cost of the compute, which is the primary expense of operating their product.
Though they seem to have made a very small margin, or even to have lost money, after accounting for all the other operating expenses that go into running the model.
So this is paying for staff, this is sales and marketing spending, this is administrative cost, and it also includes the revenue-sharing agreement that they have with Microsoft.
Now, the raw profitability, the operating margins of a company, is not necessarily what you want to look at when you are trying to assess whether the company will be profitable in the long term.
As Azeem alluded to earlier, Uber lost billions upon billions of dollars before it finally became profitable.
And really, if you're an investment-minded person looking at a growing business, you do not look so much at how much profit it's turning in its early years, while it's still growing; rather, you would look at the gross profit it's making, at its gross margins, and at how the revenue is scaling year after year.
That way you can get a sense of where the industry and the company will land after this initial phase of rapid growth.
Now, if you did that, then, as I've just said, they look to have made a decent gross margin. But there is one more wrinkle you need to account for here, Matt, which is that these products are really, really expensive to develop and they have a very short shelf life.
So it's not enough to just look at the gross profit and check, okay, it seems they have this 50% gross margin; it seems they're getting twice as much money out as they put into the machine.
No, you actually need to think about how much money it takes to develop a new model and how long you could expect that model to stay relevant before it is made obsolete by your competitors, or by open-weight alternatives that will make a dent in the usage of your model.
So this is the second part of our research, where we try to look at how much OpenAI is spending on R&D and how that compares to the gross profits overall.
And what we found is quite shocking.
So if you look at how much they spent in R&D in the four months before they released **GPT-5**, that quantity was likely larger than what they made in gross profits during the whole tenure of **GPT-5** and **GPT-5.2**, which points to how competitive this space has become in the last couple of years.
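To make that framework concrete, here is a minimal sketch in Python using purely hypothetical numbers (these are not Epoch's or Exponential View's estimates): gross margin nets out inference compute, operating margin adds the other running costs, and the R&D spent before launch has to be recouped during the model's short tenure.

```python
# A minimal sketch of the framework described above, with purely hypothetical
# numbers (not Epoch's or Exponential View's estimates). Gross margin nets out
# inference compute; operating margin adds the other running costs; and the
# R&D spent before launch has to be recouped during the model's short tenure.

def gross_margin(revenue, inference_compute):
    """Share of revenue left after paying for the compute that serves the model."""
    return (revenue - inference_compute) / revenue

def operating_margin(revenue, inference_compute, other_opex):
    """Margin after also deducting staff, sales & marketing, admin, revenue sharing, etc."""
    return (revenue - inference_compute - other_opex) / revenue

def rd_recouped(lifetime_gross_profit, rd_before_release):
    """Did the model's lifetime gross profit cover the R&D bill that produced it?"""
    return lifetime_gross_profit >= rd_before_release

# Hypothetical frontier model: $10B revenue over its tenure, $5B of inference
# compute, $4B of other operating expenses, $6B of R&D in the run-up to launch.
revenue, compute, opex, rd = 10e9, 5e9, 4e9, 6e9

print(f"gross margin:     {gross_margin(revenue, compute):.0%}")             # 50%
print(f"operating margin: {operating_margin(revenue, compute, opex):.0%}")   # 10%
print(f"R&D recouped:     {rd_recouped(revenue - compute, rd)}")             # False
```

With these illustrative numbers you get a healthy-looking 50% gross margin, a thin operating margin, and an R&D bill the model never pays back, which is exactly the shape of the problem Jaime describes.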
Hannah, why don't you take this? I know that you dug into this in such detail and I don't want to speak ahead of your expertise. To Matt's question about the methodology and how we actually got into it: a lot of it was based on numbers that we could find historically, and then trying to predict what would happen in the rest of 2025.
So for example, on sales and marketing, we had some data that 2024 was $1 billion in sales and marketing, and then in H1 of 2025 that was $2 billion.
So we can build the picture using constraints in this way, and from that you can try to understand the costs of the company as a whole. We broke it down into many categories, as you can tell in the piece.
But I also tried to break each of those down further into separate components, so that we could understand whether the overall picture was feasible, or at least a realistic approximation.
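As a toy illustration of that constraint-based approach, the sketch below takes the two data points quoted above and brackets a full-year 2025 figure under two assumed growth patterns. The extrapolation method and the resulting range are assumptions for illustration only, not figures from the research.

```python
# A toy version of the constraint-based approach described above: take the
# disclosed-style data points (roughly $1B of sales & marketing in 2024 and
# $2B in H1 2025, as quoted in the conversation) and bracket full-year 2025
# under two assumed growth patterns. Both growth assumptions are illustrative.

sm_2024 = 1.0        # $B, full-year 2024 sales & marketing
sm_h1_2025 = 2.0     # $B, first half of 2025

# Conservative assumption: H2 2025 simply matches H1 2025.
sm_2025_low = sm_h1_2025 * 2                                     # $4B

# Aggressive assumption: spend keeps growing half-on-half at the same multiple
# seen from H2 2024 (approximated as half the 2024 total) to H1 2025.
half_on_half_multiple = sm_h1_2025 / (sm_2024 / 2)               # 4x
sm_2025_high = sm_h1_2025 + sm_h1_2025 * half_on_half_multiple   # $10B

print(f"2025 sales & marketing bracket: ${sm_2025_low:.0f}B to ${sm_2025_high:.0f}B")
```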
This is a complicated exercise, and one of the things that comes out of it is this question of short model life: the family that we looked at was only really the preeminent family for a few months.
Now, we know that enterprises don't change the API they're using the day a new one comes out (there's always a bit of a lag), but consumers do, right? Because that's what you get access to on ChatGPT.
And you may remember when **GPT-4** was set aside from ChatGPT: it had become an emotional support tool for many users, and they were very upset with how methodical and mechanical **GPT-5** now felt.
And I think one of the uncertainties is to what extent you actually learn and prepare for your next model during the short life of the existing model.
There are a couple of elements to it, right?
One, I think, is a little bit more nebulous: by having a really good model, even if it lasts for a short period of time, you maintain your forward momentum in the market in terms of customers liking you, your enterprise sales, and so on.
And that feels less tangible than the second bit, which I think is perhaps a bit harder to unpick: what do you learn about running better and better models from actually having run a better model, even if it only lasts for four months?
And that learning might be down in the weeds of R&D, in particular choices you make in training data and reinforcement learning.
It might also be in operations, right? Just operating a model of that scale.
And I think it's quite hard for us to know. I suspect it's hard for OpenAI, or any of the other foundation labs, to know the contribution of that second part to the model itself.
Right. So in a sense, who in this kingdom is actually able to see with two eyes? I'm not sure many can at this point.
It's interesting. It was making me think of GPUs. I was talking to some finance folks about, okay, what is the value of these **H100** chips going to be in a few years, and everyone's kind of shrugging their shoulders. And it seems like a parallel to these models: what is the value of, you know, GPT-4 from three years ago? So how do you think about that? One question I have is, you talked a bit about compute and costs there, and this may be a little down in the weeds, but the cost of compute in building these models is going down; how do you see that going forward?

So, the cost of compute for building these models, I don't think it's quite going down. I see it as going up time and again. Pre-training costs seem to be going up, and despite rumors to the contrary, pre-training is not dead at all. People are building $100 billion data centers for a reason: they are invested in running very large-scale experiments and very large-scale training runs that are unprecedented in size. And I think this is part of what contributes to these models being so expensive.

One of the interesting things here, when I think about OpenAI, is the game that they're playing. The game that they're playing is not so much about becoming profitable right away; rather, what they are trying to do is convince investors that they have a business and a research product that's worth scaling as much as possible, driven by this conviction that through scale they're going to unlock new capabilities, which in turn will unlock new markets and let them continue their incredible revenue growth.
I would say, and I'll come to this, that I think that's exactly the right thing for them to do anyway.
You know, as an investor myself, I want to invest in people who have optimistic views of the future and therefore believe you need to plant seeds today in order to harvest them in two or three or four years.
And particularly in a business like this, there is no hard asset; there's no hotel that's been built that can be resold to another property developer.
You know, it's an intangible asset that may not have that much salvage value, especially if people in the team leave.
I think exactly the right thing to do is to be building out ahead of time, and you see that investment J-curve.
I think the two challenges around this model are these. Number one: is the OpenAI model the only way to do this? And I don't just mean from the financial side, I also mean from the strategic focus side; we've seen Anthropic do something completely different. And the second challenge, which I think we went part of the way to answering, is: is there a path to positive unit economics? In other words, are they producing something for $X that they can sell for $1.3X, or are they producing something for $X that they sell for half of $X, which was the story of a lot of the dot-coms, right, Kozmo.com and all these other things?

And I think we got partway to answering that second question, which is that yes, it's expensive; yes, there is some kind of gross profit margin; and the level that we estimate, and Hannah can speak more accurately to this, is lower than for a traditional software business. So we're learning that perhaps foundation labs don't look like software businesses; they look like something different.

But these are the things I think we have to play around with.
Yeah, spot on with the numbers there.
The other thing I would like to consider is that AI is also creating a flywheel in its own development.
So I'm wondering how that might affect R&D down the line, given that R&D is such a huge cost to the company for the next model.
That's just a thought there.
I'm curious. OpenAI got a little bit of flak for saying that they may introduce ads, which I thought was kind of peculiar, because, I mean, I've been using Gmail for 20 years. And actually, in preparing for this, I stumbled upon some early research from Google's founders, Larry Page and Sergey Brin, about how they were against ads in the beginning and then, you know, changed their minds. I'm just curious how you think about ads, and, given that they have 800 million eyeballs every week, how that might play into this.
Let's think about why OpenAI is trying to introduce ads in the first place.
Because with these ads, and you said it yourself, it seems that they have on the order of almost a billion users that they could monetize through ads.
With their monetization plans, it seems that they might be able to reap revenue of maybe a couple of billion, maybe even up to a few tens of billions of dollars, off of that audience.
That's not enough. If your plan is to build $100 billion data centers, that's not going to be enough to fund them.
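The back-of-the-envelope version of that argument: ad revenue is roughly weekly users times annual revenue per user. The user count below echoes the figure discussed in the conversation; the ARPU values are assumptions chosen only to show the plausible range.

```python
# Back-of-the-envelope ad revenue: weekly users times annual revenue per user
# (ARPU). The ~800M weekly user figure echoes the conversation; the ARPU
# values are illustrative assumptions spanning a plausible range.

weekly_users = 800e6

for arpu in (3, 10, 40):   # $/user/year; Meta-class monetization sits near the top
    revenue = weekly_users * arpu
    print(f"ARPU ${arpu:>2}/yr -> ~${revenue / 1e9:.0f}B of ad revenue per year")

# Even the optimistic case lands in the tens of billions: meaningful, but not
# the kind of number that funds $100 billion data centers on its own.
```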
So why are they considering ads in the first place?
I think this has a lot to do with the game we're alluding to, where they are not looking to become profitable right away, but they have this vested interest in demonstrating to investors that, if they wanted to (they don't want to right now, but if they wanted to), they have a path to profitability, and the ads fit into that plan.
Ads are part of a way of expanding their market that allows them to show: look, there's a path to $100 billion of revenue between the ads, the business sales, and the other markets that we could be unlocking here. These models are not profitable right now, but they have these arguments they can make, including ads, that help them say: not right now, and we're not going to do it because we're more ambitious than that, but we could be profitable if we wanted to be.

We've seen some success with ads and generative AI. I think it was Meta, wasn't it, that has really had some forward momentum there.
Yeah. So in Meta's earnings in October, they commented that their AI-driven ad tools were essentially bringing in $60 billion of revenue in ARR.
So there is a considerable uplift already in the ad space there, but this will be the first time it will be in chat. And that was helping with conversions, right, with Meta's ads?
Yeah. And it was also that Meta were able to keep people on Instagram and Facebook longer, so people were seeing more of these ads as well, right?
I guess everyone can just make up their own ad. It's a lot easier to do that, and you're sort of stuck in there.
But I think this question about the ads is a really important one.
I think, Jaime, what you've suggested is really intriguing. It is the question of whether an advertising model is really fundamental to OpenAI, or whether it's an instrumentally useful thing that gets you to the next stage. And I think a lot of that depends on how we start to use these tools.
I mean, the piece of work that Epoch and Exponential View did looked at ancient history, with all due respect: it was before last week, before OpenClaw, before Opus 4.6, and whatever else Anthropic comes out with. We looked at a particular world before what I think Andrej Karpathy called the threshold of coherence for **agents**. He described this moment where agents are now good enough that you can get them to do lots and lots of things for you. And so that also makes me wonder whether the traditional ad model makes any sense, because there aren't going to be any eyeballs. Just sell it to the agents, the agents who have probably rented humans to do the jobs they can't do themselves.
You guys have spent a lot of time doing some rigorous work here.
Say you swap seats with Sam Altman. What do you do differently, and what do you keep the same?
I mean, first of all, I would just go and look at their finances and actually get a clear picture of what's going on. After having done that, what I would do in Sam Altman's position honestly seems kind of similar to what they seem to be doing already. For me, this question of the models as a rapidly depreciating asset actually brings into focus what might be the enduring asset, the part that might retain more value through generations of AI. And it seems to me that this part is infrastructure, and they're gearing up big time to break into the infrastructure space.
They have famously said they want to get to a position where they are building gigawatts of power at a time, which is a very ambitious goal. But it makes sense from my perspective: if you think that the software part is rapidly depreciating, you might want to get in on the business of building and serving infrastructure at scale.
Yeah. So if I were to bring in a different view here: obviously their consumer proportion is quite large, and we know from Sarah Friar that about 60% is now consumer and 40% enterprise.
So obviously the enterprise push is there, and we know that the enterprise push should bring in money for the company quite well. The consumer side is very competitive given Gemini and other AIs that you can easily have on your device. Say, on a Samsung, I just hold my finger over a button and I have access to Gemini, so there's very little friction in using it. So if I were Sam Altman, I would want to see if I could do something different on the consumer side. Obviously we know they're hoping to bring out a device, which is a unique, different way of targeting that consumer component, and there may be other things they can do there that would keep their consumer money coming in.
Okay, we've had two different views here. Matt, I'm going to give you a third view just to really make you work hard for your moderator's seat.
Let's take the idea that Hannah raised, which is different classes of interactions for the end user, whether it's consumer or business, and the point that Jaime made, which was, look, the infrastructure really matters. And the infrastructure obviously matters an enormous amount, because in the last week or two, if you've been using Anthropic, it got really slow, because we all got excited about Opus 4.5. And then the question is, well, where does the revenue come from? Where is the point at which people start to spend more and more?
And one thing I would say, just from looking at the Exponential View bills: our bills have gone up since Opus 4.5 came out, okay? Because everyone is coding more. We're running many more background processes that are chewing through tokens.
And I thought that was all true until I installed my OpenClaw bot. Actually, what was it called? It was called Claude initially.
And I've called mine Mini Arnold, in homage to the second Terminator, the one that came back to protect us.
But Mini Arnold is a greedy, forgive my French, mofo. He will chew through $20 to $30 of tokens a day.
So we're talking five grand a year in order to do my bits and pieces. And I've pushed him down to Haiku, which is the cheapest Anthropic model.
I do the heartbeat on a local LLM so that I'm not having to pay for that call every 30 minutes. It is expensive.
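A rough annualization of that bill, as a short sketch: the daily token spend echoes the figures just quoted, while the day-count assumptions are mine, purely for illustration.

```python
# Rough arithmetic behind the "Mini Arnold" bill: an always-on agent burning
# $20-30 of tokens a day (the figure quoted above) costs roughly $5k a year if
# it only runs on working days, and $7-11k if it never stops. The day counts
# (250 working days, 365 calendar days) are assumptions for illustration.

for daily_spend in (20, 30):                      # $/day of tokens
    print(f"${daily_spend}/day -> ${daily_spend * 250:>6,}/yr (working days), "
          f"${daily_spend * 365:>6,}/yr (every day)")
```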
Now, what drove that? What drove that was the idea that the models were just good enough; they crossed that uncanny valley.
And when we did the work on OpenAI, their models hadn't crossed that uncanny valley, right? GPT-5 was not the thing you could leave to run for hours at a time; Opus 4.5 from Anthropic was really the first. And my sense would be that all of this discussion starts to look very different when OpenAI is shipping things that run five, six, nine hours at a time, because at that point the inertia of being an OpenAI user, through the enterprise or through the consumer, sticks with you. And the only thing I would say is: just think about dear old Mini Arnold, who costs me the same as four Starbucks flat whites a day to do whatever the hell he's doing on his Mac Mini.
I don't know what he's doing. I don't ask. It's his private space. You know, get on with it, Mini Arnold.
So that, for me, is how we merge Hannah's observation, the user experience and user interaction patterns, with Jaime's point, which is that this is all about infrastructure, because ultimately all that processing has to happen somewhere, and that becomes a choke point.

And to that, you know, this week, as you mentioned earlier, the markets were caught flat-footed, to say the least, about this ever-expanding compute spend. To me, what's interesting is that, as the hyperscalers just reported, they're capacity constrained, right? We're seeing these big rollouts where capex is huge, and yet they can't meet demand. And maybe this is a little separate, beyond OpenAI, but I'm just curious: it's just wild that they're spending this much money and they just can't catch up.

There's a lot here, Matt. I just see that the two primary constraints, if you want to scale up the infrastructure, are that you need enough GPUs and you need enough energy. It seems to me that energy right now is the thing that everyone talks about, but it's something that we know how to solve.
We know how to build energy. You don't need that much energy, all things considered. If we need to build tens of gigawatts, even 100 GW of extra power, that's only about a 10% increase over all installed capacity in the US.
This has happened in the past: in the 2000s, enough gas infrastructure was built to match that level of expansion.
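For scale, a minimal sketch of that arithmetic: the installed-capacity figure below is an approximate public number assumed for illustration; only the proportion matters for the point being made.

```python
# Sanity check on the energy claim: adding on the order of 100 GW against
# roughly 1,250 GW of installed US generating capacity is a high-single-digit
# percentage increase. The installed-capacity figure is an approximate public
# number assumed here; only the proportion matters for the argument.

us_installed_capacity_gw = 1_250
extra_ai_capacity_gw = 100

increase = extra_ai_capacity_gw / us_installed_capacity_gw
print(f"~{increase:.0%} increase over installed US capacity")   # ~8%
```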
The GPU part, though, that's something very unique. That's something where production right now is bottlenecked in a few factories in Taiwan, and they have been trying really hard to expand it with pretty limited success.
So it feels to me that's probably where the bottleneck to scaling is going to end up being in the long term.
So I love what you've just said, Jaime, because of course the general note out there is that it's all about the energy, that energy is the bottleneck. And I think it's pretty clear that energy is constrained because of lead times and so on. But if you listen to Elon Musk talk about why he wants to put data centers in space, every single reason comes down to things we've done to ourselves: grid permitting backlogs, for instance. These things are queues, they're not walls; they're not laws of physics. The laws of physics are black-body radiation and the speed of light and all these other things that Hannah knows much more about than I do.
And so I think there is something to be said for the fact that it's very exciting, if you're in the energy space, to suddenly be important, because we'd rather forgotten about you for the last 10 years or so.
And Europe hadn't really thought much about its energy and was caught flat-footed by the gentleman from Moscow. So everyone has got really excited about that question, and as Jaime says, it feels like it's solvable.
I would push a little bit on that, because there are just a lot of supply chain questions that have to get fixed: the copper issue, right? Suddenly we all know about copper, and about optical fiber; who ever thought about Corning, honestly, before two months ago? So there is something there. But I think what I also take away from this is: let's go back and find out when these companies started talking about megawatts and gigawatts, because I'm pretty certain they were not saying it at the end of 2024. And I have to go back on my notes.
I think the first time I started hearing them talk publicly about gigawatts was when I met with Satya in January of 2025, and I would say it was after that that I started seeing Microsoft talking quite a lot about megawatts and gigawatts, or, as Doc Brown from Back to the Future would say, "jigawatts".
And so this is kind of new to them, and it's also new to the energy business.
But I think this is the thing that the markets didn't get at the start of this week, which is that these hyperscalers are absolutely supply constrained. They don't have enough chips, and they can't energize the chips they do have.
And you heard Amy Hood, the Microsoft CFO, say, "I had to make a choice of whether I put processing power towards third-party services on Azure, or power Microsoft Office and first-party apps."
She had that trade-off. I mean, this is not a market that lacks people running after it trying to spend money.
And I think, Hannah, you sent me your latest analysis of the market dynamics; that's exactly what we're seeing.
If I were to add anything, I guess I'm wondering how things will also move to the edge.
So obviously you have a lot of this buildout for the hyperscalers in data centers, but you said you were running an LLM on device. At what point will models get good enough that you can actually run the things you're doing now on your device, with the hardware improvements that are coming and the algorithmic improvements which are also coming? I wonder at what point we can do most of the things we're doing now on our own devices.
Yeah. Well, it's actually very interesting, if you let me build on top of this. If you look at a fixed level of capabilities, you see this rapid progress where, in order to achieve what models could do nine months ago, you already have pretty much an open model that's going to be kind of there, right?
If you look at the KI 2.5 model, it's arguably at the o3 level, o3 being the model that OpenAI launched in April last year.
If you look at that, then you see this rapid decrease in the amount of resources you need in order to train and deploy a model at a fixed level of capabilities. But it's not a fixed level of capabilities that's driving the growth of the industry and the growth in revenue; it's this ever-increasing frontier of capabilities. We are spending more and more on building more sophisticated machines, and I think this is going to cut against the gradient of moving things to the edge. All of these new, exciting capabilities, you're just going to want to run them in data centers.
So you're going to want to run them in your data centers for a long time to come. Like, Azeem is talking about how he may be looking at a bill of $5,000 a year for running his agent.
That seems very small compared to what these machines could do in the future. If you get to the point where these machines have an output comparable to a worker, how much will you pay to have a virtual coworker who is really knowledgeable and available 24/7? You might be willing to spend on the order of hundreds of thousands of dollars a year just to keep those kinds of agents running, and there are huge advantages to running them in a data center. The biggest disadvantage, the one that physics won't allow you to overcome, is latency. But right now the models are already slow enough that I don't mind waiting 200 milliseconds for the responses to get to me.
It's fine. They can take their time. I leave my **GPT-5** Pro thinking in the background for 20 minutes and come back to the answer. I'm not in a hurry.
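To put rough numbers on that argument, here is an illustrative sketch comparing today's agent bill with what a coworker-grade, always-available agent might plausibly be worth. The salary benchmark and the availability multiplier are assumptions, not figures from the conversation or the research.

```python
# Illustrative comparison of the point above: today's ~$5k/yr agent bill versus
# what a knowledgeable, always-available virtual coworker might be worth. The
# salary benchmark and the 24/7 availability multiplier are assumptions.

agent_cost_per_year = 5_000           # roughly the bill discussed earlier
coworker_salary = 120_000             # hypothetical fully loaded human coworker
always_on_multiplier = 168 / 40       # 24/7 availability vs a 40-hour week

implied_ceiling = coworker_salary * always_on_multiplier   # ~$500k/yr
print(f"agent cost today:              ${agent_cost_per_year:,}/yr")
print(f"implied ceiling for the agent: ~${implied_ceiling:,.0f}/yr")
```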
I think that the compute demand has been somewhat under-reported.
We all know about capex, but I've talked to folks in the enterprise who say that even when they're not using their capacity, they don't want to give it up. They don't want to lose their spot.
So there's just this constant demand for access. It seems like the same way planes were still flying during COVID so that airlines could keep their slots, even though no one was flying.
Yeah. They didn't want to lose it. That part of the story is interesting to me, and we saw so much whiplash in the market this week about what's happening here.
There are a bunch of other things going on, like US levels of debt and what's going to happen with employment, or not.
But I think at the heart of it is that on this call are people who are, in general, probably more closely attuned to what's going on and to the trends that we see.
And I think Jaime's point about us not wanting to give up capabilities is absolutely right.
I mean, I have a model evaluator prompt that I built, and occasionally I throw something back to a **GPT-4**-class model, and the response is kind of moronic; I don't want to deal with that anymore. And in the one place within my team where we do use quite a lot of models, there is a lot of batch processing, and we'll throw that into DeepSeek 3.2, and that can be cheap. But we're never going to put that on the edge, because it's batch processing; why would you put your core infrastructure on edge devices? And then I think people become a bit more disciplined about how long you want to wait and how much you want to pay for a particular class of output, and as you start to get more and more value from the output, you're willing to pay more. And I think that for a lot of companies, and maybe this is true for many investors who are still sitting on a Microsoft Copilot license, they've never really had the breakthrough moment of getting Claude to do 10 hours of tedious manual work in, as Jaime points out, a fraction of that time, while you go off and do something else.
And once you do that, you sit back and say, "Well, actually, it's worth paying £75 or 100 bucks a month for Claude Max in order to take this off my desk."
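A quick sketch of the breakeven logic behind that decision: the subscription price echoes the figure quoted above, while the hours-saved scenarios and the implied hourly values are assumptions for illustration.

```python
# Breakeven logic for a ~$100/month Max-tier subscription: if the agent takes
# even a few hours of tedious work off your desk each month, the plan pays for
# itself at modest valuations of your time. The hours-saved scenarios are
# assumptions for illustration.

subscription_per_month = 100          # $/month, roughly the tier quoted above

for hours_saved in (2, 10, 40):
    breakeven_rate = subscription_per_month / hours_saved
    print(f"{hours_saved:>2} hours saved/month -> worthwhile if your time is "
          f"worth more than ${breakeven_rate:.0f}/hour")
```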
And so when I looked at what happened in the markets this week, this was an overreaction. I mean, the market is always right, so let's get this straight: the market is never wrong.
You can never bet against the market; it will always stay solvent longer than you will. However, with that said, they didn't get the demand growth. They didn't get the way in which that demand is outstripping supply. They didn't get how much more we are going to demand as these models get better.
I mean, the moment a model can work for 20 hours, I can tell you, I will be running hundreds of these things, because I've got a lot of work to get through, and I'll be saying, "Hannah, how many models are you running?"
I think I've probably sent that to you already. We just introduced a rule: if you don't max out your Claude usage at least once a month, you lose Claude Max as a tier. You've got to be maxing these things out because they're so powerful.
And I think that once that realization moves into...