Why behind AI: The economics of AI
Automating everything that can be verified...in a flash
As AI exploded into the public consciousness, one of the most difficult challenges has been grasping the big picture. My time in the trenches of cloud infrastructure software and the research behind the Infra Play have helped, but the biggest “jumps” in understanding have come from seeing first-principles thinking in action.
In a recent interview with Patrick O’Shaughnessy, Gavin Baker shared a lot of important observations, both as an industry insider and as somebody trying to capture the big picture and get an edge in the context of investing in AI. While I recommend listening to the whole thing, I’ve picked out the key themes that touch on cloud infrastructure software.
Patrick
I would love to talk about how you, like, in the nitty-gritty, process new things that come out in this AI world, because it’s happening so constantly. I’m extremely interested in it, and I find it very hard to keep up, and I have a couple blogs that I go read and friends that I call.

But maybe let’s take Gemini 3 as a recent example. When that comes out, take me into your office. Like, what are you doing? How do you and your team process an update like that? Given how often these things are happening.
Gavin
I think the first thing is you have to use it yourself. And I would just say I’m amazed at how many famous and august investors are reaching really definitive conclusions about AI based on the free tier. The free tier is, like, you’re dealing with a 10-year-old.
And you’re making conclusions about the 10-year-old’s capabilities as an adult, and you could just pay. And I do think actually you do need to pay for the highest tier, whether it’s Gemini Ultra, Super Grok, whatever it is; you have to pay for the $200-per-month tiers, whereas those are, like, a fully fledged 30- or 35-year-old. It’s really hard to extrapolate from an eight- or a 10-year-old to the 35-year-old, and yet, a lot of people are doing that.

The second thing is there was an insider post about OpenAI, and they said to a large degree, “OpenAI runs on Twitter vibes.” And I just think AI happens on X. There have been some really memorable moments.
I’ve written before about the fact that I only use the paid tiers of all model providers (OpenAI, Anthropic, Google and xAI), as well as select tools built on top such as Cursor and Perplexity. For ChatGPT I also pay for their highest tier ($200) because it gives me access to the best deep research in the industry, as well as their Pro model, which is the most powerful AI you can use. While technically both Google and xAI offer scaled versions of their models (called Deep Think and Heavy), feedback from other practitioners and benchmark results haven’t really justified those upgrades as an overall commercial package.
One of the most common conversations I have about AI includes somebody stating that “the models are really bad and don’t work at all,” which after some Socratic questioning will typically yield one of three outcomes:
They used it a year ago and never tried again.
They used the cheapest model and didn’t even understand what a reasoning version is.
They had a very specific and narrow edge case, tried only one model and never tested it again.
Patrick
Crazy. Okay. So, something like Gemini 3 comes out, and the public interpretation was, “Oh, this is interesting. It seems to say something about scaling laws and the pre-training stuff.” What is your frame on the state of general progress in frontier models in general? What are you watching most closely?

Gavin
Yeah. Well, I do think Gemini 3 was very important, because it showed us that scaling laws for pre-training are intact. They stated that unequivocally. And that’s important, because no one on planet Earth knows how or why scaling laws for pre-training work. It’s actually not a law. It’s an empirical observation, and it’s an empirical observation that we’ve measured extremely precisely and that has held for a long time.

But our understanding of scaling laws for pre-training, and maybe this is a little bit controversial with 20% of researchers, but probably not more than that, it’s, like, the ancient British people’s understanding of the sun, or the ancient Egyptians’ understanding of the sun. They could measure it so precisely that the east/west axis of the Great Pyramids is perfectly aligned with the equinoxes, and so is the east/west axis of Stonehenge. Perfect measurement.
They didn’t understand orbital mechanics. They had no idea how or why it rose in the east, set in the west, and moved across the horizon.
Patrick
Because the aliens.

Gavin
Yeah. Or God in a chariot. And so, it’s really important every time we get a confirmation of that. So, Gemini 3 was very important in that way. But I’d say I think there’s been a big misunderstanding, maybe in the public equity investing community, or the broader more generalist community: based on the scaling laws of pre-training, there really should have been no progress in ‘24 and ‘25.

And the reason for that is after xAI figured out how to get 200,000 Hoppers coherent, you had to wait for the next generation of chips, because you really can’t get more than 200,000 Hoppers coherent. And coherent just means you can think of it as each GPU knows what every other GPU is thinking. They are sharing memory. They’re connected. There are scale-up networks and scale-out networks, and they have to be coherent during the pre-training process.
The reason we’ve had all this progress, and maybe we could show, like, the ARC-AGI slide where you had zero to eight over four years, zero to 8% intelligence, and then you went from 8% to 95% in three months when the first reasoning model came out from OpenAI. We have these two new scaling laws of post-training, which is just reinforcement learning with verified rewards. And verified is such an important concept in AI. Like, one of Karpathy’s great things was: with software, anything you can specify you can automate. With AI, anything you can verify you can automate. It’s such an important concept, and, I think, an important distinction.
And then test-time compute. And so, all the progress we’ve had, and we’ve had immense progress since October ‘24 through today, was based entirely on these two new scaling laws. And Gemini 3 was arguably the first test of the scaling law for pre-training since Hopper came out, and it held, and that’s great, because all these scaling laws are multiplicative. So, now we’re going to apply these two new scaling laws, reinforcement learning with verifiable rewards and test-time compute, to much better base models.
There’s a lot of misunderstanding about Gemini 3 that I think is really important. So, the most important thing to conceptualize: everything in AI is a struggle between Google and Nvidia. Google has the TPU, Nvidia has their GPUs. Google only has a TPU, and they use a bunch of other chips for networking; Nvidia has the full stack. Blackwell was delayed. Blackwell was Nvidia’s next generation chip. The first iteration of that was the Blackwell 200. A lot of different SKUs were canceled.
And the reason for that is it was, by far, the most complex product transition we’ve ever gone through in technology. Going from Hopper to Blackwell, first you go from air-cooled to liquid-cooled. The rack goes from weighing, round numbers, 1,000 pounds to 3,000 pounds, and goes from, round numbers, 30 kilowatts, which was 30 American homes, to 130 kilowatts, which is 130 American homes.
I analogize it to imagine if to get a new iPhone, you had to change all the outlets in your house to 220 volt, put in a Tesla power wall, put in a generator, put in solar panels, that’s the power, put in a whole home humidification system, and then reinforce the floor, because the floor can’t handle this.
So, it was a huge product transition. And then just the rack was so dense it was really hard for them to get the heat out. So, Blackwells have only really started to be deployed at scale over the last three or four months.
Now this is a very interesting angle. When a new model is released and we start using it, there is a lot of debate about the gap between how it benchmarks and how it performs in day-to-day usage. While the important part is actual usability, that depends on a lot of factors and is often heavily influenced by the ability of the frontier lab to provide performant inference. Base model performance has trended higher but not exceptionally (which most read as “the models plateauing”), although we’ve seen significant gains in coding and other areas that have benefited from reasoning.
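Karpathy’s framing that Gavin quotes above (with AI, anything you can verify you can automate) is concrete enough to sketch in code. Below is a minimal, purely illustrative Python sketch of a verifiable reward of the kind used in RL post-training, built around the double-entry bookkeeping check Gavin returns to later. The function names and the binary reward are my own illustrative assumptions, not any lab’s actual pipeline.

```python
# Minimal sketch of a "verifiable reward", illustrative only.
# Double-entry bookkeeping is verifiable: debits must equal credits.

def ledger_balances(entries: list[tuple[str, float, float]]) -> bool:
    """Check a proposed ledger: (account, debit, credit) rows must balance."""
    debits = sum(debit for _, debit, _ in entries)
    credits = sum(credit for _, _, credit in entries)
    return abs(debits - credits) < 1e-9  # tolerate float rounding

def reward(model_output: list[tuple[str, float, float]]) -> float:
    """Binary verified reward: 1.0 if the model's ledger balances, else 0.0.
    No human grader is needed; the signal is objective and cheap, which is
    what makes domains like accounting, math and code so RL-friendly."""
    return 1.0 if ledger_balances(model_output) else 0.0

# A model-proposed journal entry: sell $500 of goods for cash.
proposed = [("cash", 500.0, 0.0), ("revenue", 0.0, 500.0)]
print(reward(proposed))  # 1.0, so this behavior gets reinforced
```

The asymmetry is the point: producing the answer is hard, checking it is trivial, so the reward signal scales to millions of training rollouts.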
Patrick
Can you explain why it has been such an important thing that Blackwell was delayed?

Gavin
Because Blackwell was so complicated, and it was so hard for everyone to get these exquisitely complex racks working consistently, had reasoning not come along, there would have been no AI progress from mid-2024 through, essentially, Gemini 3. There would have been none. Everything would have stalled.

And can you imagine what that would have meant to the markets? For sure, we would have lived in a very different environment. So, reasoning bridged this 18-month gap. Reasoning saved AI, because it let AI make progress without Blackwell, or the next generation of TPU, which were necessary for the scaling laws for pre-training to continue.
Google came out with a TPU v6 in 2024, and the TPU v7 in 2025. In semiconductor time, imagine Hopper. It’s, like, a World War II era airplane. And it was, by far, the best World War II era airplane. It’s a P-51 Mustang with a Merlin engine. And two years later in semiconductor time, that’s, like, you’re an F-4 Phantom, okay?
Because Blackwell was such a complicated product and so hard to ramp, Google was training Gemini 3 on ‘24 and ‘25 era TPUs, which are, like, F-4 Phantoms. Blackwell, it’s, like, an F-35. It just took a really long time to get it going.
So, I think Google for sure has this temporary advantage right now from a pre-training perspective. I think it’s also important that they’ve been the lowest cost producer of tokens. And this is really important, because AI is the first time in my career as a tech investor that being the low-cost producer has ever mattered.
Apple is not worth trillions because they’re a low-cost producer of phones. Microsoft is not worth trillions, because they’re a low-cost producer of software. Nvidia is not worth trillions because they’re the low-cost producer of AI accelerators.
It’s never mattered. And this is really important, because what Google has been doing as the low-cost producer is they have been sucking the economic oxygen out of the AI ecosystem, which is an extremely rational strategy for them. And for anyone who is a low-cost producer. Let’s make life really hard for our competitors.
So, what happens now? I think this has pretty profound implications. One, we will see the first models trained on Blackwell in early 2026. I think the first Blackwell model will come from xAI, and the reason for that is just according to Jensen, no one builds data centers faster than Elon. Jensen has said this on the record.
And even once you have the Blackwells, it takes six to nine months to get them performing at the level of Hopper, because Hopper is finely tuned. Everybody knows how to use it. The software is perfect for it. The engineers know all its quirks. Everybody knows how to architect a Hopper data center at this point. And, by the way, when Hopper came out, it took six to 12 months for it to really outperform Ampere, which was the generation before.
So, if you’re Jensen or Nvidia, you need to get as many GPUs deployed in one data center as fast as possible in a coherent cluster, so, you can work out the bugs. And so, this is what xAI, effectively, does for Nvidia, because they build the data centers the fastest, they can deploy Blackwells at scale the fastest, and they can help work with Nvidia to work out the bugs for everyone else.
So, because they’re the fastest, they’ll have the first Blackwell model. We know that scaling laws for pre-training are intact, and this means the Blackwell models are going to be amazing. Blackwell is ... It’s not an F-35 versus an F-4 Phantom, but from my perspective it is a better chip. Maybe it’s, like, an F-35 versus a Rafale. And so, now that we know pre-training scaling laws are holding, we know that these Blackwell models are going to be really good. Based on the raw specs, they should probably be better.
Then something even more important happens. So, the GB200 was really hard to get going. The GB300 is a great chip. It is drop-in compatible in every way with those GB200 racks. Now you’re not going to replace the GB200s.
Patrick
Yeah. No new power laws. Yeah.

Gavin
Yeah. Just any data center that can handle those, you can slot in the GB300s, and now everybody is good at making those racks. You know how to get the heat out. You know how to cool them. You’re going to put those GB300s in, and then the companies that use the GB300s, they’re going to be the low-cost producer of tokens, particularly if you’re vertically integrated. If you’re paying a margin to someone else to make those tokens, you’re probably not going to be.

This has pretty profound implications. I think it has to change Google’s strategic calculus. If you have a decisive cost advantage, and you’re Google, and you have search and all these other businesses, why not run AI at a negative 30% margin? It is, by far, the rational decision: take the economic oxygen out of the environment. You eventually make it hard for your competitors, who need funding, unlike you, to raise the capital they need.
And then on the other side of that, maybe you have an extremely dominant share position. Well, all that calculus changes once Google is no longer the low-cost producer, which I think will be the case. The Blackwells are now being used for training. And then when that model is trained, you start shifting Blackwell clusters over to inference.
And then all these cost calculations and these dynamics change. It’s very interesting. Like, the strategic and economic calculations between the players, I’ve never seen anything like it. Everyone understands their position on the board, what the prize is, what play their opponents are running, and it’s really interesting to watch.
If Google changes its behavior, because it’s going to be really painful for them as a higher-cost producer to run that negative 30% margin, it might start to impact their stock, that has pretty profound implications for the economics of AI. And then when Rubin comes out, the gap is going to expand significantly.
There are two critical themes introduced here:
We have not seen significant jumps in model performance at the base layer; rather, improvements have come through reasoning capabilities (running multiple chains of thought, better scaffolding like Claude Code), because we haven’t really leveraged the pre-training scaling laws properly in the last year. This was due to most training runs being done on old architecture, which will change now that the Blackwell rollout has gained critical mass.
Google has been benefiting from the situation by being able to scale Gemini on their own hardware and deliver an actually improved base model. They kept inference cost quite low for most of that period on purpose, as a way to starve out the competition, who were making massive CapEx investments to deploy Blackwell infrastructure. This structural advantage is going to change.
Before that structural shift occurs, however, Google will do anything it can to monetize the asymmetrical opportunity.
This is a pretty aggressive play. The real-life cost ratio might look a bit different, but in practice it’s a near instant model that performs at frontier level and is significantly cheaper than both GPT 5.2 and Opus 4.5. It’s also a demonstration of the potential gains we might see if the other models start training on modern architecture.
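To make the low-cost-producer dynamic concrete, here is a back-of-the-envelope sketch. The per-million-token prices below are hypothetical placeholders I made up for illustration (real list prices change frequently and the actual ratios differ); the point is how a structural cost advantage translates into per-task pricing power.

```python
# Hypothetical per-million-token prices (input, output) in USD.
# These are illustrative placeholders, NOT real list prices.
PRICES = {
    "frontier_model_a": (15.0, 60.0),
    "frontier_model_b": (5.0, 25.0),
    "low_cost_frontier": (2.0, 12.0),
}

def cost_per_task(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the assumed prices."""
    p_in, p_out = PRICES[model]
    return input_tokens / 1e6 * p_in + output_tokens / 1e6 * p_out

# An agentic task: 50k tokens of context in, 10k tokens of output.
for model in PRICES:
    print(f"{model}: ${cost_per_task(model, 50_000, 10_000):.2f}")
# A producer whose inference stack is structurally cheaper can hold prices
# where competitors lose money: the "economic oxygen" strategy above.
```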
Patrick
If I were to zoom all the way out on this stuff, I find these details unbelievably interesting, and it’s, like, the grandest game that’s ever been played. It’s so crazy and so fun to follow sometimes I forget to zoom out and say, “Well, so what?” Like, okay, so, project this forward three generations past Rubin, or whatever, what is the global human dividend of all this crazy development where we keep making the loss lower on these pre-training scaling models, like, “who cares?”

It’s been a while since I’ve asked this thing something that I wasn’t blown away by the answer for me personally. What are the next couple of things that all this crazy infrastructure war allows us to unlock because they’re so successful?
Gavin
If I were to posit like an event path, I think the Blackwell models are going to be amazing. The dramatic reduction in per-token cost enabled by the GB300, and probably more the MI450 than the MI355, will lead to these models being allowed to think for much longer, which means they’re going to be able to do new things.

I was very impressed, Gemini 3 made me a restaurant reservation. This is the first time it’s done something for me, other than go research something and teach me stuff. If you can make a restaurant reservation, you’re not that far from being able to make a hotel reservation, and an airplane reservation, and order me an Uber, and-
Patrick
All of a sudden, you’ve got an assistant.

Gavin
Yeah. And you can just imagine ... Everybody talks about that, but you can just imagine it’s on your phone. I think that’s pretty near-term, but at some big companies that are very tech-forward, 50%-plus of customer support is already done by AI, and that’s a $400 billion industry. Then what AI is great at is persuasion. That’s sales and customer support. Of the functions of a company, if you think about them, they’re to make stuff, sell stuff, and then support the customers. Right now, maybe in late ‘26, you’re going to be pretty good at two of them.

I do think it’s going to have a big impact on media. I think robotics, like we talked about the last time, are going to finally start to be real. There’s an explosion in exciting robotic startups. I do still think that the main battle’s going to be between Tesla’s Optimus and the Chinese, because it’s easy to make prototypes, it’s hard to mass-produce them.
Then it goes back to that, what Andrej Karpathy said, about AI can automate anything that can be verified. Any function where there’s a right or wrong answer or right or wrong outcome, you can apply reinforcement learning and make the AI really good at that.
Patrick
What are your favorite examples of that so far, or theoretically?

Gavin
I mean, just: does the model balance? They’ll be really good at making models. Do all the books globally reconcile? They’ll be really good at accounting. Double-entry bookkeeping, it has to balance. There’s a verifiable outcome. You got it right or wrong. Support or sales. Did you make the sale or not? That’s just like AlphaGo. Did you win or did you lose? Did the guy convert or not? Did the customer ask for an escalation during customer support or not? The most important functions are important because they can be verified.

I think if all of this starts to happen, and starts to happen in ‘26, there’ll be an ROI on Blackwell and then all this will continue, and then we’ll have Rubin, and then that’ll be another big quantum of spend, Rubin and the MI450 and the TPU v9. Then I do think just the most interesting question is what are the economic returns to artificial superintelligence? Because all of these companies in this great game, they’ve been in a prisoner’s dilemma. They’re terrified that if they slow down ...
Patrick
Gone forever.

Gavin
... and their competitors don’t, it’s an existential risk. Microsoft blinked for like six weeks earlier this year.
I think they would say they regret that. With Blackwell, and for sure with Rubin, the economics are going to dominate the prisoner’s dilemma from a decision-making and spending perspective, just because the numbers are so big. This goes to the ROI on AI question. The ROI on AI has empirically, factually, unambiguously, been positive.

I just always find it strange that there’s any debate about this, because the largest spenders on GPUs are public companies. They report something called audited quarterly financials, and you can use those things to calculate something called a return on invested capital. If you do that calculation, the ROIC of the big public spenders on GPUs is higher than it was before they ramped spending.
You could say, “Well, part of that is OpEx savings.” Well, at some level, that is part of what you expect the ROI to be from AI. Then you say, “Well, a lot of it is actually just applying TPUs, moving the big recommender systems that power the advertising and the recommendation systems from CPUs to GPUs, and you’ve had massive efficiency gains. That’s why all the revenue growth at these companies has accelerated.” It’s like, so what? The ROI has been there.
It is interesting. Every big internet company, the people who are responsible for the revenue are intensely annoyed at the amount of GPUs that are being given to the researchers. It’s a very linear equation. “If you give me more GPUs, I will drive more revenue.” “Give me those GPUs. We’ll have more revenue, more gross profit, and then we can spend money.” It’s this constant fight at every company. One of the factors in the prisoner’s dilemma is everybody has this religious belief that we’re going to get to ASI. At the end of the day, what do they all want? Almost all of them want to live forever. They think that ASI is going to help them with that.
Patrick
Right. Good return.

Gavin
It’s a good return, but we don’t know. If, as humans, we have pushed the boundaries of physics, biology and chemistry, the natural laws that govern the universe, then maybe the economic returns to ASI aren’t that high.
So with the new server architecture significantly expanding the capacity of the token factories, we will train smarter base models and be able to run complex reasoning much faster. One of the biggest critiques of the ChatGPT Pro models is that they take 25+ minutes to return a response and have essentially been available only in the app, rather than via API, i.e. the most valuable distribution channel today.
Technically this changed with 5.2, but it hasn’t exactly been a massive rollout. At a time when developers are asking for a balance between reasoning capability and tokens per second, a massively expensive model with extremely slow response times is not going to see quick adoption.
If we can get the capability at one-third the price and much faster inference, though…
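The latency side of that trade is simple arithmetic. A quick sketch (the decode speeds are assumptions for illustration, not measured figures for any specific model):

```python
# Wall-clock time for a long reasoning trace at different decode speeds.
# Throughput figures are illustrative assumptions, not measurements.
reasoning_tokens = 50_000  # a long chain of thought for a hard task

for label, tokens_per_sec in [("slow pro-tier decode", 30),
                              ("typical API decode", 120),
                              ("next-gen inference", 500)]:
    minutes = reasoning_tokens / tokens_per_sec / 60
    print(f"{label:>22}: {minutes:5.1f} min")
# At ~30 tok/s, a 50k-token trace takes ~28 minutes, roughly the waits
# described above. Cheaper, faster inference makes the same depth of
# reasoning usable interactively and through the API.
```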
Gavin
We’re going to see, I think, something like that in every vertical, and that’s AI being used for the most core function of any company, which is designing the product. Then it will be ... there’s already lots of examples of AI being used to help manufacture the product and distribute it more efficiently, whether it’s optimizing a supply chain, having a vision system watch a production line. A lot of stuff is happening.

The other thing I think is really interesting in this whole ROI part is Fortune 500 companies are always the last to adopt a new technology. They’re conservative. They have lots of regulations, lots of lawyers. Startups are always the first. Let’s think about the cloud, which was the last truly transformative new technology for enterprises. Being able to have all of your compute in the cloud and use SaaS, so it’s always upgraded, it’s always great, et cetera, et cetera, you can get it on every device.
I think that first AWS re:Invent, I think it was in 2013. By 2014, every startup on Planet Earth ran on the cloud. The idea that you would buy your own server and storage box and router was ridiculous. That probably happened even earlier. That had probably already happened before the first re:Invent. The first big Fortune 500 companies started to standardize on it like maybe five years later. You see that with AI. I’m sure you’ve seen this in your startups.
I think one reason VCs are more broadly bullish on AI than public market investors is VCs see very real productivity gains. There are all these charts showing that for a given level of revenue, a company today has significantly fewer employees than a company of two years ago. The reason is AI is doing a lot of the sales, the support, and helping to make the product.
I mean, ICONIQ has some charts, a16z ... by the way, David George is a good friend, great guy ... he has this modelbusters thing. There’s very clear data that this is happening, so people who have a lens into the world of ventures see this. I do think it was very important in the third quarter. This is the first quarter where we had Fortune 500 companies outside of the tech industry give specific quantitative examples of AI-driven uplift. C.H. Robinson went up something like 20% on earnings. Should I tell people what C.H. Robinson does?
Patrick
Yeah.

Gavin
Let’s just say a truck goes from Chicago to Denver and then the trucker lives in Chicago, so it’s going to go back from Denver to Chicago. There’s an empty load, and C.H. Robinson has all these relationships with these truckers and trucking companies, and they match shippers (demand) with that empty-load supply to make the trucking more efficient. They’re a freight forwarder. There’s actually lots of companies like this, but they’re the biggest and most dominant.

One of the most important things they do is they quote price and availability. Somebody, a customer, calls them up and says, “Hey, I urgently need three 18-wheelers from Chicago to Denver.” In the past, they said it would take them 15 to 45 minutes, and they only quoted 60% of inbound requests. With AI, they’re quoting 100% and doing it in seconds. They printed a great quarter and the stock went up 20%, and it was because of AI-driven productivity that’s impacting the revenue line, the cost line, everything.
I was actually very worried about the idea that we might have this Blackwell ROI air gap because we’re spending so much money on Blackwell. Those Blackwells are being used for training, and there is no ROI on training. Training is you’re making the model. The ROI comes from inference. I was really worried that we’re going to have maybe this three-quarter period where the CapEx is unimaginably high. Those Blackwells are only being used for training.
Patrick
Right, R is staying flat, I’s going up.

Gavin
Yeah. Yeah, exactly, so ROIC goes down, and you could see Meta. Meta, they printed ... because Meta has not been able to make a frontier model ... Meta printed a quarter where ROIC declined, and that was not good for the stock. I was really worried about that. I do think that those data points are important in terms of suggesting that maybe we’ll be able to navigate this potential air gap in ROIC.

Patrick
Yeah. It makes me wonder about, in this market, unlike everybody else, it’s the 10 companies at the top that are all the market cap, more than all of the attention. There’s 490 other companies in the S&P 500. You studied those too. What do you think about that group? What is interesting to you about the group that now nobody seems to talk about and no one really seems to care about, because they haven’t driven returns and they’re a smaller percent of the overall index?

Gavin
I think that people are going to start to care if you have more and more companies print these C.H. Robinson-like quarters. I think the companies that have historically been really well run, the reason they have a long track record of success, you cannot succeed without using technology well. If you have an internal culture of experimentation and innovation, I think you will do well with AI. I would bet on the best investment banks to be earlier and better adopters of AI than maybe some of the trailing banks. Just sometimes past is prologue, and I think it’s likely to be in this case.

One strong opinion I have: all these VCs are setting up these holding companies saying we’re going to use AI to make traditional businesses better. They’re really smart VCs with great track records, but that’s what private equity’s been doing for 50 years. You’re just not going to beat private equity at their game.
Patrick
This is what Vista did in the early days, right?

Gavin
Yeah. I do think ... actually, private equity’s maybe had a little bit of a tough run. Just multiples have gone up. Now private assets are more expensive. The cost of financing has gone up. It’s tough to take a company public, because the public valuation is 30% lower than the private valuation, so PE’s had a tough run. I actually think these private equity firms are going to be pretty good at systematically applying AI.
This is very directionally correct. One of the most amusing/concerning observations from my day-to-day work with large companies trying to use AI is that the average level of competence and vision is…low.
Big companies are by and large significantly behind in meaningful adoption at scale. The situations where this looks a bit different are when they’re working closely with either a dedicated technical team from a hyperscaler or when a strong ISV’s regional business unit is putting significant talent on the ground to drive adoption. This is why consumption has become the most important structural change in the industry’s business model (tying financial incentives on both sides to material adoption).
Gavin
Complexity, and keeping the GPUs running at a high utilization rate in a big cluster, it’s actually really hard. There are wild variations in how well companies run GPUs. If the most anybody ... because the laws of physics, maybe you can get two or three hundred thousand Blackwells coherent. We’ll see. If you have 30% uptime on that cluster and you’re competing with somebody who has 90% uptime, you’re not even competing. One, there’s a huge spectrum in how well people run GPUs. Two, then I think there is ... these AI researchers, they like to talk about taste. I find it very funny. “Why do you make so much money?” “I have very good taste.” What taste means is you have a good intuitive sense for the experiments to perform.

This is why you pay people a lot of money, because it actually turns out that as these models get bigger, you can no longer run an experiment on a thousand-GPU cluster and replicate it on 100,000 GPUs. You need to run that experiment on 50,000 GPUs and maybe it takes days, so there’s a very high opportunity cost. You have to have a really good team that can make the right decisions about which experiments to run on this, and then you need to do all the reinforcement learning during post-training well and the test-time compute well. It’s really hard to do and everybody thinks it’s easy, but all those things.
I used to have this saying like ... and I was retelling this long ago ... pick any vertical in America. If you can just run 1,000 stores in 50 states and have them clean, well-lit, stocked with relevant goods at good prices, and staffed by friendly employees who are not stealing from you, you’re going to be a $20 billion company, a $30 billion company. Like 15 companies have been able to do that. It’s really hard, and it’s the same thing. Doing all of these things well is really hard.
Then reasoning, with this flywheel, this is beginning to create barriers to entry. What’s even more important, every one of those labs ... xAI, Gemini, OpenAI and Anthropic ... they have a more advanced checkpoint internally of the model. Checkpoint is just you’re continuously working on these models and then you release a checkpoint, and then the reason these models get fast ...
Patrick
Yeah, the one they’re using internally is further along.

Gavin
Is better, and they’re using that model to train the next model. If you do not have that latest checkpoint, it’s getting really hard to catch up. Chinese open source-

Patrick
Talk about that.Gavin
Is a gift from God to Meta, because you can use Chinese open source, that can be your checkpoint, and you can use that as a way to kind of bootstrap this. And that’s what I’m sure they’re trying to do, as is everybody else. The big problem and the big giant swing factor, I think China’s made a terrible mistake with this rare earth thing. So China, because they have Huawei Ascend, and it’s a decent chip versus something like the deprecated Hoppers they’re based on, it looks okay.

And so they’re trying to force Chinese open source to use their Chinese chips, their domestically designed chips. The problem is Blackwell’s going to come out now, and the gap between these American frontier labs and Chinese open source is going to blow out because of Blackwell.
And actually DeepSeek in their most recent technical paper V3.2 said, “One of the reasons we struggled to compete with the American frontier labs is we don’t have enough compute.” That was their very politically correct, still a little bit risky way of saying, because China said, “We don’t want the Blackwells.” And they’re saying-
Patrick
“Would you please give us some Blackwells?”

Gavin
“That might be a big mistake.” So if you just kind of play this out, these four American labs are going to start to widen their gap versus Chinese open source, which then makes it harder for anyone else to catch up, because that gap is growing, so you can’t use Chinese open source to bootstrap. And then geopolitically, China thought they had the leverage. They’re going to realize, “Oh, whoopsie-daisy, we do need the Blackwells.” And unfortunately, the problem for them, they’ll probably realize that in late ‘26. And at that point, there’s an enormous effort underway; there’s all sorts of really cool DARPA and DOD programs to incentivize really clever technological solutions for rare earths.

And then there’s a lot of rare earth deposits in countries that are very friendly to America that don’t mind actually refining it in the traditional way. So I think rare earths are going to be solved way faster than anyone thinks. They’re obviously not that rare. They’re just misnamed. They’re rare because they’re really messy to refine. And so geopolitically, I actually think Blackwell’s pretty significant and it’s going to give America a lot of leverage as this gap widens.
And then in the context of all of that, going back to the dynamics between these companies, xAI will be out with the first Blackwell model, and then they’ll be the first ones probably using Blackwell for inference at scale. And I think that’s an important moment for them. And by the way, it is funny, if you go on OpenRouter, you can just look. They have dominant share. Now OpenRouter is whatever it is, it’s 1% of API tokens, but it’s an indication. They processed 1.35 trillion tokens. Google did like 800 or 900 billion.
This is like whatever it is last seven days or last month. Anthropic was at 700 billion. xAI is doing really, really well. And the model is fantastic. I highly recommend it. But you’ll see xAI come out with this. OpenAI will come out faster. OpenAI’s issue that they’re trying to solve with Stargate is because they pay a margin to people for compute. And maybe the people who run their compute are not the best at running GPUs. They’re a high cost producer of tokens and I think this kind of explains a lot of their-
Patrick
Code red recently.

Gavin
Yeah. Well, the $1.4 trillion in spending commitments. And I think that was just like, hey, they know they’re going to need to raise a lot of money, particularly if Google keeps its current strategy of sucking the economic oxygen out of the room. And you go from 1.4 trillion rough vibes to code red pretty fast. And the reason they have code red is because of all of these dynamics. So then they’ll come out with a model, but they will not have fixed their per-token cost disadvantage yet relative to both xAI and Google and Anthropic at that point.

Anthropic is a good company. They’re burning dramatically less cash than OpenAI and growing faster. So I think you have to give Anthropic a lot of credit. And a lot of that is their relationship with Google and Amazon for the TPUs and the Trainiums. And so Anthropic has been able to benefit from the same dynamics that Google has. I think it’s very indicative in this great game of chess. You can look at Dario and Jensen, who have maybe traded a few public comments between them.
Patrick
Jousting.

Gavin
A little bit of jousting. Well, Anthropic just signed the $5 billion deal with Nvidia. That is because Dario is a smart man and he understands these dynamics about Blackwell and Rubin relative to TPU. So Nvidia now goes from having two fighters, xAI and OpenAI, to three fighters. That helps in this Nvidia versus Google battle. And then if Meta can catch up, that’s really important. I am sure Nvidia is doing whatever they can to help Meta. “You’re running those GPUs this way, maybe we should twist the screw this way or turn the dial that way.” And then, if Blackwell comes back to China, which it seems like will probably happen, that will also be very good, because then Chinese open source will be back.
A lot to unpack here.
Poor/inefficient management of compute resources is a rather obvious risk across all frontier labs. It’s been very unusual for AI researchers to have access to this level of compute, and with the recent expansion of capacity across all big players, most teams are essentially sitting on more computing power than they have the skills to manage. Anthropic was the main lab to keep its team together over the last year, which has likely had an outsized effect on domain knowledge around cluster scaling. xAI is a dark horse in the race due to a hand-picked small team of obsessive engineers who will figure it out (with a forceful push from Elon).
While the Chinese frontier labs continue to introduce interesting ideas into the mainstream, it’s clear that no single frontier model has been produced that competes head-to-head with the big 3 across a variety of use cases. This gap is more likely to widen over the next year.
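Gavin’s earlier point about utilization is worth quantifying, because it compounds with everything else here. A trivial sketch (cluster size and per-GPU throughput are made-up illustrative numbers):

```python
# Effective training compute = GPUs x per-GPU throughput x utilization.
# All figures below are illustrative, not any lab's actual numbers.
gpus = 100_000
flops_per_gpu = 2e15  # assumed sustained throughput per GPU (FLOP/s)

for lab, utilization in [("well-run cluster", 0.90), ("poorly run cluster", 0.30)]:
    effective = gpus * flops_per_gpu * utilization
    print(f"{lab}: {effective:.2e} effective FLOP/s")
# Same hardware budget, a 3x gap in usable compute, before even counting
# researcher "taste" in choosing which expensive experiments to run.
```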
Patrick
Yeah, let’s talk about it. What do you think is going to happen?

Gavin
Well, I think that application SaaS companies are making the exact same mistake that brick and mortar retailers did with e-commerce. So brick and mortar retailers, particularly after the telecom bubble crashed, they looked at Amazon and they said, “Oh, it’s losing money. E-commerce is going to be a low margin business. How can it ever be more efficient as a business? Right now, our customers pay to transport themselves to the store and then they pay to transport the goods home. How can it ever be more efficient if we’re sending shipments out to individual customers?” Amazon’s vision, of course, “Well, eventually we’re just going to go down a street and drop off a package at every house.” And so they did not invest in e-commerce. They clearly saw customer demand for it, but they did not like the margin structure of e-commerce.
And that is the fundamental reason that essentially every brick and mortar retailer was really slow to invest in e-commerce. And now here we are and Amazon has higher margins in their North American retail business than a lot of retailers that are mass market retailers.
So margins can change. And if there’s a fundamentally transformative new technology that customers are demanding, it’s always a mistake not to embrace it. And that’s exactly what the SaaS companies are doing. They have their 70, 80, 90% gross margins and they are reluctant to accept AI gross margins. The very nature of software is you write it once, and it’s written very efficiently, and then you can distribute it broadly at very low cost. And that’s why it was a great business. AI is the exact opposite, where you have to recompute the answer every time.
And so a good AI company might have gross margins of 40%. The crazy thing is because of those efficiency gains, they’re generating cash way earlier than SaaS companies did historically. But they’re generating cash earlier, not because they have high gross margins, but because they have very few human employees. And it’s just tragic to watch all of these companies. You want to have an agent, it’s never going to succeed if you’re not willing to run it at a sub-35% gross margin because that’s what the AI natives are running it at. Maybe they’re running it at 40.
So if you are trying to preserve an 80% gross margin structure, you are guaranteeing that you will not succeed in AI. Absolute guarantee. And this is so crazy to me because one, we have an existence proof for software investors being willing to tolerate gross margin pressure as long as gross profit dollars are okay. And it’s called the cloud. People don’t remember. But when Adobe converted from on-premise to a SaaS model, not only did their margins implode, their actual revenues imploded too, because you went from charging upfront to charging over a period of years.
Microsoft, it was less dramatic, but Microsoft was a tough stock in the early days of the cloud transition because investors were like, “Oh, my God, you’re an 80% gross margin business.” And the cloud is in the 50s, and they’re like, “Well, it’s going to be gross profit dollar accretive and probably will improve those margins over time.” Microsoft, they bought GitHub and they use GitHub as a distribution channel for Copilot for coding. And that’s become a giant business, a giant business. Now for sure, it runs at much lower gross margins, but there are so many SaaS companies.
I can’t think of a single application SaaS company that could not be running a successful agent strategy. And they have a giant advantage over these AI natives in that they have a cash-generative business. And I think there is room for someone to be a new kind of activist or constructivist and just go to SaaS companies and say, “Stop being so dumb.” All you have to do is say, “Here are my AI revenues and here are my AI gross margins, and you know it’s real AI because it’s low gross margins. I’m going to show you that.
And here’s a venture competitor over here that’s losing a lot of money. So maybe I’ll actually take my gross margins to zero for a while, but I have this business that the venture-funded company doesn’t have.” And this is just such an obvious playbook that you can run. Salesforce, ServiceNow, HubSpot, GitLab, Atlassian, all of them could run this.
Patrick
And the way that those companies could or should think about the way to use agents is just to ask the question, “Okay, what are the core functions we do for the customer now? How can we further automate that with agents effectively?”

Gavin
If you’re in CRM, well, what do our customers do? They talk to their customers. We’re customer relationship management software and we do some customer support too. So make an agent that can do that and sell that at 10 to 20% and let that agent access all the data you have. Because what’s happening right now, another agent made by someone else is accessing your systems-

Patrick
To do this job.Gavin
... pulling the data into their system, and then you’ll eventually be turned off. And it’s just crazy and it’s just because, “Oh, wow, but we want to preserve our 80% gross margins.” This is a life or death decision, and essentially, everyone except Microsoft is failing it. To quote that memo from that Nokia guy long ago, “Their platforms are burning.”

Patrick
Burning platform. Yeah.

Gavin
Yeah. There’s a really nice platform right over there and you can just hop to it and then you can put out the fire in your platform that’s on fire, and now you got two platforms and it’s great.
We will close with the key point here: the business model of AI is fundamentally opposite to SaaS. Growth is much faster and much higher, driven by token usage and its commercial realization as revenue in a consumption model. The trade-off is lower margins and the need for repeated investment to stay competitive. This is why we’ve seen Agentforce and other initiatives stumble to a certain extent: legacy incumbents try to run AI functionality as cheaply as possible, while AI-native startups offer best-in-class experiences, even if at a premium.
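A simplified worked example of that margin inversion, using the rough figures from the conversation (roughly 80% SaaS gross margins versus roughly 40% for AI-native products; the revenue numbers are illustrative):

```python
# SaaS vs. AI-native unit economics: margin percentage vs. profit dollars.
# Margins echo the figures discussed above; revenues are illustrative.
def gross_profit(revenue: float, gross_margin: float) -> float:
    return revenue * gross_margin

saas_revenue, saas_margin = 100.0, 0.80  # write once, distribute cheaply
ai_revenue, ai_margin = 300.0, 0.40      # consumption-driven growth, but
                                         # every answer is recomputed (COGS)

print(f"SaaS gross profit: ${gross_profit(saas_revenue, saas_margin):.0f}")  # $80
print(f"AI gross profit:   ${gross_profit(ai_revenue, ai_margin):.0f}")      # $120
# The margin percentage halves, but gross profit dollars can still grow,
# the same trade investors eventually accepted in the cloud transition.
```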
It’s more important than ever to look at companies from many different angles, including how their strategy is adapting to the fast-paced dynamics of AI today.