Why behind AI: Anthropic progress update
It's been barely 40 days since I published my deep dive on Anthropic, and things have been as eventful as ever.
The company just announced a massive $30B Series G raise (upsized from the $20B target), as well as a $14B ARR run-rate (after finishing the year at $9B).
To call the growth explosive would be an understatement given the pace of enterprise adoption the Anthropic team is seeing. Still, we have to note that the last month was not without significant controversies.
They started the year by removing the option of using their premium subscription plans with third-party agentic coding services that are not built on top of Anthropic’s SDK (i.e. they lose telemetry visibility). The biggest hit was on OpenCode, the most widely adopted CLI open-source agentic tool today, with 2.5M active developers. The controversy took over the X timeline, with OpenAI pivoting quickly on the opportunity to allow Codex to be used easily within OpenCode.
Afterwards, Anthropic got itself into more trouble with, ironically, OpenClaw, which at the time was known as ClawdBot. Just as the personal agentic tool was getting massive visibility across tech and Claude was the most popular model used for it, they started banning accounts on the service and sent a cease-and-desist letter to the lead developer. ClawdBot had to be hastily renamed to MoltBot, followed by the final version OpenClaw within 48 hours. Peter, the guy behind OpenClaw, went on a little tour to San Francisco, which ended with, you guessed it, OpenAI pivoting quickly and bringing him in to lead agentic development efforts.
If this was not enough, we had another controversy, this one related to their federal business. The Pentagon reportedly used Claude during the January 2026 Maduro raid via a partnership with Palantir, despite Anthropic’s policies prohibiting violence facilitation, weapons development, or surveillance. This sparked a clash over Anthropic’s $200 million contract, with the company insisting on guardrails against autonomous weapons and mass domestic surveillance, while officials threatened to label it a “supply chain risk” or cut ties. Defense Secretary Pete Hegseth criticized models that “won’t allow you to fight wars,” triggering a review from the Pentagon on the use of Anthropic models.
These three situations highlight the other side of the coin of being a “missionary” organization. Anthropic’s leadership is opinionated on pretty much everything and does not particularly care about external feedback. Even though developers have been the driving force behind the company’s pivot, leadership does not actually engage with developer culture or viewpoints. The opposition to how Claude models were used by the federal government is also not surprising if you account for the EA (effective altruism) roots of Anthropic vs. the e/acc technical leadership in control of the federal government today.
In this article, I’ll dive deeper into the implications from Dario’s recent interview with Dwarkesh.
Dario Amodei: All the cleverness, all the techniques, all the kind of we need a new method to do something like that doesn't matter very much. There are only a few things that matter.
One is like how much raw compute you have.
The other is the quantity of data that you have.
Then the third is kind of the quality and distribution of data, right? It needs to be a broad, broad distribution of data.
The fourth is, I think, how long you train for.
The fifth is you need an objective function that can scale to the moon.
One of the big discussions in AI research is whether LLMs are actually the right method of scaling towards AGI and how the existing methods help achieve this goal. There are those like Ilya who advocate for “novel research,” i.e. they see the next breakthrough as driven by a clever shortcut that a talented researcher will figure out, given the time and opportunity. On the other side, you have researchers like Dario who have systematized the handful of pillars that have been scaling models over the last eight years, ever since he wrote a whitepaper on them. The core ingredients are compute, data quantity and quality, training duration, self-improvement, scalable objectives, normalization, and conditioning. All of the recent innovation fits within these categories (pre-training, reinforcement learning, and chain-of-thought workflows).
He is basically saying that what they are doing works and there is no real evidence right now that an alternative approach will yield an improved result.
Dario Amodei: We're seeing a pre-training phase and then we're seeing like an RL phase on top of that. And with RL, it's actually just the same. Like, you know, even other companies have published, like, you know, in some of their releases have published things that say, look, you know, we train the model on math contests, you know, AIME or the kind of other things. And, you know, how well the model does is log linear and how long we've trained it. And we see that as well. And it's not just math contests. It's a wide variety of RL tasks. And so we're seeing the same scaling in RL that we saw for pre-training.
If you zoom in on reinforcement learning, what’s becoming clear is that rather than having hit a wall, it appears to be scaling with compute. This means that right now the labs are able to train the models further on specific tasks, which opens up more opportunities for automation, as demonstrated with agentic coding.
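The “log linear” claim above means benchmark performance grows linearly in the logarithm of training compute: every 10x of compute buys a roughly constant increment of score. A minimal sketch of that relationship (the intercept, slope, and compute values are made up for illustration, not from the interview):

```python
import math

# Hypothetical log-linear scaling curve: score = a + b * log10(compute / base).
# All constants are illustrative, chosen only to show the shape of the curve.
a, b = 40.0, 10.0       # made-up intercept and slope
base = 1e20             # made-up reference compute level

for compute in [1e21, 1e22, 1e23, 1e24]:  # training FLOPs, illustrative
    score = a + b * math.log10(compute / base)
    print(f"{compute:.0e} FLOPs -> score {score:.1f}")
```

Each 10x step in compute adds the same fixed amount of score, which is why the curve looks like a straight line when plotted against log-scaled compute, for both pre-training and RL.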
Dario Amodei: I would put the spectrum as 90% of code is written by the model, 100% of code is written by the model. And that’s a big difference in productivity. 90% of the end-to-end SWE tasks, right, including things like compiling, including things like setting up clusters and environments, testing features, writing memos, 90% of the SWE tasks are written by the models. 100% of today’s SWE tasks are written by the models. And even when that happens, it doesn’t mean software engineers are out of a job. Like there’s like new higher level things they can do where they can manage.
On current productivity uplift:
“I say right now the coding models give maybe, I don’t know, a 15, maybe 20% total factor speed up. That’s my view. And six months ago, it was maybe 5%.”
One of the most contentious discussions in developer circles is around “AI will write all of the code” being strongly associated with “software engineering is dead.” Every time Dario mentions his predictions around this, short clips of him saying a percentage followed by “all code will be written by AI” start to circulate.
His argument is that nobody is properly listening to his actual prediction, which is not that agentic coding is taking over all tasks, but that certain tasks are done 100% by the models. One way to think of it is that software engineering is a spectrum, and just because AI has taken over parts of it does not immediately mean that there is no value in having software engineers going forward. The practical productivity boost he is seeing is in the 15%-20% range.
Dario Amodei: On the basic hypothesis of, you know, as you put it, within 10 years, we’ll get to, you know, you know, what I call kind of country of geniuses in a data center. I’m at like 90% on that. And it’s hard to go much higher than 90% because the world is so unpredictable.
“And then I have a hunch, this is more like a 50, 50 thing that it’s going to be more like one to two, maybe more like one to three.”
On what remains uncertain:
“My one little bit, the one little bit of fundamental uncertainty, even on long timescales, is this thing about tasks that aren’t verifiable, like planning a mission to Mars, like, you know, doing some fundamental scientific discovery like CRISPR, like, you know, writing a novel. Hard to verify those tasks.”
The "country of geniuses" line comes from this section of "Machines of Loving Grace":
By powerful AI, I have in mind an AI model—likely similar to today’s LLMs in form, though it might be based on a different architecture, might involve several interacting models, and might be trained differently—with the following properties:
In terms of pure intelligence, it is smarter than a Nobel Prize winner across most relevant fields – biology, programming, math, engineering, writing, etc. This means it can prove unsolved mathematical theorems, write extremely good novels, write difficult codebases from scratch, etc.
In addition to just being a “smart thing you talk to”, it has all the “interfaces” available to a human working virtually, including text, audio, video, mouse and keyboard control, and internet access. It can engage in any actions, communications, or remote operations enabled by this interface, including taking actions on the internet, taking or giving directions to humans, ordering materials, directing experiments, watching videos, making videos, and so on. It does all of these tasks with, again, a skill exceeding that of the most capable humans in the world.
It does not just passively answer questions; instead, it can be given tasks that take hours, days, or weeks to complete, and then goes off and does those tasks autonomously, in the way a smart employee would, asking for clarification as necessary.
It does not have a physical embodiment (other than living on a computer screen), but it can control existing physical tools, robots, or laboratory equipment through a computer; in theory it could even design robots or equipment for itself to use.
The resources used to train the model can be repurposed to run millions of instances of it (this matches projected cluster sizes by ~2027), and the model can absorb information and generate actions at roughly 10x-100x human speed. It may, however, be limited by the response time of the physical world or of software it interacts with.
Each of these million copies can act independently on unrelated tasks, or if needed can all work together in the same way humans would collaborate, perhaps with different subpopulations fine-tuned to be especially good at particular tasks.
We could summarize this as a “country of geniuses in a datacenter”.
Like most of the consensus in San Francisco right now, he indicates that he expects AGI within ten years, but he has a feeling that it can happen within three. What he is not yet convinced about is how AI will handle novel complex tasks that cannot possibly be verified upfront versus existing trainable activities.
Dario Amodei: I really do believe that we could have models that are a country of geniuses, a country of geniuses in the data center in one to two years. One question is how many years after that do the trillions in, you know, do the trillions in revenue start rolling in? I don’t think it’s guaranteed that it’s going to be immediate. You know, I think it could be one year. It could be two years. I could even stretch it to five years, although I’m skeptical of that.
On what diffusion actually looks like:
“Big enterprises, like, you know, big financial companies, big pharmaceutical companies, all of them, they’re adopting Claude Code much faster than enterprises typically adopt new technology, right? But again, it takes time. Like any given feature or any given product, like Claude Code or like Cowork, will get adopted by the individual developers who are on Twitter all the time, by the like Series A startups, many months faster than they will get adopted by like, you know, a large enterprise that does food sales.”
He is hedging his bets between "we have AGI within three to ten years" and "AGI actually moves the needle in the real economy." From his point of view, capability is improving very fast, while adoption lags behind it. Still, enterprise adoption has never been this fast for a new technology on the market. Companies that can navigate the institutional lag of security, compliance, and provisioning of compute are best positioned to win in this new environment.
Dario Amodei: I actually think profitability happens when you underestimated the amount of demand you were going to get and loss happens when you overestimated the amount of demand you were going to get because you’re buying the data centers ahead of time.
If every year we predict exactly what the demand is going to be, we’ll be profitable every year because spending 50% of your compute on research roughly, plus a gross margin that’s higher than 50% and correct demand prediction leads to profit.
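Dario's profitability logic above reduces to simple arithmetic: if ~50% of compute goes to research, serving margin above 50% plus accurate demand prediction yields a profit. A sketch with illustrative numbers (the revenue, margin, and research share are assumptions for the example, not Anthropic's actual figures):

```python
# Illustrative unit economics, following Dario's framing. All numbers are
# hypothetical; the point is the structure, not the values.
revenue = 10.0         # $B, assumed annual inference revenue
gross_margin = 0.55    # assumed serving margin; his claim requires > 50%
research_share = 0.50  # fraction of total compute used for research

gross_profit = revenue * gross_margin          # $B earned from serving
serving_cost = revenue - gross_profit          # $B of compute spent on inference
# If research takes half of all compute, research spend equals serving spend.
research_cost = serving_cost * research_share / (1 - research_share)

operating_profit = gross_profit - research_cost
print(f"gross profit ${gross_profit:.2f}B, research ${research_cost:.2f}B, "
      f"operating profit ${operating_profit:+.2f}B")
```

With research compute equal to serving compute, any serving margin above 50% leaves the gross profit larger than the research bill, which is exactly why he frames losses as a demand-forecasting error rather than a structural problem.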
On the revenue trajectory:
“In 2023, it was like zero to 100 million. 2024, it was 100 million to a billion. 2025, it was a billion to like nine or 10 billion. And then... we added another few billion to revenue in January.”
The challenge of running his business is that if you are conservative on the demand curve, you'll miss out on opportunities, and if you are aggressive, you'll blow up your company. In more detail:
Dario Amodei: If you want to serve long context, you have to like store your entire KV cache... it’s difficult to store all the memory in the GPUs to juggle the memory around.”
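The KV-cache pressure he describes is easy to quantify with the standard formula for transformer inference memory. A rough estimate under hypothetical model dimensions (the layer count, head counts, and context length below are illustrative assumptions, not Claude's actual architecture):

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Memory for one request's KV cache.

    Each layer stores a key and a value vector per token:
    2 (K and V) * n_kv_heads * head_dim values, at e.g. fp16 (2 bytes each).
    """
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value

# Hypothetical frontier-scale config: 80 layers, 8 KV heads (grouped-query
# attention), head_dim 128, serving a 1M-token context at fp16.
gib = kv_cache_bytes(80, 8, 128, 1_000_000) / 2**30
print(f"KV cache for one 1M-token request: {gib:.0f} GiB")
```

Even with grouped-query attention shrinking the KV heads, a single long-context request in this sketch needs a few hundred GiB of cache, far more than one accelerator's memory, which is the "juggle the memory around" problem in the quote.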
On the industry-wide buildout:
“If you talk about the industry, like the amount of compute the industry is building this year is probably in the, I don’t know, very low tens of, call it 10, 15 gigawatts. Next year, it goes up by roughly three X a year. So like next year’s 30 or 40 gigawatts and, 2028 might be a hundred, 2029 might be like three, 300 gigawatts. And like each gigawatt costs like, maybe $10, I mean, I’m doing the math in my head, but each gigawatt costs maybe $10 billion, you know, order $10 to $15 billion a year.”
On the risk of buying too much:
“I could buy a trillion dollars, actually, it would be like $5 trillion of compute because it would be a trillion dollar a year for five years, right? I could buy a trillion dollars of compute that starts at the end of 2027. And if my revenue is not a trillion dollars, if it’s even 800 billion, there’s no force on earth. There’s no hedge on earth that could stop me from going bankrupt if I buy that much compute.”
On competitors:
“I kind of get the impression that, you know, some of the other companies have not written down the spreadsheet, that they don’t really understand the risks they’re taking. They’re just kind of doing stuff because it sounds cool.”
Per his estimate, on top of the roughly 10 GW of compute deployed for AI training and inference over the last few years, the industry is building another 10-15 GW this year, with the buildout roughly tripling annually: 30-40 GW next year, around 100 GW in 2028, and around 300 GW in 2029.
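Those buildout figures compound into staggering capex very quickly. A sketch of Dario's back-of-envelope (the 3x growth rate and $/GW range are his rough figures from the quote; the base year and midpoints are my assumptions):

```python
# Dario's back-of-envelope: ~10-15 GW built this year, roughly 3x per year,
# at roughly $10-15B per gigawatt. Midpoints and the 2026 base year are
# assumptions layered on top of his rough figures.
cost_per_gw_billion = 12.5   # midpoint of $10-15B per GW

gw = 12.5  # midpoint of the 10-15 GW built "this year" (assumed 2026)
for year in range(2026, 2030):
    capex = gw * cost_per_gw_billion
    print(f"{year}: ~{gw:.0f} GW built, ~${capex:.0f}B capex")
    gw *= 3  # "goes up by roughly three X a year"
```

Run forward, the sketch lands near his own numbers (30-40 GW next year, ~100 GW in 2028, ~300 GW in 2029), with annual capex crossing the trillion-dollar mark by the end of the decade, which is the context for his bankruptcy scenario below.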
In his view, the game that OpenAI is trying to play of cornering most of that buildout is insanely risky and could blow them up.
Dario Amodei: You do get industries in which there are a small number of players. Not one, but a small number of players. And ordinarily, like the way you get monopolies like Facebook or Meta, I always call them Facebook, but is these kind of network effects. The way you get industries in which there are a small number of players are very high costs of entry, right? So, you know, cloud is like this. I think cloud is a good example of this. You have three, maybe four players within cloud. I think that’s the same for AI, three, maybe four.
Cloud is very undifferentiated. Models are more differentiated than cloud, right? Like, everyone knows Claude is good at different things than GPT is good at, than Gemini is good at. And it’s not just Claude’s good at coding, GPT is good at, you know, math and reasoning, you know. It’s more subtle than that. Like, models are good at different types of coding. Models have different styles.
He thinks the business of training and serving models is, over the long term, more interesting and differentiated than cloud, because models have their own specific strengths and "personalities," which sway decision makers more than "which infra is the best fit for me."
Dario Amodei: Around the beginning of 2025, I said, I think the time has come where you can have non-trivial acceleration of your own research if you’re an AI company by using these models. And, of course, you know, you need an interface. You need a harness to use them. And so I encouraged people internally, you know, I didn’t say this is one thing that, you know, that you have to use. I just said people should experiment with this. And then, you know, this thing, I think it might’ve been originally called Claude CLI, and then the name eventually got changed to Claude Code internally, was the thing that kind of everyone was using.
Just the fact that we ourselves are kind of developing the model and we ourselves know what we most need to use the model, I think it’s kind of creating this feedback loop.
On why coding and not pharma:
“This is the reason why we launched a coding model and, you know, didn’t launch a pharmaceutical company, right? You know, my background’s in biology, but we don’t have any of the resources that are needed to launch a pharmaceutical company.”
In his mind, the reason why Claude Code has been such an outstanding success is because the frontier labs are best positioned to iterate on feedback loops and improve the product vs. wrappers. This does not exactly map to their actual strategy (as Anthropic is also building out custom tooling for finance and pharma use cases which they obviously will not use themselves), but it's a fair point.
Dario Amodei: I actually do think that the API model is more durable than many people think. One way I think about it is if the technology is kind of advancing quickly, if it’s advancing exponentially, what that means is there’s always kind of like a surface area of kind of new use cases that have been developed in the last three months. And any kind of product surface you put in place is always at risk of sort of becoming irrelevant, right?
I kind of actually predict that we are, it’s going to exist alongside other models, but we’re always going to have the API business model because there’s always going to be a need for a thousand different people to try experimenting with the model in a different way.
On future pricing models:
“The model goes to, you know, one of the pharmaceutical companies and it says, oh, you know, this molecule you’re developing, you should take the aromatic ring from that end of the molecule and put it on that end of the molecule. And, you know, if you do that, wonderful things will happen. Like those tokens could be worth, you know, tens of millions of dollars.”
I'll close with the most difficult argument he makes in the context of selling software. In his view, value will continue to accrue at the bottom of the stack (which has been a thesis of Infra Play), and vertical applications will only survive when their performance is tied to leveraging the latest innovation coming from the model layer via API. Trying to completely insulate yourself from model improvements is likely to kill your business, not create a moat.
An adjacent thesis to this is the following:
If you’re building products on top of models, you already know the feeling: the clever feature you shipped in March gets commoditized by a model update in June. The ground moves every quarter and your moat evaporates.
The tradeoff here is between chasing what’s exciting and building what’s durable. The founders who are thriving right now stopped caring about model capabilities and started caring about the things models can’t take away: data moats, workflow capture, integration depth. It’s less fun to talk about at a dinner party. It’s where the actual companies get built.
The people making the sharpest moves in this world are the ones who got excited about plumbing. Not the demo, not the pitch, not the capability. The ugly, boring infrastructure that makes a product sticky independent of which model sits underneath it.
From my point of view, this still backs up the case for cloud infrastructure software (“the plumbing of AI”) as the most durable software business that will remain in the age of AI.


