June 13, 2026
Superhuman, Not Smaller
The dominant question in every operating conversation right now is how many roles an agent can absorb. I think the framing is wrong, and wrong in a way that leads operators into a mistake they cannot walk back.
Replacing your team with agents looks like a cost cut on the board deck. It is a dependency swap, and it runs in one direction.
The framing treats the decision as a straightforward cost trade: agents can do the work of N people, so you replace N people and bank the savings. It is clean, it is legible, and it fits in a board deck. But swapping people for agents does not remove a cost. It changes the kind of cost you carry, and the new kind is worse on the two dimensions that matter most to a company’s survival: control and reversibility.
I wrote earlier this year about where defensible value sits when code gets cheap and about why the prompt is the wrong thing to build a company on. This post turns that lens on a decision operators are making right now inside their own companies: whether to replace the team at all, and what the better path actually looks like.
You trade a cost you control for one you don’t
Payroll is a cost you govern. You set compensation, you decide the pace of hiring, you own the relationship, and the number is reasonably predictable quarter to quarter. Token spend is a cost set by a third party whose incentives are not yours. The unit price, the model deprecation schedule, the rate limits, the terms of service, all of it sits on a roadmap you do not write and cannot see more than a quarter into.
The off-switch is the part most people underweight. When a function is fully agent-run and you stop paying, the work stops that day. There is no two weeks’ notice, no knowledge transfer, no degraded-but-functioning fallback while you sort it out. The dependency is binary in a way a human team never is, and binary dependencies on a vendor you don’t control are exactly the kind of exposure operators spend the rest of their time trying to eliminate.
The standard rebuttal is that token costs only fall, so the dependency gets cheaper every quarter. The per-token numbers are real. The Stanford HAI 2025 AI Index documented a 280x cost reduction for GPT-3.5-level performance over roughly two years, from $20 per million tokens in late 2022 to $0.07 by October 2024. Epoch AI found that for models released after January 2024, the median price decline accelerated to around 200x per year in some tiers.
The problem is that per-token price is the wrong unit of analysis for the decision you are actually making. The cost that hits your company is per outcome, not per token, and agentic workloads consume a staggering multiple of tokens per outcome. A 2026 study of agentic coding tasks found they consume on the order of a thousand times more tokens than a single reasoning pass, because every tool call, every retry, and every reasoning trace retransmits a growing context window. Frontier reasoning models compound it further, with OpenAI’s o1 generating roughly eight times the tokens of GPT-4o on identical tests.
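To make the unit mismatch concrete, here is a rough back-of-the-envelope sketch in Python. It multiplies the per-token prices cited above by a token count per outcome. The 2,000-token single-pass task is a hypothetical figure chosen only for illustration, and the 1,000x agentic multiplier is the order of magnitude quoted above, not a measurement of any particular workload. The two numbers come from different sources, so treat the result as directional rather than as a forecast.

```python
# Illustrative arithmetic only. The price figures and the ~1000x multiplier
# are the ones quoted in the post; the workload size is an assumption.

def cost_per_outcome(price_per_million_tokens: float, tokens_per_outcome: float) -> float:
    """Dollar cost of one completed outcome at a given per-token price."""
    return price_per_million_tokens * tokens_per_outcome / 1_000_000

# Prices cited for GPT-3.5-level performance.
OLD_PRICE = 20.00  # dollars per million tokens, late 2022
NEW_PRICE = 0.07   # dollars per million tokens, October 2024

# Hypothetical workload: a single-pass task versus an agentic task that
# consumes roughly a thousand times more tokens for one outcome.
SINGLE_PASS_TOKENS = 2_000  # assumption, for illustration only
AGENTIC_TOKENS = SINGLE_PASS_TOKENS * 1_000

old_cost = cost_per_outcome(OLD_PRICE, SINGLE_PASS_TOKENS)
new_cost = cost_per_outcome(NEW_PRICE, AGENTIC_TOKENS)
ratio = new_cost / old_cost

print(f"single pass at the old price: ${old_cost:.4f} per outcome")
print(f"agentic run at the new price: ${new_cost:.4f} per outcome")
print(f"cost per outcome changed by a factor of {ratio:.1f}")
```

Under these assumptions the per-token price falls by a factor of nearly 300, yet the cost per outcome rises by about 3.5x, because the token multiplier outruns the price decline. Change the multiplier to match your own workload and the direction of the result can flip, which is the point: the quantity worth tracking is cost per outcome.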
You can see the net effect in what companies actually spend. Menlo Ventures tracked enterprise LLM API spend more than doubling in six months, from $3.5 billion to $8.4 billion across late 2024 into mid-2025, with total enterprise AI spend reaching $37 billion in 2025, more than triple the prior year. Per-token prices collapsed across that exact window. Spend went up anyway, because consumption outran price. “Tokens are getting cheaper, so the dependency is cheap” is a true statement about a metric that no longer corresponds to your invoice.
The supply side points the same direction. At TSMC’s June 2026 shareholder meeting, CEO C.C. Wei said the AI chip shortage will lag demand “for years,” called this year’s demand “insane,” and confirmed advanced-node and packaging capacity is sold out through 2027 while 3nm prices rise. The physical capacity that produces tokens is itself rationed and getting more expensive, which is a hard floor under any assumption that inference drifts toward free.
And even if the price did fall predictably at the workload level, you would still be renting a core operating function from a for-profit company optimizing its own margin against your dependency. That is a strategic posture, not a line item, and it does not improve just because this quarter’s unit price ticked down.
Gutting the team is a one-way door
The second problem is the one that should stop the decision cold. Hiring an agent is reversible, because you can turn it off. Firing a team to make room for one is not. The people leave, and the institutional knowledge leaves with them: the undocumented context, the customer relationships, the reasons a process exists that nobody ever wrote down. When the pricing turns or the capability stalls or the vendor changes terms, you cannot rehire your way back to where you were. The specific people are gone, the team chemistry is gone, and rebuilding it costs more than you saved and takes a year you may not have.
In his 2015 Amazon shareholder letter, Jeff Bezos drew a line between two kinds of decisions. Type 1 decisions are “consequential and irreversible or nearly irreversible, one-way doors,” the ones that demand deliberation because if you walk through and don’t like what you see, you can’t get back. Type 2 decisions are “changeable, reversible, two-way doors,” where the cost of being wrong is low because you can reopen the door and walk back through. His warning was that large organizations tend to apply the slow, heavy Type 1 process to reversible Type 2 decisions, and grind themselves into caution.
The replacement decision is the inverse error, and I think it is the more dangerous one. Gutting a team to stand up an agent is a Type 1 decision, a one-way door, being made with the speed and spreadsheet logic of a Type 2 cost optimization. The framing of “how much headcount can we take out” is exactly what disguises the door as reversible.
What winning actually looks like
The better path is close to the opposite of replacement. Point agents at the mundane layer, the work that never needed a person in the first place: the routing, the data entry, the status-chasing, the mechanical first draft. Then let the team you already have operate a level up.
A team augmented this way gets more capable over time rather than smaller. Each person covers more surface area and makes higher-leverage decisions while the agents absorb the floor of low-judgment work. You keep the institutional knowledge and the relationships and you multiply them instead of trading them away. That direction is reversible, it compounds, and it does not hand a competitor your fallback.
This is the same asymmetry I wrote about in The Operator’s Supercycle, applied to your own org chart. The defensible value was always in the relationships, the regulated workflows, the institutional trust. Replacement throws that away to save on the periphery. Augmentation extends it at a fraction of the prior cost. The cheap thing got cheaper, which is precisely the argument for pointing it at the periphery and protecting the part that was never cheap to begin with.
Where I could be wrong
Some roles genuinely should go to zero. A pure-commodity function with no institutional knowledge and no customer relationship attached to it may be correctly automated end to end, and I am not arguing against ever fully automating a role. The argument is against replacement as the default frame, the reflex that reaches for headcount reduction before it asks what the function actually protects.
The dependency argument also weakens if inference truly commoditizes to near-zero and switching providers becomes as trivial as switching electricity suppliers. I don’t think that is the base case in the next few years, because the lock-in does not live in the model call, it lives in the integration depth and the switching cost built around it. But the probability is not zero, and an operator who has kept genuine multi-vendor portability has a stronger hand than this argument assumes.
The diagnostic
Before you replace a function with an agent, two questions settle most of it. If this agent’s pricing tripled or the model was deprecated next quarter, could you bring the function back in-house in under ninety days? And are you removing a cost you control to take on one you don’t, or are you freeing a team you keep to do work they could never get to before?
The first framing shrinks the company and hands a competitor your fallback. The second compounds it. The operators who come out of this cycle ahead will not be the ones who cut the most; they will be the ones who kept the knowledge and the relationships and pointed the cheap, capable layer at everything underneath them. Don’t shrink the team. Make it superhuman.