The AI bill that nobody budgeted for

by Luís Rijo
Luís Rijo has spent over a decade watching money move through digital markets - where it goes, who follows it, and who gets left behind when the flow changes direction. luis@thespend.net
June 6, 2026

Enterprise AI token bills are spiralling out of control as agentic usage explodes.

A company forgot to set a usage limit. One month later, it had run up a $500 million bill on Claude. Not $500,000. Not $5 million. Half a billion dollars, in a single month, on one AI tool.

That story - reported by a consultant to Axios and circulated widely in recent weeks - is the extreme end of a pattern playing out in finance departments from San Francisco to London. Uber burned through its entire 2026 AI coding budget by April. Microsoft revoked hundreds of developers' access to Claude Code, a tool it had only enabled months before. A Priceline employee reported that a routine Cursor contract renewal came back four to five times more expensive than expected.

The price of AI tokens - the small units of text that large language models process and bill for - has fallen by roughly 98% since late 2022. Yet enterprise AI bills, according to The Next Web, have risen by an estimated 320% over the same period. Cheaper per unit. Catastrophically more expensive overall. And almost nobody saw it coming.
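The arithmetic behind that paradox can be checked in a few lines. What follows is a minimal sketch in Python, assuming the two quoted figures cover the same period and that a bill is simply price multiplied by volume. The function name is illustrative and does not come from any of the reports cited.

```python
# Figures quoted in the article. Treating them as covering the same period
# and the same mix of workloads is an assumption made for this sketch.
PRICE_DROP = 0.98      # per-token price fell by roughly 98%
BILL_INCREASE = 3.20   # enterprise bills rose by an estimated 320%


def implied_consumption_multiple(price_drop: float, bill_increase: float) -> float:
    """Return how many times token volume must have grown.

    bill = price * volume, so volume_multiple = bill_multiple / price_multiple.
    """
    price_multiple = 1.0 - price_drop      # the new price is 0.02x the old one
    bill_multiple = 1.0 + bill_increase    # the new bill is 4.2x the old one
    return bill_multiple / price_multiple


multiple = implied_consumption_multiple(PRICE_DROP, BILL_INCREASE)
print(f"Implied growth in token volume: about {multiple:.0f}x")
```

On those assumptions, token volume would have had to grow roughly 210-fold for bills to rise 320% while prices fell 98%.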

On Fox Business this week, iCapital chief investment strategist Sonali Basak put it plainly: token costs are rising "astronomically", driven by the surge in agentic AI, and a buyer strike is coming.

The background

For most of the history of software, companies paid for tools the way they pay for office chairs. A set number of seats, a fixed monthly bill, a predictable line on the spreadsheet. Microsoft Office: $10 per user per month. Salesforce: $25 per user. The IT budget process made sense because the costs were stable.

AI broke that model, but slowly enough that most companies did not notice until recently.

The first wave of AI tools played along with the seat-based game. Flat subscriptions, bundled usage, fixed monthly costs. This was not sustainable economics - it was market-share buying. AI labs, flush with venture capital, effectively subsidised consumption to win adoption. The deal seemed good: unlimited AI, flat price, monthly invoice. By 2026, 85% of SaaS providers had shifted to hybrid or consumption-based pricing, according to ICONIQ's State of Go-to-Market survey. The free lunch had ended. Most companies just had not checked their tab yet.

Tokens are the basic unit of text that a large language model reads and writes - roughly comparable to chunks of words or characters - and the unit that vendors bill by. Every prompt typed into ChatGPT or Claude, and every response that comes back, costs tokens. A simple question costs a handful. A complex research task costs thousands. The key thing to understand is that a task does not consume a fixed number of tokens. Depending on what you ask the AI to do, and how many steps it takes to do it, the same prompt can cost ten times more or ten times less in token terms.

For most of 2023 and 2024, this did not matter much. People used AI like a search engine: one question, one answer. Token consumption was small, predictable, manageable. Then came the agents.

Agentic AI refers to systems that do not just answer questions but take multi-step actions: writing code, running tests, updating files, calling APIs, looping back to check results, and doing it all over again until the task is done. Think of the difference between asking a colleague a question and hiring them to independently complete an entire project. Both involve work. Only one of them involves hours and hours of sustained activity - every minute of which, in AI terms, costs money.

What is actually happening

The shift from chatbots to agents is the central economic event in AI right now, and its consequences are only beginning to land on corporate balance sheets.

According to data cited by Derek Thompson in his analysis of the AI cost panic, the typical agent job uses 96,000 tokens before generating an answer - more text than the entire novel The Great Gatsby. A simple linear AI interaction in 2023 cost roughly $0.04. An orchestrated agentic system in 2026 costs approximately $1.20 - about 30 times more. Per-developer consumption at enterprise companies has risen roughly 18.6 times in nine months, according to Jellyfish research head Nicholas Arcolano. The average business is now spending 13 times more on AI tokens than in January 2025, according to Ramp lead economist Ara Kharazian.

At Uber, by March 2026, 84% of developers were classified as agentic coding users, meaning they had delegated entire workflows to AI rather than simply using it for autocomplete suggestions. That is not a company experimenting with AI. That is a company that has rebuilt its engineering culture around an AI tool - and then discovered that the economics of that tool do not scale the way traditional software does.

On the revenue side of this equation, the numbers are stunning. Anthropic's annualised revenue run rate has climbed from roughly $9 billion at the end of 2025 to $30 billion, with more than 1,000 enterprise customers paying over $1 million a year. In the Fox Business segment, Basak noted the core dynamic: token model revenues "are through the roof" but the question is whether growth can hold "when you have companies saying we have to slow down."

Anthropic felt the pressure on both ends. Rising demand strained its computing infrastructure - API uptime over the 90 days ending April 2026 fell to 98.95%, below the 99.99% benchmark that enterprise customers typically expect. The response was a structural pricing overhaul. In November 2025, Anthropic began shifting enterprise customers away from flat $200-per-seat deals toward a model that charges a $20 base fee per user per month, with all usage billed at standard API rates on top. By February 2026, all new enterprise customers were on the usage-based plan. By March, the legacy flat-fee option was no longer available.
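For a customer, the practical question is where the new plan becomes more expensive than the old one. The sketch below compares the two structures described above. It assumes the $200 flat fee was charged per seat per month, and the function names are hypothetical rather than anything from Anthropic's billing system.

```python
FLAT_FEE_PER_SEAT = 200.0   # legacy flat plan, USD per seat (assumed to be monthly)
BASE_FEE_PER_SEAT = 20.0    # new plan base fee, USD per seat per month


def flat_plan_bill(seats: int) -> float:
    """Monthly bill under the legacy flat plan: usage does not matter."""
    return seats * FLAT_FEE_PER_SEAT


def usage_plan_bill(seats: int, api_usage_usd_per_seat: float) -> float:
    """Monthly bill under the new plan: base fee plus metered API usage."""
    return seats * (BASE_FEE_PER_SEAT + api_usage_usd_per_seat)


# Usage per seat at which both plans cost the same.
break_even_usage = FLAT_FEE_PER_SEAT - BASE_FEE_PER_SEAT

print(f"Break-even API usage per seat: ${break_even_usage:.0f} a month")
for usage in (50.0, 180.0, 1_000.0):
    print(f"${usage:,.0f} usage per seat -> ${usage_plan_bill(1_000, usage):,.0f} for 1,000 seats")
```

Under these assumptions a light user is cheaper on the new plan, and any seat consuming more than $180 of API usage a month costs more than it did under the flat deal.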

GitHub followed the same path. Microsoft's code hosting platform shifted its Copilot AI assistant to usage-based billing from June 1, 2026, replacing flat-rate subscriptions with token-linked charges. One developer reported seeing their projected monthly cost jump from roughly €67 to €966.

The money trail

The token model is not a neutral billing mechanism. It is a deliberate shift in who carries the financial risk - away from AI companies and onto their customers.

Under seat-based pricing, the vendor absorbs the cost of heavy usage. Sell 1,000 seats at $200 each and the revenue is fixed whether people use the AI every hour or never open it. That model worked when the expected usage was moderate and compute costs were falling fast enough to absorb the gaps. When agentic AI arrived and usage multiplied by a factor of ten or twenty, the economics collapsed. The vendor's costs rose faster than the seat fees could cover. Usage-based pricing is the correction.
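A rough sketch of the vendor's side of that trade shows why flat seats stopped working. The $50 baseline compute cost per seat is a hypothetical figure chosen for illustration, not a number reported by any vendor. Only the 1,000 seats at $200 come from the example above.

```python
SEATS = 1_000
SEAT_PRICE = 200.0                  # USD per seat, from the example above
BASELINE_COMPUTE_PER_SEAT = 50.0    # hypothetical vendor compute cost at moderate usage


def vendor_margin(usage_multiple: float) -> float:
    """Revenue minus compute cost when every seat's usage scales by usage_multiple.

    Seat revenue is fixed, while compute cost is assumed to scale linearly with usage.
    """
    revenue = SEATS * SEAT_PRICE
    compute_cost = SEATS * BASELINE_COMPUTE_PER_SEAT * usage_multiple
    return revenue - compute_cost


for multiple in (1, 10, 20):
    print(f"usage x{multiple}: margin ${vendor_margin(multiple):,.0f}")
```

With those illustrative inputs the vendor earns $150,000 a month at baseline usage, loses $300,000 at ten times the usage, and loses $800,000 at twenty times.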

For AI labs like Anthropic and OpenAI, this is a genuine revenue opportunity. Goldman Sachs forecasts that agentic AI could drive a 24-fold increase in token consumption by 2030, reaching 120 quadrillion tokens per month globally. Per-token prices will keep falling - Gartner estimates inference costs will drop 90% by 2030 - but the volume increase far outpaces the price decline. Revenue goes up. Substantially.
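Combining the two forecasts gives a sense of scale. The sketch below multiplies them together, which assumes that per-token prices fall in line with Gartner's inference-cost estimate and that both forecasts describe the same market. Neither firm presents the numbers this way.

```python
VOLUME_MULTIPLE = 24.0   # Goldman Sachs forecast: 24-fold rise in token consumption by 2030
PRICE_DROP = 0.90        # Gartner estimate: inference costs drop 90% by 2030
TOKENS_2030 = 120e15     # 120 quadrillion tokens per month


def spend_multiple(volume_multiple: float, price_drop: float) -> float:
    """Total spend multiple when volume grows and unit price falls."""
    return volume_multiple * (1.0 - price_drop)


def implied_baseline_tokens(tokens_2030: float, volume_multiple: float) -> float:
    """Monthly token volume today that the 2030 forecast implies."""
    return tokens_2030 / volume_multiple


print(f"Spend multiple by 2030: about {spend_multiple(VOLUME_MULTIPLE, PRICE_DROP):.1f}x")
print(f"Implied volume today: {implied_baseline_tokens(TOKENS_2030, VOLUME_MULTIPLE):.1e} tokens a month")
```

On those assumptions, total spending on tokens would rise about 2.4 times even with unit prices down 90%, from an implied base of roughly 5 quadrillion tokens a month.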

For enterprises, the situation is structurally different. Token costs are variable, non-linear, and nearly impossible to forecast using conventional budget tools. Tokens do not scale like seats. When a company adds a 1,000th employee to a seat licence, the marginal cost is clear and fixed. When it deploys an AI agent into a complex workflow, the token cost depends on how many steps the agent takes, how long the context window is, how often it needs to retry, and dozens of other variables that nobody has modelled before.
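A toy model makes the non-linearity concrete. The sketch below assumes an agent re-sends its whole accumulated context on every step and that no caching discount applies. The class, the function names and the per-token rates are all hypothetical, chosen only to show the shape of the curve.

```python
from dataclasses import dataclass


@dataclass
class AgentRun:
    steps: int                  # model calls the agent makes for one task
    base_context: int           # tokens in the initial prompt and instructions
    tokens_added_per_step: int  # output and tool results appended after each step
    output_per_step: int        # tokens the model generates at each step
    retry_rate: float = 0.0     # extra fraction of work repeated because of retries


def estimate_tokens(run: AgentRun) -> tuple[int, int]:
    """Return (input_tokens, output_tokens) for one task.

    Assumes the full accumulated context is re-sent on every step,
    so input tokens grow faster than linearly with the step count.
    """
    input_tokens = sum(
        run.base_context + k * run.tokens_added_per_step for k in range(run.steps)
    )
    output_tokens = run.steps * run.output_per_step
    scale = 1.0 + run.retry_rate
    return round(input_tokens * scale), round(output_tokens * scale)


def estimate_cost(run: AgentRun, usd_per_m_input: float, usd_per_m_output: float) -> float:
    """Price the run at hypothetical per-million-token rates."""
    input_tokens, output_tokens = estimate_tokens(run)
    return (input_tokens * usd_per_m_input + output_tokens * usd_per_m_output) / 1_000_000


# Hypothetical rates: $3 per million input tokens, $15 per million output tokens.
short_run = AgentRun(steps=10, base_context=2_000, tokens_added_per_step=1_000, output_per_step=500)
long_run = AgentRun(steps=20, base_context=2_000, tokens_added_per_step=1_000, output_per_step=500)

for label, run in (("10 steps", short_run), ("20 steps", long_run)):
    print(label, estimate_tokens(run), f"${estimate_cost(run, 3.0, 15.0):.2f}")
```

With these inputs, doubling the number of steps from 10 to 20 roughly triples the cost of the task, from $0.27 to $0.84, because every extra step pays again for everything that came before it.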

Deloitte found that 50% of enterprise finance leaders are now spending 21 to 50% of their entire digital transformation budgets on AI - a figure that would have seemed extraordinary 24 months ago. In 2025, 31% of FinOps practitioners - the specialists who manage cloud spending - were responsible for overseeing AI costs. By 2026, that figure was 98%. An entire discipline of corporate cost management has been retooled in under a year, not because AI prices went up, but because the consumption model made old forecasting methods useless.

The behaviour that emerged from this pricing vacuum has a name: tokenmaxxing - the organisational tendency to route every task through the most capable, most expensive AI model available, with no governance, no routing logic, and no cost visibility. As Basak noted in the Fox Business segment, companies "went crazy" for the AI tools - and "everyone is looking around now like, have you seen the bill for this?"

What people are doing about it

The initial response, at companies large and small, has been to pull back.

Uber's operations chief, Andrew Macdonald, said publicly that token usage did not seem to have a direct correlation with useful consumer features - an unusually frank admission from a company that had become one of the most visible AI adopters in enterprise tech. Microsoft did not wait for a similar reckoning. It revoked Claude Code licences for hundreds of developers, with a migration plan to its internal Copilot CLI tool by June 30.

A market is now forming around the chaos. Factory, an enterprise AI coding startup, this week launched a model router that automatically selects the cheapest adequate AI model for each task - routing simple requests to cheaper models and reserving expensive frontier models only when the task genuinely requires them. Faros AI CEO Vitaly Gordon confirmed that frontier labs are already doing this routing internally, even when customers call a premium model.
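The idea of picking the cheapest adequate model can be sketched in a few lines. This is not Factory's implementation, which the reporting does not describe. The model names, prices and capability scores below are invented, and a real router would need some way of judging how hard a task is.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    usd_per_m_tokens: float   # hypothetical blended price per million tokens
    capability: int           # hypothetical score from 1 (weakest) to 10 (strongest)


# Invented catalogue for illustration only.
CATALOGUE = [
    Model("small", 0.5, 3),
    Model("medium", 3.0, 6),
    Model("frontier", 15.0, 9),
]


def route(required_capability: int, catalogue: list[Model] = CATALOGUE) -> Model:
    """Return the cheapest model whose capability meets the requirement."""
    adequate = [m for m in catalogue if m.capability >= required_capability]
    if not adequate:
        raise ValueError(f"no model meets capability {required_capability}")
    return min(adequate, key=lambda m: m.usd_per_m_tokens)


for need in (2, 5, 9):
    chosen = route(need)
    print(f"task needing capability {need} -> {chosen.name} (${chosen.usd_per_m_tokens}/M tokens)")
```

In this toy catalogue only the hardest task reaches the frontier model. The other two are served at a fraction of its per-token price.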

The Linux Foundation is launching a new body called the Tokenomics Foundation to bring standardised cost measurement to AI spending - an acknowledgement that the industry does not yet have common definitions for what a token actually costs across different models, providers, and use cases.

Basak's framing from today's Fox Business segment captures the market's position precisely: is token consumption going "to about" - a floor - or heading for "a plateau"? The question is whether the buyer strike is a correction or a ceiling. The answer depends almost entirely on whether enterprises can demonstrate, at scale, that the productivity gains from agentic AI actually justify the invoices. So far, that proof has been scarce.

The bottom line

The token model has produced a paradox: AI is getting cheaper per unit at the exact moment that AI bills are getting larger. The reason is that the shift to autonomous, multi-step agents has multiplied consumption far faster than prices have fallen. Enterprises that built budgets around 2024 assumptions are now managing costs that look more like cloud infrastructure than software subscriptions - variable, volatile, and very hard to forecast. The AI labs' revenues are booming right now. The open question is whether enterprise customers, sitting on a wave of bills with limited proof of return, are going to keep buying what the labs are selling.

Timeline

Late 2022 - GPT-4-class AI models launch at roughly $20 per million tokens; flat-rate seat subscriptions dominate enterprise AI pricing
Early 2025 - Token prices begin falling sharply; enterprises gorge on all-you-can-eat flat subscriptions; agentic AI tools begin rolling out
November 2025 - Anthropic begins transitioning enterprise customers from flat $200 seat fees to a $20 seat plus usage-based billing model
Q4 2025 - Q1 2026 - Claude Code weekly active users double between January 1 and mid-February, according to Anthropic engineering data cited by WorkTech
February 2026 - All new Anthropic enterprise customers move to usage-based billing; legacy flat-fee plan withdrawn by March
March 2026 - 84% of Uber's developers classified as agentic coding users; Uber burns through its entire 2026 AI budget by April
April 2026 - Deloitte publishes a CFO guide to AI token economics; Anthropic's annualised revenue hits $30 billion run rate; Anthropic API uptime falls to 98.95%
May 2026 - Per-developer AI token consumption at enterprise companies measured at 18.6 times higher than nine months earlier; Microsoft revokes Claude Code licences; Goldman Sachs forecasts 24-fold token consumption increase by 2030
June 1, 2026 - GitHub shifts Copilot from flat-rate to token-based usage billing; one developer reports projected monthly cost rising from €67 to €966
June 5, 2026 - Linux Foundation announces the Tokenomics Foundation; TechCrunch reports on the industry-wide scramble to manage AI token costs
June 6, 2026 - iCapital chief investment strategist Sonali Basak warns on Fox Business of a coming buyer strike as token costs surge "astronomically"

Summary

Who: Enterprise companies including Uber, Microsoft, and Priceline; AI vendors Anthropic, OpenAI, and GitHub; and financial analysts tracking AI economics including iCapital chief investment strategist Sonali Basak.

What: A structural collision between falling per-token prices and exploding token consumption, driven by the shift from simple AI chatbots to autonomous agentic systems that loop through dozens of steps per task. Enterprise AI bills have risen by an estimated 320% even as per-token prices fell by roughly 98%.

When: The crisis crystallised in mid-2026, with the most visible corporate pullbacks - Uber's budget blowout, Microsoft's licence revocation - occurring between April and June. The underlying pricing shift at Anthropic began in November 2025.

Where: Across enterprise technology organisations globally, with the most visible cases concentrated in the US tech sector.

Why: AI labs need revenue to cover the compute infrastructure required to run frontier models at scale. Usage-based pricing transfers the cost risk from vendors to customers at the exact moment that consumption patterns have become too volatile to predict. The result is a billing crisis that may determine whether enterprise AI adoption continues to scale or hits a ceiling.
