Freebie AI is coming to an end
In this article, I'm going to break the bad news to you: The all-you-can-eat AI buffet is closing. In the near future, we'll all get to pay by the pound.
If you've been paying attention over the last few months, you might've noticed an oddly familiar SaaS story (some people call it enshittification) play out again in AI: acquire users aggressively, hand out generous bundles, achieve market penetration, then quietly (or loudly) tighten limits, reduce included usage, split plans into “premium tiers,” and introduce usage-based billing where there used to be “included value.”
And no, this isn't me being dramatic for clicks (well, not just that). This is basic math catching up with GPU-heavy products that were priced like old-school software licenses.
The short version: AI economics finally hit the pricing page
Let's start with some inconvenient truths: Inference is expensive.
And that means that pretty much everything else is expensive too:
- Context windows are expensive.
- Agentic loops are expensive.
- Integrations and workflows where AI is involved and usage is high are expensive.
You get the idea.
One "request" (which results in tool calls and a large context window) sent to OpenAI's GPT-5 endpoint can cost a few dollars if you're not careful. Consider an agent that works on a problem independently, making multiple calls to the model: a single session can rack up hundreds of dollars in API costs if it isn't designed with cost control in mind.
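To make that concrete, here's a back-of-envelope sketch of how an agentic session racks up cost. The per-token prices and token counts below are illustrative assumptions for the math, not any vendor's actual rates; always check the current pricing page.

```python
# Back-of-envelope cost model for an agentic session.
# The per-token prices are ILLUSTRATIVE assumptions, not a real price list.
INPUT_PRICE_PER_MTOK = 2.50    # USD per 1M input tokens (assumed)
OUTPUT_PRICE_PER_MTOK = 10.00  # USD per 1M output tokens (assumed)

def call_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost of a single model call in USD."""
    return (input_tokens / 1e6) * INPUT_PRICE_PER_MTOK + \
           (output_tokens / 1e6) * OUTPUT_PRICE_PER_MTOK

def session_cost(calls: int, context_tokens: int, output_tokens: int) -> float:
    """An agent loop re-sends its (growing) context on every call;
    here we approximate with a fixed average context size per call."""
    return calls * call_cost(context_tokens, output_tokens)

# One "request" that fans out into 20 tool-calling iterations,
# each carrying ~100k tokens of context:
print(f"{session_cost(20, 100_000, 2_000):.2f} USD")  # a few dollars
```

Bump the call count into the hundreds (a long-running autonomous agent) and you land squarely in the hundreds-of-dollars range described above.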

This hasn't necessarily been a problem for users so far.
For a while now, vendors have been marketing what are essentially "all-inclusive" plans, like Anthropic's Claude Max, Google's AI Plus/Pro/Ultra, and Microsoft's M365 Copilot (and I guess GitHub Copilot), that include high quotas of "included" usage, or even "unlimited" usage capped only by rate limits.
These monthly plans cost anywhere from 3 to 200 EUR per user per month. And I'd be surprised if the companies offering them WEREN'T losing money on almost all of their users, even on the more expensive plans!
In the early days, investors and enterprises treated that cost as essentially a customer acquisition expense. And that makes sense early on. But once investors start expecting margins and enterprises start demanding predictable SLAs, someone has to pay. Spoiler: it's not the vendor.
That is why we can see the “freebie AI” era ending - and the pace of change has been FAST.
The products are still here. It's just the economics that are changing, and we, as the practitioners, have to adapt to that new reality.
What changed lately (and why it matters)
There have been multiple signals from across the market:
- Included usage is being unbundled or reduced
- "Unlimited" plans are getting stricter practical ceilings
- Rate limiting is becoming more aggressive under load
- Third-party wrappers and unofficial clients are exposing just how fragile “it worked yesterday” can be
- "Premium" features are increasingly behind per-seat add-ons or consumption meters

Some concrete examples:
- Anthropic ejects bundled tokens from enterprise plans (The Register, Apr 16, 2026)
- Powering Frontier Transformation with Copilot and Agents (Microsoft 365 Blog, Mar 9, 2026)
- Google now markets multiple paid Gemini bundles (AI Plus, Pro, Ultra) with tiered "More/Higher/Highest" access, task limits, and feature gates rather than one flat experience (Power your everyday with a Google AI plan, Google One)
None of this is exactly shocking. What is interesting is how quickly it is happening across vendors at roughly the same time.
And in a way, I feel like Anthropic is the forerunner here: Their Claude Max was a massive enabler of agentic workflows and developer experimentation, and their recent forays into the enterprise productivity market (with Claude Cowork) have been generally well received and are certainly threatening incumbents like Microsoft. And their models beat OpenAI's in many developer and productivity scenarios, so they've certainly achieved a level of market penetration.
Anthropic and the end of predictability
One of the clearest examples is the reporting around Anthropic tightening how bundled tokens are handled in enterprise scenarios. The details vary by plan and timeline, but the direction is obvious: more explicit metering, less “quietly included” usage.
If you're building internal tooling that assumes bundled headroom will stay constant, this is where your architecture meeting gets uncomfortable. Features that looked cheap in a pilot can become very expensive at organizational scale. And it's already been the case that subscriptions that are affordable on an individual-user basis (where Anthropic is allowed to use your data for whatever they want) can become unaffordable when you upgrade to a business plan (with data protections) with much more limited included usage.
And yes, this also trickles down to developer tooling ecosystems that route through Claude-compatible backends. If your daily workflow depends on heavy multi-step sessions, this can turn from “smooth” to “quota panic” very quickly.
And these quota "windows" mean that your work day might revolve around bursts of productivity, followed by periods of waiting for limits to reset (or more likely, working on other stuff while waiting for the AI to be available again). This is already happening to some users, and it's a very different experience from the "always on" feel of the freebie era.
See: AI usage limits are causing some workers to restructure their workday (Business Insider, Apr 2026): https://www.businessinsider.com/ai-usage-limits-causing-some-to-restructure-their-workday-2026-4
OpenClaw usage being thrown out of Claude Pro plans feels like the canary in the coal mine
In case you aren't terminally online (or drinking the Anthropic Kool-Aid for some other reason), you might have missed the recent news about Anthropic banning third-party access to Claude Pro subscriptions.

Gemini and "benefits" becoming conditional
I don't use Gemini or Google's AI plans personally, but following the headlines it sounds like they're also a part of the same trend.
Again: this isn't evil. It's product economics. But from a buyer perspective, the result is the same — fewer truly open-ended benefits, more caveats, and more need to understand exactly what your license does and does not include.
Essentially, when organizations budget AI in 2026 using 2025 assumptions, they get surprised twice:
- Once by direct licensing/add-on increases,
- and again by the indirect cost from throttling, retries, failover behavior, and fallback to pricier models.
The first one you can prepare for. The second one will hit you in the face like a wet fish if you haven't designed for it.
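To illustrate that second, indirect cost, here's a tiny expected-value sketch of a retry-then-fallback client. All the rates and prices are made-up assumptions, and it assumes rate-limited attempts aren't billed (only the call that actually succeeds is):

```python
# Expected per-request cost once throttling and model fallback kick in.
# All numbers are made-up assumptions for illustration; rate-limited (429)
# attempts are assumed to cost nothing -- only the successful call is billed.
CHEAP_COST = 0.05      # USD per request on the preferred model (assumed)
PRICY_COST = 0.25      # USD per request on the pricier fallback model (assumed)
THROTTLE_RATE = 0.5    # chance any given attempt is rate-limited under load (assumed)
MAX_RETRIES = 2        # retries on the cheap model before falling back

def expected_cost() -> float:
    """Expected cost of one logical request with retry-then-fallback."""
    # Probability that the initial attempt AND every retry all get throttled:
    p_all_throttled = THROTTLE_RATE ** (1 + MAX_RETRIES)
    # Either a cheap attempt eventually succeeds, or we pay for the fallback.
    return (1 - p_all_throttled) * CHEAP_COST + p_all_throttled * PRICY_COST

print(f"budgeted: {CHEAP_COST:.3f} USD/request, "
      f"expected under load: {expected_cost():.4f} USD/request")
```

In this toy model the effective cost lands roughly 50% above the naive budget, before you even count the latency of waiting out the retries.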
Meanwhile, Microsoft is still the weird outlier
In this cycle, Microsoft currently looks like the outlier. I'm especially talking about M365 Copilot, but to a lesser extent GitHub Copilot too.
Why? Because they are still pushing broad capability availability and bundled value in Microsoft 365 Copilot experiences and rolling with the very generous "Premium Request" model for GitHub Copilot, while all main competitors are visibly tightening the screws. They’re not exactly giving everything away forever (nobody is), but compared with peers, the current commercial posture feels much less consumption-anxious.
Examples:
- Microsoft is still positioning Copilot value as a broad suite play (multi-model access, embedded agents, governance features), not just raw token metering
- In developer tooling discussions, users still call out GitHub Copilot's Premium Request behavior as a key economic constraint to optimize, which highlights why predictability matters even in "bundled" models
And maybe they don't really have any choice, considering how small Copilot adoption still is, and how Anthropic (and to a growing extent, OpenAI) are aggressively targeting the enterprise market with their own offerings. If Microsoft were to start charging for every token (instead of greatly subsidizing usage), they would risk losing market share and adoption momentum to competitors.
They've also been loudly positioning multi-model access and enterprise controls as a value stack rather than pure token-metering theater. If they can keep that balance (big if), they could keep winning mindshare among organizations that need the predictability and control.
To be fair, Microsoft has done plenty of licensing... Well, let's call them gymnastics in other eras, so nobody should assume this posture is permanent. Anything but!

But right now, compared to what we're seeing elsewhere, they look relatively generous. And talking to other practitioners, it sounds like GitHub Copilot's oddly abstract but pretty generous "Premium Request" model, at least, is still more predictable and often quite well-liked compared with the purely consumption-based token metering becoming common elsewhere.
But will Microsoft's approach hold?
That’s the big question. If they can maintain a more generous and predictable model while competitors tighten, they could solidify their position. But if they eventually shift toward more aggressive metering, it could trigger a broader market realignment.
And the stock market is certainly souring on Microsoft's growing AI investments (or "cash burn" as some pundits like to call it), so they may feel pressure to monetize more aggressively sooner rather than later.
What's next?
The next 6-12 months will be very interesting. We can expect to see more vendors follow the trend of unbundling included usage and introducing more explicit metering.
The investors - the stock market, that is - are watching the AI investments closely, and the pressure to actually make inference economics work is only going to increase. So we can expect to see more and more vendors making moves to capitalize on consumption, and more and more users feeling the pinch of those changes - especially on free or "unlimited" plans that are suddenly not so unlimited anymore.
And that Claude 5-hour quota? I bet we'll get a way out of that in the near future - but you better get your credit card ready! 😉
The uncomfortable truth: we all knew this was coming
For two years, the industry behaved like inference was free. That was always temporary. Now the investors are catching up with the keynote promises.
Remember when a $5 Uber ride got you to the other side of town and the driver actually got paid well? While VCs were pouring money into platform markets, customers got highly subsidized rides, at least for a while.
And just like with Uber and Lyft, the initial "freebie" phase was about rapid user acquisition and market penetration. But as the players become more entrenched and the costs become more apparent, the shift to more sustainable pricing models is inevitable.
This doesn't mean AI is "over." It means AI is graduating from "growth-at-all-costs" to more sustainable, grown-up pricing models. It's just unfortunate for the practitioners that this means paying more!
So… is freebie AI dead?
For serious usage, yes.
You'll still see free tiers, promotional credits, and occasional generous bundles. But the era where heavy, production-like usage could hide inside ambiguous "included" value is ending fast.
If you're still betting your roadmap on that era continuing, I'd strongly recommend revisiting those assumptions while the migration is still your choice, not an emergency.
And if your CFO has started asking "why did this assistant suddenly get expensive," congratulations — you've officially entered the adult phase of AI adoption.
Welcome. It has dashboards - and comes with the requirement for accountability.
References
- Anthropic ejects bundled tokens from enterprise plans (The Register, Apr 16, 2026)
- Anthropic tweaks timed usage limits to discourage Claude demand during peak hours (The Register, Mar 26, 2026)
- Anthropic OpenClaw Claude subscription ban (The Verge, Apr 4, 2026)
- Anthropic: No, absolutely not, you may not use third-party harnesses with Claude subs (The Register, Feb 20, 2026)
- Claude Code cache chaos creates quota complaints (The Register, Apr 13, 2026)
- Power your everyday with a Google AI plan (Google One)
- Powering Frontier Transformation with Copilot and Agents (Microsoft 365 Blog, Mar 9, 2026)
- Copilot Cowork: A New Way of Getting Work Done (Microsoft 365 Blog, Mar 9, 2026)
- AI usage limits are causing some workers to restructure their workday (Business Insider, Apr 2026)
- OpenClaude issue #678: Optimize GitHub Copilot “Premium Request” consumption
- OpenClaude issue #43: Insufficient credits