Claude Code & Cost of AI Coding

As everyone is falling in love with Claude Code (or similar tools)…

News this morning was that Anthropic has tested removing Claude Code from the Pro subscription. So far they tested it on 2% of users, which caused an uproar. They issued a clarification for now, but they are also clear about their intentions.

The reality is that the subscription pricing was built for light to medium chatbot use, not for running agents that write code for long periods. And they’ve seen explosive growth in recent weeks (mirroring the trend here, where everyone is suddenly coding).

So we will get our first taste of paying for AI what it actually costs, and will cost long-term. A session with Claude Code can easily save me hours of coding and research. Getting that included in a $20/mo subscription is a lopsided deal any day, even if I worked at minimum wage, and I don’t.

To be honest, it’s totally fair, and necessary. These tools won’t be investor subsidized forever, and the earlier we build actual cost into our expectations and business models, the better for everyone. That includes our clients and their expectations of what AI can do to timelines and budgets.

Maybe for some time they’ll leave it in the Max tier. But honestly, and there has been reporting to that effect, the only fair pricing for Claude Code and any agentic use is by API token meter, the same way AWS works. And we all know how quickly AWS bills can get high.

So enjoy the brief Christmas gift while it lasts. Build those tools quickly.

I’m also glad to hear that an AI corp is charging something close to actual costs, or even aiming for a profit. Yes, the investors will get tired of the CEOs’ schtick.

I’m not currently aware of an AI-based corp that has turned any profit?

Currently, in order for OpenAI to turn a profit, a subscription for every human on earth wouldn’t be sufficient. They would still be in the red. The only way they could get into the black would be if they replaced all humans with agents and captured that revenue.

Pretty sure that subscriptions aren’t the end game.

It could work if the monthly subscription fee for each user on Earth was just high enough.

Golder than gold - not fake money - promise

You just need more tokens, more value, more power…

Well, I think we should mostly consider well run businesses, not personal ego experiments.

I do think the current subscriptions could continue to work at some other price point ($35/mo?) for the everyday chatbot usage that people have enjoyed. From smarter search engine use cases, to light document editing, etc.

It’s the agentic use case that breaks the model. But it’s also those agentic use cases that enable the replacement of workers the CEOs have been dreaming about. The really interesting question is in the end: In the foreseeable future will agentic use cases be cheaper than the current human delivery mechanisms, and is it therefore reasonable or ludicrous to fire the humans and replace them (and you can build in some kind of yet unknown improvement factor for technology optimization)?

Here are some estimates to compare the use cases:

Typical chatbot engagement: 500-1,700 tokens. Even a multi-turn conversation rarely exceeds 20-30K tokens.

An average Claude Code session (reading 5 files, refactoring some code, testing) might come in at 100-200K tokens, an agentic research task somewhere between 100-500K tokens.

Let’s say an average subscriber today has 10 chatbot engagements per day. That might max out at 50K tokens per day.

Let’s say an engineer completes 30 Claude Code sessions in an 8hr day to fix bugs in a Git repo. That could average out at 10M tokens, and has the potential to go way higher.

That’s 200X the usage. Depending on the model used, that is somewhere between $100-$150 per day. Or up to $2K/month, which is 100X your current subscription cost.
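For anyone who wants to play with these numbers, here is a rough Python sketch of the arithmetic above. All inputs are the estimates from this post, not measured data, and the 20 working days per month is my own assumption. The blended rate in dollars per million tokens is simply backed out of the $100-$150/day estimate, not taken from any price list.

```python
# Back-of-envelope comparison of chatbot vs. agentic token usage.
# Inputs are the rough estimates quoted in the post, not measurements.

CHATBOT_TOKENS_PER_DAY = 50_000      # ~10 chatbot engagements per day
AGENT_TOKENS_PER_DAY = 10_000_000    # ~30 Claude Code sessions per day
DAILY_COST_RANGE_USD = (100, 150)    # estimated API cost per day
SUBSCRIPTION_USD = 20                # monthly subscription price from the post
WORKDAYS_PER_MONTH = 20              # assumption, not from the post


def usage_multiple(agent_tokens, chatbot_tokens):
    """How many times more tokens the agentic workload consumes."""
    return agent_tokens / chatbot_tokens


def implied_rate_per_million(daily_cost_usd, tokens_per_day):
    """Blended USD per million tokens implied by a daily cost estimate."""
    return daily_cost_usd / (tokens_per_day / 1_000_000)


def monthly_cost(daily_cost_usd, workdays=WORKDAYS_PER_MONTH):
    """Monthly spend if every working day costs daily_cost_usd."""
    return daily_cost_usd * workdays


multiple = usage_multiple(AGENT_TOKENS_PER_DAY, CHATBOT_TOKENS_PER_DAY)
print(f"Usage multiple: {multiple:.0f}x")

for daily in DAILY_COST_RANGE_USD:
    rate = implied_rate_per_million(daily, AGENT_TOKENS_PER_DAY)
    month = monthly_cost(daily)
    print(f"${daily}/day -> ${rate:.2f} per 1M tokens, ${month:,}/month, "
          f"{month / SUBSCRIPTION_USD:.0f}x the subscription")
```

With 20 working days the monthly figure lands at $2,000-$3,000, so the 100X is the low end of the range. Change `WORKDAYS_PER_MONTH` to match your own schedule.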

Then look at advanced users refactoring large code bases and developing new apps, likely running multiple tasks in parallel. One example estimate for a 3-agent 8hr day of heavy refactoring is in the 100-300M tokens/day range, and can come in at maybe $200-250/day.

The answer to the question above then is: $250 of tokens/day plus the $1,400/day fully loaded cost of a senior dev ($350K/yr) is still cheaper than 3-5 days at $1,400 each of equivalent non-AI productivity. It remains favorable, and a human remains in the driver’s seat with an AI accelerant, although with a smaller overall workforce.
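The same comparison as a small Python sketch, in case you want to plug in your own rates. The $350K/yr, $250/day and 3-5 day figures are the estimates from this post. The 250 working days per year is my assumption, picked because it reproduces the $1,400 day rate.

```python
# Cost of one AI-assisted senior dev day vs. the unassisted equivalent.
# All inputs are rough estimates from the post, not measured data.

SENIOR_DEV_ANNUAL_USD = 350_000   # fully loaded annual cost
WORKDAYS_PER_YEAR = 250           # assumption: gives the $1,400 day rate
TOKENS_PER_DAY_USD = 250          # heavy agentic token spend per day
EQUIVALENT_DAYS = (3, 5)          # unassisted days replaced by one AI day


def day_rate(annual_cost_usd, workdays=WORKDAYS_PER_YEAR):
    """Fully loaded cost of one working day."""
    return annual_cost_usd / workdays


def ai_assisted_day_cost(day_rate_usd, tokens_usd):
    """One day of a human plus the tokens they burn."""
    return day_rate_usd + tokens_usd


def unassisted_cost(day_rate_usd, days):
    """Cost of doing the same work without AI over several days."""
    return day_rate_usd * days


def break_even_speedup(day_rate_usd, tokens_usd):
    """Minimum productivity multiple at which the tokens pay for themselves."""
    return (day_rate_usd + tokens_usd) / day_rate_usd


rate = day_rate(SENIOR_DEV_ANNUAL_USD)
with_ai = ai_assisted_day_cost(rate, TOKENS_PER_DAY_USD)
print(f"Day rate: ${rate:,.0f}, AI-assisted day: ${with_ai:,.0f}")

for days in EQUIVALENT_DAYS:
    without_ai = unassisted_cost(rate, days)
    print(f"{days} unassisted days: ${without_ai:,.0f} "
          f"(saves ${without_ai - with_ai:,.0f})")

print(f"Break-even speedup: {break_even_speedup(rate, TOKENS_PER_DAY_USD):.2f}x")
```

Under these assumptions the tokens pay for themselves once the AI-assisted day delivers a bit under 1.2x the output of a normal day, well below the 3-5x assumed here.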

When you look at the broader white collar worker the math gets different. There you’re replacing $210 in labor with $150 in AI, and the human can become redundant as their org counterparts self-serve. Once you factor in vacations, turnover, training costs, etc. the machine does win.

OK, this was a bit of a detour through numbers. If it starts costing you $150/day to vibe code your tools, will you keep doing it? For those of us who build pipeline tools that pay off long-term in productivity and competitiveness, in many cases yes. Will all of you here pay $150/day to tinker with cool stuff? Probably not.

I think what we as Flame artists have to be prepared for is budgeting $25-$150+ per day in AI costs for the work we’ll be doing. Add to that considerable kit cost or cloud kit (e.g. RunComfy). That is still worth it, but needs to be built into our math rather than just absorbed in our day rate, especially since there is significant variability.

So AI is here to stay. But the free lunches and Donut Fridays (tip to the 90s tech days) are over.

It’ll be even cheaper though. A senior dev won’t be able to command that kind of salary anymore. You could easily cut that salary number in half (or go lower, it just needs to be above an electrician’s, since for most intents and purposes most 45 year old senior devs aren’t going to be starting apprenticeships) and still have people piling up at your interviews due to supply and demand. That’d be across the board for all knowledge work. So that throws a bit of a wrench in the idea of charging more per hour for Flame work because you need a few thousand a year for AI tools. If you shrink the need for bodies in seats, more people are angling for the same chair and salaries inevitably stagnate (best case scenario) or fall.

The most humane (read: mundane) approach would be taking the salary/rate of the worker and subtracting the cost of tokens from it. But even as sad and quasi-dystopian as that sounds, I think that’s a best case scenario, and in reality you’d have the labor market do what it does and just make folks fight for scraps.

At the surface that seems logical. At least for the time being that dynamic will likely be different.

There will definitely be a surplus of devs, including senior devs. Since '22 the tech industry has shed 900,000 jobs, which is a huge reversal of fortunes and a ‘thanks for bringing the cake and don’t let the door hit you on the way out’.

But the devs that have figured out how to use AI properly are in high demand at least for the foreseeable future. Well documented in this story. Out of all the senior devs in that org, only a single one grokked it. They wanted him to teach the others, but he was too valuable for shipping products using his capability, he couldn’t be spared to teach the rest how to fish.

The same will be true for Flame and AI. From conversations I’ve seen and am aware of, right now Flame artists who know how to talk about AI are in high demand. And in fact not just ‘this model over that model’ talk, but the impact on production in the bigger picture (e.g. commercial licenses, managing risk, wrangling customer expectations).

Lastly, it may seem hard to believe that a 45 year old senior dev might become an electrician, and even accept an apprenticeship. That is happening and will absolutely keep happening with a reasonable number of people. This has happened in the past. I know people who have exited tech and learned how to raise cattle and do all sorts of jobs. Sometimes referred to as ABC jobs in tech jargon. A certain percentage of the workforce will adapt and reconfigure in interesting ways. But we also know lots of people who will freeze in place, unable to do so.

The caveat is that those who choose to adapt may also be the ones that still have a seat at the table. It all comes down to the same EQ and CQ bands.

And there is a backstop to all or most of that. If this doesn’t play out, mainly because change in AI is more rapid than people and the economy can absorb, the AI companies will be slowed down by a flailing economy that cannot afford the services (not the core AI, but all the services offered by companies who are clients of the AI).

I don’t think we disagree about the immediate future. The key phrase you used above is “right now.” I think 5 years from now will be very revealing. Either it’s the corporate board room fantasy and jobs for knowledge workers are drastically fewer, in which case I don’t see how wages don’t stagnate or fall (already the trend if we’re honest, but worse), or as you said, there isn’t the economy to prop this up and it’s just not the large scale transformative moment that’s being anticipated.

Just a semi-random data point… I did some Claude Code work on a matchbox shader and accompanying python script to automate a workflow. Maybe a bit more than a half day’s work with coding breaks while researching and thinking.

Total token usage: 27M tokens, out of which 25.6M were cache read tokens, 476K output tokens.

Total cost if it weren’t in the subscription: $16.96

This does not include some research queries I did in the web browser. That’s just the extensive conversation I had with Claude Code.
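To put that session next to the estimates earlier in the thread, here is a quick Python sketch that backs out the blended price per million tokens. The session figures are the ones above (27M is rounded, so the remainder is approximate). The two estimate ranges are the 30-session day and the 3-agent refactoring day quoted earlier. Nothing here comes from an official price list.

```python
# Implied blended USD per million tokens: measured session vs. thread estimates.

def rate_per_million(cost_usd, tokens):
    """Blended USD per million tokens."""
    return cost_usd / (tokens / 1_000_000)


def rate_range(cost_range_usd, token_range):
    """Lowest and highest implied rate for a (low, high) cost and token range."""
    low = rate_per_million(cost_range_usd[0], token_range[1])
    high = rate_per_million(cost_range_usd[1], token_range[0])
    return low, high


# Measured Claude Code session from this post
TOTAL_TOKENS = 27_000_000        # rounded total
CACHE_READ_TOKENS = 25_600_000
OUTPUT_TOKENS = 476_000
SESSION_COST_USD = 16.96

cache_share = CACHE_READ_TOKENS / TOTAL_TOKENS
other_tokens = TOTAL_TOKENS - CACHE_READ_TOKENS - OUTPUT_TOKENS
measured_rate = rate_per_million(SESSION_COST_USD, TOTAL_TOKENS)

# Estimates quoted earlier in the thread: (cost range per day, token range per day)
engineer_day = rate_range((100, 150), (10_000_000, 10_000_000))
heavy_refactor_day = rate_range((200, 250), (100_000_000, 300_000_000))

print(f"Cache read share: {cache_share:.1%}")
print(f"Tokens that are neither cache reads nor output: ~{other_tokens:,}")
print(f"Measured session: ${measured_rate:.2f} per 1M tokens")
print(f"30-session day estimate: ${engineer_day[0]:.2f}-${engineer_day[1]:.2f} per 1M")
print(f"3-agent day estimate: ${heavy_refactor_day[0]:.2f}-${heavy_refactor_day[1]:.2f} per 1M")
```

The measured session works out to roughly $0.63 per million tokens with about 95% of the tokens being cache reads. That sits close to the low end of the heavy refactoring estimate and far below the $10-15 per million implied by the 30-session day, so how much of a session is served from cache matters a lot for the bill.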

For what I got from it, totally worth it. But that’s extra expense we don’t budget for so far.

ps: if you’re using Claude Code on Rocky, run ‘npx ccusage@latest’ to get your numbers

A few things worth experimenting with: using the /clear command to reset a session context selectively, being explicit that you want ‘concise’ or ‘code only’ answers to skip some fluff, and of course picking the appropriate model for the complexity of the problem you’re solving.

One thing that has already made a difference is that I iterate over the .md file a few times asking Claude to check it and clarify things, before telling it to generate code. Those are quick turn prompts working out obvious kinks without wasting coding time on them.

Similarly when debugging code afterwards: read the code, use git diff to see what changed, and form a mental model of the bug rather than have Claude throw mud at the wall. Be efficient with everyone’s time, but still have Claude try out different things and document data/findings.

If it’s a more functional piece of code, make unit testing part of the .md document and instructions. While unit tests will burn tokens to build and run, they can save more costly high level debugging sessions, and are mostly automated.

Along the way, as you discover and make changes, keep updating the .md to reflect that (you can ask Claude to do that) so that remains your master plan.

I can confirm that software coding with AI will be very expensive for individual use. GitHub Copilot, Claude, Cursor, OpenAI… all of them. That’s why I built a working solution on my own, ready to launch sometime mid to end of May… And here is one reason why I do not use Claude:

Came across this today… Nice crisp instructions (read bottom up). Reflects my experience. Spent all day doing prompts for Seedance, and had to rein Claude in a few times because it over-optimized and broke things. I had to send it back to re-read what it had done right.

Same applies to coding tasks.

The more we work with these, the more their quirks become apparent, and how to shape and control them.

I thought I was done raising teenagers. Apparently not.

https://medium.com/gitconnected/the-4-lines-every-claude-md-needs-2717a46866f6?sk=v2%2F43a378e6-c9d4-489d-a17b-59c459603e73

It refers to this Git repo: forrestchang/andrej-karpathy-skills on GitHub (CLAUDE.md on the main branch)