The way I am doing the math with my Max subscription, assuming DeepSeek API prices, it is still about 5x cheaper. So either DeepSeek is losing money (unlikely) or Anthropic is losing lots of money (more likely). Grok kinda confirms my suspicions. Assuming DeepSeek prices, I've probably spent north of $100 worth of Grok compute. I didn't pay Grok or Twitter a single cent. $100 is a lot of loss for a single user.
Claude API pricing has significant margin baked in. I think it's safe to assume that Anthropic is getting an 80% margin on their API and that they are selling Claude Code for less than that.
To me, Claude usually feels like a bumbling idiot. But in extremely rare cases it feels like a sentient superintelligence. I facetiously assumed that in those cases it ran on the correct RNG seed.
I'm also curious about this. Claude Code feels very expensive to me, but at the same time I don't have much perspective (nothing to compare it to, really, other than Codex or other agent editors I guess. And CC is way better so likely worth the extra money anyway)
Pretty easy to hit $100 an hour using Opus on API credits. The model providers are heavily subsidized, the datacenters appear to be too. If you look at the Coreweave stuff and the private datacenters it starts looking like the telecom bubble. Even Meta is looking to finance datacenter expansion - https://www.reuters.com/business/meta-seeks-29-billion-priva...
The reason they are talking about building new nuclear power plants in the US isn't just for a few training runs, it's for inference. At scale, the AI tools are going to be extremely expensive.
Also note China produces twice as much electricity as the United States. Software development and agent demand are going to be competitive across industries. You may think, oh, I can just use a few hours of this a day and I got a week of work done (happens to me some days), but you are going to end up needing to match what your competitors are doing, not what you got comfortable with. This is the recurring trap of new technology (no capitalism required).
There is a danger to independent developers becoming reliant on models. $100-$200 is a customer acquisition cost giveaway. The state of the art models probably will end up costing hourly what a human developer costs. There is also the speed and batching part. How willing is the developer to, for example, get 50% off but maybe wait twice as long for the output. Hopefully the good dev models end up only costing $1000-$2000 a month in a year. At least that will be more accessible.
Somewhere in the future these good models will run on device and just cost the price of your hardware. Will it be the AGI models? We will find out.
I wonder how this comment will age; I'll look back at it in 5 or 10 years.
Your excellent comments make me grateful that I am retired and just work part time on my own research and learning. I believe you when you say professional developers will need large inference compute budgets.
Probably because I am an old man, but I don’t personally vibe with full time AI assistant use, rather I will use the best models available for brief periods on specific problems.
Ironically, when I do use the best models available to me it is almost always to work on making weaker and smaller models running on Ollama more effective for my interests.
BTW, I have used neural network tech in production since 1985, and I am thrilled by the rate of progress, but worry about such externalities as energy use, environmental factors, and hurting the job market for many young people.
I've been around for a while (not quite retirement age) and this time is the closest to the new feeling I had using the internet and web in the early days. There are simultaneously infinite possibilities but also great uncertainty about which pathways will be taken and how things will end up.
There is a lot to dislike here in the near term, especially the consequences for privacy, adtech, and energy use. I do have concerns that the greatest pitfalls in the short term are being ignored while other uncertainties are being exaggerated. (I've been warning about deep learning models in recommendation engines for years, and only a sliver of people seem to have picked up on that one, for example.)
On the other hand, if good enough models can run locally, humans can end up with a lot more autonomy and choice with their software and operating systems than they have today. The most powerful models might run on supercomputers and just be solving the really big science problems. There is a lot of fantastic software out there that does not improve by throwing infinite resources at it.
Another consideration is while the big tech firms are spending (what will likely approach) hundreds of billions of dollars in a race to "AGI", what matters to those same companies even more than winning is making sure that the winner isn't a winner takes all. In that case, hopefully the outcome looks more like open source.
The SOTA models will always run in data centers, because they have 5x or more VRAM and 10-100x the compute allowance. Plus, they can make good use of scaling w/ batch inference which is a huge power savings, and which a single developer machine doesn’t make full use of.
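The power argument can be made concrete with a toy amortization model: each forward pass carries a fixed cost (loading weights, kernel launches) that a batch of requests shares, while a single developer machine pays it alone. The numbers below are illustrative assumptions, not measurements of any real model.

```python
# Toy model of batch-inference amortization: each forward pass has a
# fixed cost that is shared by every request in the batch.
# FIXED_COST and PER_REQUEST are made-up illustrative numbers.

FIXED_COST = 100.0   # hypothetical energy units per forward pass
PER_REQUEST = 1.0    # hypothetical marginal energy per request

def energy_per_request(batch_size: int) -> float:
    """Amortized energy for one request at a given batch size."""
    return FIXED_COST / batch_size + PER_REQUEST

print(energy_per_request(1))    # 101.0 - a lone developer machine
print(energy_per_request(64))   # 2.5625 - a datacenter batching many users
```

Under these (made-up) constants, batching 64 requests cuts per-request energy by roughly 40x, which is the shape of the datacenter advantage the comment describes.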
Yes I do. It's just that for newcomers used to Cursor (where, without careful prompting, you lock yourself out of premium requests), it's not immediately obvious that CC works completely differently and isn't dangerous in the same way.
Of course it requires careful planning, but the trap is easy to fall into.
In my experience, it's more about the tool's local indexing, aggressive automatic uploads, and model-usage limits designed to keep them (and you) from overpaying.
People are recreating this with local toolchains now.
This is around what Cursor was costing me with Claude 4 Opus before I switched to Claude Code. Sonnet works fine for some things, but for some projects it spews unusable garbage unless the specification is so detailed that it's almost the implementation already.
This is where something like Perplexity's "memory" feature is really great. It treats other threads similarly to web resources.
I would love to better understand how Perplexity is able to integrate up-to-date sources like other threads (and presumably recent web searches, but I haven't verified this; they could just be from the latest model) into its query responses. It feels seamless.
Have you been human before? Competition for resources and status is an instinctive trait.
It rears its head regardless of what sociopolitical environment you place us in.
You’re either competing to offer better products or services to customers…or you’re competing for your position in the breadline or politburo via black markets.
Even in the Soviet Union there were multiple design bureaus competing for designs of things like aircraft: Tupolev, Ilyushin, Sukhoi, Mikoyan-Gurevich (MiG), Yakovlev, Mil. There were quite a lot. Several (not all, they had their specialisations) provided designs when a requirement was raised. Not too different from the US, yet not capitalist.
Not really, it's possible with any market economy, even a hypothetical socialist one (that is, one where all market actors are worker-owned co-ops).
And, since there is no global super-state, the world economy is a market economy, so even if every state were a state-owned planned economy, North Korea style, still there would exist this type of competition between states.
Consider also that VC funds often have pension funds as their limited partners. Workers have a claim to their pension, and thus a claim to the startup returns that the VC invests in.
So yeah it basically comes down to your definition of "worker-owned". What fraction of worker ownership is necessary? Do C-level execs count as workers? Can it be "worker-owned" if the "workers" are people working elsewhere?
Beyond the "worker-owned" terminology, why is this distinction supposed to matter exactly? Supposing there was an SV startup that was relatively generous with equity compensation, so over 50% of equity is owned by non-C-level employees. What would you expect to change, if anything, if that threshold was passed?
> Supposing there was an SV startup that was relatively generous with equity compensation, so over 50% of equity is owned by non-C-level employees. What would you expect to change, if anything, if that threshold was passed?
If the workers are majority owners, then they can, for example, fire a CEO that is leading the company in the wrong direction, or trying to cut their salaries, or anything like that.
>If the workers are majority owners, then they can, for example, fire a CEO that is leading the company in the wrong direction, or trying to cut their salaries, or anything like that.
Why wouldn't the board fire said CEO?
The most common reason to cut salaries is if the company is in dire financial straits regardless. Co-ops are more likely to cut salary and less likely to do layoffs.
Because the board doesn't understand the business at the level that employees do. Or because the board has different goals for the business than employees do. Or because the board is filled with friends of the CEO who let them do whatever.
Also, lots of companies reduce salaries or headcount if they feel they can get away with it. They don't need to be in dire financial straits; it's enough to have a few quarters of no or low growth and to want to show a positive change.
How specifically would you expect a typical SV corp's policies to change if employee equity passes from 49% to 51%?
Remember, if employees own 49%, they only need to persuade holders of 2% of the remaining shares that a change will be positive for the business, and they can make that change. So minority vs. majority is not as significant as it may seem.
Can you give me an idea of how much interaction would be $50-$100 per day? Like are you pretty constantly in a back and forth with CC? And if you wouldn’t mind, any chance you can give me an idea of productivity gains pre/post LLM?
Yes, a lot of usage, I’d guess top 10% among my peers. I do 6-10hrs of constant iterating across mid-size codebases of 750k tokens. CC is set to use Opus by default, which further drives up costs.
Estimating productivity gains is a flame war I don’t want to start, but as a signal: if the CC Max plan goes up 10x in price, I’m still keeping my subscription.
I maintain top-tier subscription to every frontier service (~$1k/mo) and throughout the week spend multiple hours with each of Cursor, Amp, Augment, Windsurf, Codex CLI, Gemini CLI, but keep on defaulting to Claude Code.
I am curious what kind of development you’re doing and where your projects fall on the fast iteration<->correctness curve (no judgment). I’ve used CC Pro for a few weeks now and I will keep it, it’s fantastically useful for some things, but it has wasted more of my time than it saved when I’ve experimented with giving it harder tasks.
It's interesting to work with a number of people using various models and interaction modes in slightly different capacities. I can see where the huge productivity gains are and can feel them, but the same is true for the opposite. I'm pretty sure I lost a full day or more trying to track down a build error because it was relatively trivial for someone to ask CC or something to refactor a ton of files, which it seems to have done a bit too eagerly. On the other hand, that refactor would have been super tedious, so maybe worth it?
To save money (I am retired), I mostly use the Gemini APIs. I used to also use good open-weight models on groq.com, but life is simpler just using Gemini.
Ultimately, my not using the best tools for my personal research projects has zero effect on the world but I am still very curious what elite developers with the best tools can accomplish, and what capability I am ‘leaving on the table.’
I’m a founder/CTO of an enterprise SaaS, and I code everything from data modeling, to algos, backend integrations, frontend architecture, UI widgets, etc. All in TypeScript, which is perfectly suited to LLMs because we can fit the types and repo map into context without loading all code.
As to “why”: I’ve been coding for 25 years, and LLMs are the first technology with a non-linear impact on my output. It’s simultaneously moronic and jaw-dropping. I’m good at what I do (e.g., merged fixes into Node) and Claude/o3 regularly find material edge cases in code I was confident in. Then they add a test case (as per our style), write a fix, and update docs/examples within two minutes.
I love coding and the art&craft of software development. I’ve written millions of lines of revenue generating code, and made millions doing it. If someone forced me to stop using LLMs in my production process, I’d quit on the spot.
Why not self host: open source models are a generation behind SOTA. R1 is just not in the same league as the pro commercial models.
> If someone forced me to stop using LLMs in my production process, I’d quit on the spot.
Yup 100% agree. I’d rather try to convince them of the benefits than go back to what feels like an unnecessarily inefficient process of writing all code by hand again.
And I’ve got 25+ years of solid coding experience. Never going back.
> data modeling, to algos, backend integrations, frontend architecture, UI widgets, etc. All in TypeScript, which is perfectly suited to LLMs because we can fit the types and repo map into context without loading all code.
Which frameworks and libraries have you found work well in this (agentic) context? I feel much of the JS library landscape does not do enough to enforce an easily understood project structure that would "constrain" the architecture and force modularity. (I might have this bias from my many years of work with Rails, which is highly opinionated in this regard.)
When you say generation behind, can you give a sense of what that means in functionality per your current use? Slower/lower quality, it would take more iterations to get what you want?
Context rot. My use case is iterating over a large codebase which quickly grows context. All LLMs degrade with larger context sizes, well below their published limits, but pro models degrade the least. R1 gets confused relatively quickly, despite their published numbers.
I think Fiction LiveBench captures some of those differences via a standardized benchmark that spreads interconnected facts through an increasingly large context to see how models can continue connecting the dots (similar to how in codebases you often have related ideas spread across many files)
> I’ve written millions of lines of revenue generating code
This is a wild claim.
Approx 250 working days in a year. 25 years coding. Just one million lines would be phenom output, at 160 lines per day forever. Now you are claiming multiple millions? Come on.
It's impossible as an IC on a team, or working where a concept of "tickets" exists. It's unavoidable as a solo founder, whether you're building enterprise systems or expanding your vision. Some details -
1. Before wife&kids, every weekend I would learn a library or a concept by recreating it from scratch. Re-implementing jQuery, fetch API via XHR, Promises, barebones React, a basic web router, express + common middlewares, etc. Usually, at least 1,000 lines of code every weekend. That's 1M+ over 25 years.
2. My last product is currently 400k LOCs, 95% built by me over three years. I didn't one-shot it, so assuming 2-3x ongoing refactors, that's more than 1M LOCs written.
3. In my current product repo, GitHub says for the last 6 months I'm +120k,-80k. I code less than I used to, but even at this rate, it's safely 100k-250k per year (times 20 years).
4. Even in open source, there are examples like esbuild, which is a side project from one person (cofounder and architect of Figma). esbuild is currently at ~150k LOCs, and GitHub says his contributions were +600k,-400k.
5. LOCs are not the same. 10k lines of algorithms can take a month, but 10K of React widgets is like a week of work (on a greenfield project where you know exactly what you're building). These days, when a frontend developer says their most extensive UI codebase was 100k LOCs in an interview, I assume they haven't built a big UI thing.
So yes, if the reference point is "how many sprint tickets is that", it seems impossible. If the reference point is "a creative outlet that aligns with startup-level rewards", I think my statement of "millions of lines" is conservative.
Granted, not all of it was revenue-generating - much was experimental, exploratory, or just for fun. My overarching point was that I build software products for (great) living, as opposed to a marketer who stumbled into Claude Code and now evangelizes it as some huge unlock.
No, it’s not. At all. At the overwhelming majority of companies I’ve worked for or heard of, even 400-500 lines fully shipped in a week, slightly less than your figure here, would be top quartile of output - but further, it isn’t necessarily the point. Writing lines of code is a pretty small part of the job at companies with more than about 5-6 engineers on staff, past that it’s a lot more design and architecture and LEGO-brick-fitting - or just politicking and policying. Heck, I know folks who wish they could ship 400 lines of code a month, but are held back by the bureaucracies of their companies.
Now extrapolate. That’s maybe 50k a year assuming some PTO.
10 years would make 500k and you just cross a million at 20.
So that would have to be 20 years straight of that style of working and you’re still not into plural millions until 40 years.
If someone actually produced multiple millions of lines in 25 years, it would have to be a side effect of some extremely verbose language where trivial changes take up many lines (maybe Java).
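The back-of-envelope arithmetic running through this subthread (250 working days/year, 25 years, as assumed above) is easy to check directly:

```python
# Sanity check of the "millions of lines" claim,
# using the thread's own assumptions.
WORKING_DAYS_PER_YEAR = 250
YEARS = 25

days = WORKING_DAYS_PER_YEAR * YEARS   # 6,250 working days total

# Sustained lines/day needed to hit a given lifetime total:
print(1_000_000 / days)   # 160.0 lines/day for one million
print(2_000_000 / days)   # 320.0 lines/day for two million
```

So "multiple millions" requires sustaining 320+ shipped lines every working day for 25 years, which is the figure both sides of the argument are implicitly disputing.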
i've been using llm-based tools like copilot and claude pro (though not cc with opus), and while they can be helpful – e.g. for doc lookups, repetitive stuff, or quick reminders – i rarely get value beyond that. i've honestly never had a model surface a bug or edge case i wouldn’t have spotted myself.
i've tried agent-style workflows in copilot and windsurf (on claude 3.5 and 4), and honestly, they often just get stuck or build themselves into a corner. they don’t seem to reason across structure or long-term architecture in any meaningful way. it might look helpful at first, but what comes out tends to be fragile and usually something i’d refactor immediately.
sure, the model writes fast – but that speed doesn't translate into actual productivity for me unless it’s something dead simple. and if i’m spending a lot of time generating boilerplate, i usually take that as a design smell, not a task i want to automate harder.
so i’m honestly wondering: is cc max really that much better? are those productivity claims based on something fundamentally different? or is it more about tool enthusiasm + selective wins?
Unless you're getting paid for your commute, you're just giving your employer free productivity. I would recommend doing literally anything else with that time. Read a book, maybe.
If you can't do your job in your 8 hours then you're either not good enough or the requirements are too much and the company should change processes and hire.
Right, I'm not saying anyone should actually be in the office 40 hours a week; that sounds terrible. And even with all the RTO of the last couple years, that doesn't seem to be expected many places.
Personally I use dev containers on a server and I have written some template containers for quickly setting up new containers that has claude code and some scripts for easily connecting to the right container etc. Makes it possible to work on mobile,but lots of room for improvement in the workflow still.
The project is just a web backend. I give Claude Code grunt work tasks. Things like "make X operation also return Y data" or "create Z new model + CRUD operations". Also asking it to implement well-known patterns like debouncing or caching for an existing operation works well.
My app builds and runs fine on Termux, so my CLAUDE.md says to always run unit tests after making changes. So I punch in a request, close my phone for a bit, then check back later and review the diff. Usually takes one or two follow-up asks to get right, but since it always builds and passes tests, I never get complete garbage back.
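As a sketch, the relevant CLAUDE.md instruction might look something like this (the wording is hypothetical; the source only says the file tells Claude to always run unit tests after changes):

```markdown
# CLAUDE.md (excerpt)

## Workflow
- After making any code change, always run the unit tests before
  finishing (the app builds and runs on Termux).
- Do not report a task as complete while the build or tests fail.
```

Because the agent self-checks against the build and tests, the diffs that come back may still need follow-up asks, but they are never complete garbage.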
There are some tasks that I never give it. Most of that is just intuition. Anything I need to understand deeply or care about the implementation of I do myself. And the app was originally hand-built by me, which I think is important - I would not trust CC to design the entire thing from scratch. It's much easier to review changes when you understand the overall architecture deeply.
You can easily reach $50 per day by force-switching the model to Opus:

/model opus

It will continue to use Opus even though there is a warning about approaching the limit. I found Opus significantly more capable at coding than Sonnet, especially for poorly defined tasks; thinking mode can fill in a lot of missing detail, and you just need to edit a little before letting it code.
Claude Code with a Claude subscription is the cheap version for current SOTA.
"Agentic" workflows burn through tokens like there's no tomorrow, and the new Opus model is so expensive per token that the Max plan pays for itself in one or two days of moderate usage. When people report their Claude Code sessions costing $100+ per day, I read that as the API-price equivalent; it makes no sense to actually "pay as you go" with Claude right now.
This is arguably the cheapest option available on the market right now in terms of results per dollar, but only if you can afford the subscription itself. There's also time/value component here: on Max x5, it's quite easy to hit the usage limits of Opus (fortunately the limit is per 5 hours or so); Max x20 is only twice the price of Max x5 but gives you 4x more Opus; better model = less time spent fighting with and cleaning up after the AI. It's expensive to be poor, unfortunately.
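The tier comparison is simple arithmetic. The prices below are assumptions based on the figures mentioned in this thread ($100/mo for Max x5, twice that for Max x20), and the 4x Opus ratio is taken from the comment above; all may have changed since.

```python
# Rough cost-per-Opus-capacity comparison between the two Max tiers.
# Prices and the 4x ratio are assumptions from the discussion, not
# authoritative figures.

MAX_5X_PRICE = 100    # USD/month (assumed)
MAX_20X_PRICE = 200   # USD/month (assumed, "twice the price")

opus_units_5x = 1     # normalize Max x5's Opus allowance to 1 unit
opus_units_20x = 4    # "gives you 4x more Opus"

print(MAX_5X_PRICE / opus_units_5x)    # 100.0 USD per unit of Opus
print(MAX_20X_PRICE / opus_units_20x)  # 50.0 USD per unit - half the cost
```

Per unit of Opus capacity, the x20 tier costs half as much, which is the "it's expensive to be poor" point: the better deal is only available to those who can front the larger subscription.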
>less time spent fighting with and cleaning up after the AI.
I've yet to use anything but copilot in vscode, which is 1/2 the time helpful, and 1/2 wasting my time. For me it's almost break-even, if I don't count the frustration it causes.
I've been reading all these AI-related comment sections and none of it is convincing me there is really anything better out there. AI seems like break-even at best, but usually it's just "fighting with and cleaning up after the AI", and I'm really not interested in doing any of that. I was a lot happier when I wasn't constantly being shown bad code that I need to read and decide about, when I'm perfectly capable of writing the code myself without the hassle of AI getting in my way.
AI burnout is probably already a thing, and I'm close to that point already. I do not have hope that it will get much better than it is, as the core of the tech is essentially just a guessing game.
I tend to agree except for one recent experience: I built a quick prototype of an application whose backend I had written twice before and finally wanted to do right. But the existing infrastructure for it had bit-rotted, and I am definitely not a UI person. Every time I dive into html+js I have to spend hours updating my years-out-of-date knowledge of how to do things.
So I vibe coded it. I was extremely specific about how the back end should operate and pretty vague about the UI, and basically everything worked.
But there were a few things about this one: first, it was just a prototype. I wanted to kick around some ideas quickly, and I didn't care at all about code quality. Second, I already knew exactly how to do the hard parts in the back end, so part of the prompt input was the architecture and mechanism that I wanted.
But it spat out that html app way way faster than I could have.
Claude Code on the Pro plan is ~$20 USD/month and is nearly enough for someone like me who can't use it at work and is just playing around with it after work. I'm loving it.
Cursor on a $20/month plan (if you burn through the free credits) or Gemini CLI (free) are two great ways to try out this kind of stuff as a hobbyist. You can throw in v0 too ($5/month in free credits). Supabase's free tier can give you a db as well.
Zed is fantastic. Just dipping my toes in agentic AI, but I was able to fix a failing test I spent maybe 15 minutes trying to untangle in a couple minutes with Zed. (It did proceed to break other tests in that file though, but I quickly reverted that.)
It is also BYOA or you can buy a subscription from Zed themselves and help them out. I currently use it with my free Copilot+ subscription (GitHub hands it out to pretty much any free/open source dev).
You can tell Claude Code to use opus using /model and then it doesn't fall back to Sonnet btw. I am on the $100 plan and I hit rate-limits every now and then, but not enough to warrant using Sonnet instead of Opus.
This is what I don’t get about the cost reported by Claude Code. At work I use it against our AWS Bedrock instance, and most sessions will say $15-20, and I’ll have multiple agents running. So I can easily spend $60 a day in reported cost. Our AWS Bedrock bill is only a small fraction of that. Why would you overcharge on direct usage of your API?
Do you have a citation for this?
It might be at a loss, but I don’t think it is that extravagant.