As long as Codex remains so affordable and useful they do not have to slash prices, just keep Codex usable.
I keep meaning to try Claude Code, but I can't seem to run out of limits on Codex on regular pro plan.
Meanwhile all my friends on Claude Code are fighting the token limits every few hours.
I even switched to using extra high for easy medium level script tasks as a test and besides taking longer there was not much reduction in the token allowance.
I generally write a detailed spec before plan then possibly iterate a bit before implementation. Not sure what I am doing "wrong".
>As long as Codex remains so affordable and useful they do not have to slash prices, just keep Codex usable.
I imagine they track usage and can see whether their habitual users are switching to something else and aren't going to slash prices 'for the hell of it'.
I literally never hit usage limits in Claude Code on the $100 plan, and I feel like I'm using it as heavily as I can while still producing useful software. I could certainly make thinky machine go burr by giving it more busy work, but it wouldn't be good code or code that needs to exist.
And, I even use `claude -p` pretty regularly for scripted stuff (automated security vulnerability searches), which I thought was now counted at regular API rates, but that doesn't seem to ever run out either. I do only run one at a time, though...not parallel, so maybe it doesn't kick over into some "automation" mode of counting usage, I dunno.
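For anyone curious what that one-at-a-time pattern looks like, here's a minimal sketch in Python. It assumes the `claude` CLI is on your PATH and that `-p` answers a single prompt and exits; the prompt wording, the file names, and the `run_sequentially` helper are hypothetical, not anything from the setup described above.

```python
import subprocess
from typing import Callable, Iterable


def run_claude(prompt: str) -> str:
    """Run one non-interactive `claude -p` call and return its stdout."""
    result = subprocess.run(
        ["claude", "-p", prompt],
        capture_output=True,
        text=True,
        check=True,  # raise if the CLI exits with a non-zero status
    )
    return result.stdout


def run_sequentially(
    paths: Iterable[str],
    runner: Callable[[str], str] = run_claude,
) -> dict[str, str]:
    """Scan files strictly one at a time, collecting one report per path."""
    reports = {}
    for path in paths:
        # Hypothetical prompt wording; adjust to whatever the scan needs.
        prompt = f"Review {path} for security vulnerabilities and list any findings."
        reports[path] = runner(prompt)
    return reports
```

The `runner` argument is only there so the loop can be dry-run with a stub before spending any real quota.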
I regularly hit usage limits on CC, but that's when I'm in the zone and doing 5 things in parallel. That's like 5 hours a week.
I also hit limits if I do something important, at which point I make it do a loop with significant subagent counts to just review and adjust the code extensively using a bunch of frameworks. I'm perfectly happy with the CC limits of a max plan; it is never something that blocks me. And when it runs out I'm brain fried as well anyway, so that's not an issue.
I agree, anecdotally I've only hit any usage limits on the $100 Max plan in the past couple of days using Fable 5. Evaluating old codebases eats a lot of tokens. But before Fable 5 I was running Opus 4.8 on max effort most of the day and rarely getting close to 50% usage. I rarely run dynamic workflows and sub-agents though.
The frontier labs commonly trade spots at the top of the benchmarks with each new model release.
The timing of these price cut discussions says to me OpenAI has no imminent release that will be edging out Mythos/Fable.
If so the question becomes when can they do so, or is this possibly a turning point where Anthropic keeps the crown to themselves for the foreseeable future.
> The timing of these price cut discussions says to me OpenAI has no imminent release that will be edging out Mythos/Fable.
Initially I had the same thought but I think this might actually have more to do with Fable being removed from the Claude subscription later this month. At that point it becomes cost prohibitive to use for most tasks anyways & this is the perfect opportunity to compete on price, especially given enterprise customers are already looking to improve spend management.
I got a new $20 Claude subscription to try the new Fable model. I gave it a single prompt, and it barely finished, using up my whole session quota (it was at ~95% when it finished) and 10% of my weekly quota.
For comparison, with the Kimi Code $40 subscription I can pretty much constantly run two/three agents in parallel for the whole week, and I never run out of quota. I can blindly throw it at anything and everything without worrying about hitting the limits. (And it's not exactly a cheap model to run -- it has 1 trillion parameters!)
Is Kimi as good as Claude? Of course not. But you don't need the absolute state-of-art for most things. If I don't have exceptionally difficult tasks it makes no sense to use it. Just throw Kimi at it, and even if it needs to run 2 or 3 times longer in the background I don't care, because I'm not running out of tokens there.
I think it works best when you're using the agent in a more hands-on way with a targeted prompt. If you're obsessive about code quality like I am (so you thoroughly review and, when needed, reprompt or even rewrite what the agent does) then you'll be fine, but if you like to just throw a prompt at the wall and expect it to plan and execute the whole thing perfectly then you'll be disappointed.
A middle-ground trick one can use is to have Opus (or Fable now) plan the whole thing and get something cheaper like Kimi execute on it.
CodeWhale (formerly deepseek-tui) automates this over DeepSeek V4 Flash and Pro. My shallow understanding is that it prompts the model to evaluate the complexity of a given task, then decides on Flash vs. Pro at various reasoning levels for that task. This can help with both cost and speed. If other agent platforms don't already do this, I have to imagine they will at some point.
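A rough sketch of that kind of routing, in Python. To be clear, this is not CodeWhale's actual logic, just an illustration of the idea: the 1-10 complexity score, the thresholds, and the tier and effort names are all made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    model: str   # hypothetical tier name: "flash" or "pro"
    effort: str  # hypothetical reasoning level: "low", "medium" or "high"


def pick_route(complexity: int) -> Route:
    """Map a self-assessed complexity score (1-10) to a model tier and effort."""
    if not 1 <= complexity <= 10:
        raise ValueError("complexity must be between 1 and 10")
    if complexity <= 3:
        return Route("flash", "low")    # trivial edits: cheapest and fastest
    if complexity <= 5:
        return Route("flash", "high")   # moderate tasks: cheap model, more thinking
    if complexity <= 8:
        return Route("pro", "medium")   # harder tasks: bigger model
    return Route("pro", "high")         # hardest tasks: bigger model, max effort
```

Presumably the score itself would come from a first, cheap model call that rates the task before any real work starts.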
I'm retired and can't justify spending too much on these things. CodeWhale over DeepSeek is helping me understand this space much better (and have some fun!), and it's quite affordable. I've spent ~30 hours using it over the last couple of weeks, and I've spent $3.89 on DeepSeek in that time. If I don't feel like writing any code for a few weeks, I pay nothing. Looking at DeepSeek's dashboard, about 60% of my requests have gone to Pro and 40% to Flash. I've used 97M Pro tokens and 19M Flash tokens (well over 90% of each have been cache hits, so the price is much lower than it would otherwise be).
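To see why the cache-hit rate matters so much for a bill like that, here's a small Python sketch. The per-million prices in the usage note are placeholders, not DeepSeek's real rates, and it treats all tokens as input tokens for simplicity.

```python
def token_cost(
    tokens: int,
    cache_hit_rate: float,
    price_miss_per_m: float,
    price_hit_per_m: float,
) -> float:
    """Dollar cost of `tokens` when a fraction of them is billed at the cached rate."""
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    hit_tokens = tokens * cache_hit_rate
    miss_tokens = tokens - hit_tokens
    # Prices are quoted per million tokens, so divide by 1,000,000 at the end.
    return (hit_tokens * price_hit_per_m + miss_tokens * price_miss_per_m) / 1_000_000
```

With placeholder prices of $1.00 per million uncached and $0.10 per million cached, a 90% hit rate brings the bill to roughly a fifth of the uncached price, so a workload measured in tens of millions of tokens can still cost very little.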
OTOH, using the best is a competitive advantage when time = money. It's like giving your engineers a slow laptop because it's cheaper. It may be cheaper but not worth the cost.
> OTOH, using the best is a competitive advantage when time = money. It's like giving your engineers a slow laptop because it's cheaper. It may be cheaper but not worth the cost.
That doesn't imply giving your devs the best laptop makes any difference.
How much more productive will your devs be if you upgrade them from a 32GB RAM, 8-core laptop to a 768GB RAM 96-core threadripper?
In your analogy, Kimi may not be the 4-core celeron with 4GB of RAM, it's more like the 8-core AMD with 32GB of RAM.
Unless your job is purely producing code pointlessly, this is not a really good comparison. Most of the time really is spent on understanding the problem and figuring out solutions, not waiting on CPU.
Not necessarily, inference speed also has a huge time aspect. For example, Anthropic takes nearly twice as long as OpenAI models for my tasks, with both having similar success rates.
It seems that OpenAI lacks a clear target audience, they try to be everything for everyone. Anthropic is targeting professionals / enterprise users.
I don’t fully understand why OpenAI lacks this focus, as clearly identifying a target market is one of the first things you do with a business strategy. But instead they just seem to throw stuff against the wall and see what sticks.
> I don’t fully understand why OpenAI lacks this focus, as clearly identifying a target market is one of the first things you do with a business strategy
I've been inside companies that have struggled with this, and the real internal story goes like this:
1. Surprise product growth
2. Revenue go brr, org expands
3. Everyone gets promoted as org expands
4. Because the product sold itself, there was little selection pressure on the sales / customer success orgs to evaluate their effectiveness
5. Leadership gets saturated with people who just aren't very good at their job
6. None of those people get fired/demoted, because the company never had to develop "What to do with a bad leader?" muscles
7. This eventually manifests as an increasing (customer) <-> (engineering) disconnect (as sales/cs aren't doing their job)
8. People begin to ask why the company isn't doing (insert obvious thing)
9. It's because VP-of-whatever is chasing fantasies instead of reporting customer needs to engineering
Tl;dr - Don't trust promotions made during the good times. Continuously reevaluate leaders.
I think this is too simplistic. Codex is increasingly useful for business usage. I use it for both technical stuff and doing non technical things with my inbox, google drive, etc. It's pretty good for that. And it's pretty clear that business users are very much untapped potential at this point. They need proper agents with tunable guard rails and all the rest.
It seems very competent at coding tasks as well. I don't think Anthropic has a huge edge on that front. It's more of a neck and neck race with proponents in both camps. I ignore most benchmarks at this point; I don't think they have much relevance for normal users.
I think it's actually necessary for both to try out different approaches. Nothing is set in stone yet when it comes to the UX of these things.
They have the consumer market but want the enterprise market, because it's a lot more lucrative, so they're probably going to just keep chasing that even though there are no signs they'll stop losing to Anthropic. They don't need to do that much to keep the consumer market because of momentum.
Questionable whether the enterprise market really is the most lucrative. The biggest of big tech all have significant revenue from the consumer market. Compare Apple, Google, Meta, to IBM, Salesforce, ServiceNow.
Enterprise market is paying by token and using a lot of tokens. Consumer market is paying a subscription that they can't raise too high or they'll lose users to competition. Seems to me that the enterprise market scales a lot higher.
Have you seen how many corporations are complaining and capping usage at $20-200 per developer per month? I doubt that will change much. Many are considering on-premise now.
OpenAI actually never had a focus. Their VC pitch was: once the AI is good enough, it will find our business model. They've raised money on that.
With that said you are right, it seems OpenAI got numbed by ChatGPT's initial success and tried to be the go-to brand for consumers... which is Google's playground.
Meanwhile, Anthropic led the B2B market with a clever segmented approach, and got well-paying customers.
Because they gained a HUGE amount of “normal” users and I think they feel desperate to monetise that. It’s their potential massive edge on competition, they just haven’t found any way to realise it and I suspect they won’t.
> If so the question becomes when can they do so, or is this possibly a turning point where Anthropic keeps the crown to themselves for the foreseeable future.
This specific crown (Best Performing Model) appears to be made out of thorns: pay 100x more for maybe a 10% improvement in capabilities.
It's simple I think - over time the price will go down. According to some analyses the price for equal intelligence declined 10-1000x per year, depending on the domain.
It probably won't be the same again but I still think we can bet on radically cheaper Mythos level intelligence in the future.
I don’t think Mythos/Fable matter in attracting customers. The typical use is not going to be on the most expensive model, especially with all its frustrating gotchas like refusing harmless prompts and forcing companies to have their data retained.
If OpenAI can offer an alternative to Opus but with better pricing, it will boost their revenue at Anthropic’s cost, in time for the IPO.
I have a really strong suspicion that there is something different about OAI prepaid tokens in the API vs elsewhere. I've been able to get away with spending less than $150/m on average while many peers are hitting 10x that.
I am curious how many on HN have manually configured their copilot install with a custom OAI token for 5.4/5.5. In my experience, the performance difference over the built in subscription models is immense. This setup tends to solve the problem so quickly and reliably that any desire to have it run while I'm asleep seems absolutely ridiculous. The performance is constant throughout the day and week.
I think what might be happening is that we are chasing the cost optimization rabbit a little bit too hard. Capability is a weird dimension to quantify. A weaker model is not weaker in a linear way. It's usually this incredibly tall brick wall of a discrete go/no-go. If the model can't do the task, it doesn't matter how cheap the tokens are. Something approaching the inverse is also largely true.
Focus on the capability (is this giving my customer what they want) instead of the cost, and you will likely find that the cost never reaches a threshold where you even begin to worry about it. Starting from a position of cost optimization tends to spiral into a dark place.
The point I'm trying to make is the reason a lot of people are resorting to the 24/7 Ralph loops is because they're using weaker models that need an incredible number of attempts to make any progress. The Death Star has different game theoretic implications. You probably don't need it to be lasering entire planets while you sleep, assuming the laser system actually works as advertised. I've never had a copilot run that took so long that I had to get up from my PC. Maybe 10 minutes. What the hell can run for 24 hours and still converge in a meaningful way?
Competition is lovely. And ironically, OpenAI will probably get and keep lots of enterprise customers like Microsoft that won’t accept anything less than ZDR.
The reality is that Fable is a 2x price increase over the previous model.
GPT5.5 is a 2x price increase over the previous one. And after last week's reset, Codex is hungry for your sub allowance.
Everybody can see that the massive raises are not matching the revenue, at all.
It's a surprising headline. Yes, it does make sense to cut the price to gain market share, but it also makes sense to keep it at a sustainable level, which seems to not have been reached yet.
Fable is twice the size of Opus from what I gathered. So I'm not sure if 2x price translates to 2x profits as well.
Not sure about GPT but it seems plausible they've also been increasing the model size with recent releases. (Progressively training a bigger model and easing into a profitable price range for that model scale?)
I haven't used OpenAI for months, since they officially started supporting the warmongers, and I have to say not only do I not miss them, I barely think about them except for the "please come back" emails to my account, which unfortunately I haven't deleted yet.
Amazon would absolutely take a loss on certain products in order to dominate the category, squeeze out competitors and then bring the price back up. It's one of the reasons they're so dominant in general now. Also one of the reasons why Amazon Basics has basically everything that exists and they're usually at or near the top of their respective categories -- third-party sellers simply can't compete.
I have 128 GB of unified memory (M4 Max) and the user experience with local inference is still pretty bad. I'm so glad something like llama.cpp exists so I don't have to wrangle Python (which I hate), but OpenCode is entirely disrespectful of the KV-cache so I had to switch to Pi (but Pi is going relatively well actually).
Even so, I can't really run at hundreds of tokens per second which is practically table stakes for my work. Even if I did manage to run that fast, the model would probably be completely braindead and stomp all over the task.
Wish I could afford an M5 Max but I've been between jobs for months without even a single interview. Sucks to be a developer these days.
I do use DeepSeek, it's exceptionally cheap! Inference is slow though, and it's not particularly intelligent but the experience is better than local inference.
To a certain extent, but not completely. OpenAI and Anthropic are taking losses on their entire offering—that is a huge difference. Amazon, for example, has pumped its profits back into R&D for decades. What AI companies are doing right now is running the Uber playbook on an epic scale. In the US, there isn't much competition, so they can maintain a duopoly. But look at what happened in China: Uber collapsed and pulled out.
Now, the entire world is facing competition from DeepSeek and Qwen at a fraction of the cost. According to a reliable Shenzhen source, they will halve their prices again by the end of this year using newer Huawei GPUs. The current 7nm chips are already bleeding OpenAI dry. By the end of this year, they will upgrade to 5nm, and by June of next year, 3nm. They don't even have to be better—just 95% as good at 1/20th of the price.
I don't see OpenAI and Anthropic surviving much outside of America; they are likely staring down Groupon’s fate. You can research it yourself: China has no issues with electricity because they have a massive power surplus. This is why OpenAI and Anthropic are so scared right now. They must IPO by 2027, because after that, they will suffer the exact same fate as Groupon.
I do love the DeepSeek models, they're so incredibly cheap and for functionality that nears Sonnet. Weeks of heavy usage still lands squarely under $10 for me.
Compare that with how I pay $200 a month for Claude and am still hitting the limits with any sort of sustained usage. They even have a special usage limit for Sonnet to prevent you from using too much of that either.
I'm super frustrated with how slow DeepSeek is though. And it's not nearly ready to be unsupervised for long periods of time like Claude is. Just this morning I left Fable 5 unsupervised for about eight hours straight. Single turn. DeepSeek often gets even much shorter turns wrong, so I wouldn't trust it with anywhere near that length of time alone. Not to mention it'd get so much less done because of how slow it is.
Also, did you use an LLM to correct your grammar after you posted? Lol
Increasingly it looks like it will end with a bubble bursting. LLMs and AI will survive, like the internet survived the dotcom bubble. But OpenAI and Anthropic could just be today's AOL and Yahoo.
I hope it will also crash hardware pricing so it becomes economically feasible to run your own local model. Currently I don’t like where we are heading with the sabotaging of models because it’s “too dangerous”.
> I hope it will also crash hardware pricing so it becomes economically feasible to run your own local model.
Even if you don't acquire hardware to host local models, a hardware crash means that I should be able to rent the crashed hardware at just above the cost of electricity + bandwidth.
Like the way I can now, for $7/m, rent a VPS that can run my B2B webapp for a company with 10k users, I look forward to buying a timeshare on GPUs that let me pay $12/m for all-you-can-eat GPU.
However, I think actually that while it won't give the results expected (AI agents run the company, build all features, etc.), it will nevertheless become a developer tool like IDEs, something "you have to have".
It's here to stay but probably with more realistic expectations than some CEO/CTO are pushing for (agents for everything, nobody writes 1 LOC, self healing systems, etc).
So the market expectations will probably be resized, but these tools are here to stay. Be it for cybersecurity (from CVEs to cyber warfare) alone, that's already worth all the money they are throwing at it.
They cut prices and get more customers who are going to move to the next vendor that cuts prices even more or when they jack up the prices again.
I am not complaining, I like my investor-subsidised tokens. I just don't see what these companies see as their end goal when it's becoming more and more possible to run a competent LLM locally (even with today's RAM prices).
I am surprised that there is no Claude or ChatGPT machine that I could buy, I feel like they should be opening up that model, but I guess subscriptions look better on balance sheets.
OpenAI seems like such a better product these days. I was someone who jumped to anthropic early on for claude code, but I find myself jumping in the other direction these days.
I completely don’t understand Anthropic’s pricing where you have to pay a monthly fee to access their crappy models and pay per use for access to their top model. If you’re going to go pay per use it should be actually pay per use.
This is the race to the bottom setup that will tank these companies in their attempts to IPO. They’re burning cash at current pricing and if a true price war breaks out the only way that ends is if either OpenAI or Anthropic blows up and goes away.
Right now OpenAI is looking like the one set up to fail here. They have lost momentum big time and are looking incredibly vulnerable.
LLMs are quickly becoming a commodity. In a decade, the only reason anyone won't be running free models locally will be for corporate oversight or regulatory needs, so the successful providers won't be the ones that make the best product, but the ones that make the most compliant product.
Economies of scale, optimization of models, hardware, energy infrastructure, data center construction and operation, etc. Stuff is currently relatively inefficient and there's lots of room for optimization. All the usual stuff.
Lots of comparisons to eg. Amazon, and how both were burning money for ages.
Maybe the better comparison is Uber? I.e. a commoditised product (taxis on an app), burning money to directly subsidise and gain market share. I always thought it was utterly insane and a waste of money... But you'd be hard pressed to have not made money on Uber.
This is my understanding anyway. An LLM-generated summary suggests that anyone who invested pre-IPO got at least 8-10% annually compounded. Even Series G investors made 2.3x since then. It's not an Eldorado and has to make up for all the losers in the VC portfolio but it's money made, not a smouldering crater of losses.
And after going public, return from IPO is 9.4% compounded. Price is 40% below all time high in October 25 but hey that's a harsh criterion for a long term investment.
The reason why I think it's a good point of comparison is that there's no moat, plenty of competition, heavily subsidised for years by literally burning cash, now seemingly profitable and a reasonably sane PE ratio of 17.
Of course one difference is that a major cost item for LLM companies is building genuinely new, cutting edge engineering/science products whereas for Uber, I never understood why they need the 1000s of technical staff to deliver a taxi app.
I don't know about the ins and outs of the business models of either LLM providers or Uber but keen to hear from people who have insights.
We run monthly. It seems like every few months there's a reason to swap in/out of a particular vendor. Specifically, I use all 3 pro, then either chat or claude will have a 5x max depending on whether they're the good thing to be using during those 3 months.
>As long as Codex remains so affordable and useful they do not have to slash prices, just keep Codex usable.
I imagine they track usage and can see whether their habitual users are switching to something else and aren't going to slash prices 'for the hell of it'.
just look at public stats on openrouter (obviously not indicative of first party app usage or direct api usage, but there's a huge difference between these graphs): https://openrouter.ai/openai https://openrouter.ai/anthropic
At the right price, these models don't need to be the best, good enough will do. I think we're fast approaching good enough for most users.
This. Here's a quick experiment I did yesterday.
I got a new $20 Claude subscription to try the new Fable model. I gave it a single prompt, and it barely finished, using up my whole session quota (it was at ~95% when it finished) and 10% of my weekly quota.
For comparison, with the Kimi Code $40 subscription I can pretty much constantly run two/three agents in parallel for the whole week, and I never run out of quota. I can blindly throw it at anything and everything without worrying about hitting the limits. (And it's not exactly a cheap model to run -- it has 1 trillion parameters!)
Is Kimi as good as Claude? Of course not. But you don't need the absolute state-of-art for most things. If I don't have exceptionally difficult tasks it makes no sense to use it. Just throw Kimi at it, and even if it needs to run 2 or 3 times longer in the background I don't care, because I'm not running out of tokens there.
A word of caution on this.
I've tried this too, and was disappointed.
Kimi generally benchmarks at "a bit more intelligent than Sonnet Medium" levels[1] and I'd agree broadly with this assessment.
If you have adapted your coding to rely on the agentic style that is doable in Opus 4.7+ then you will find Kimi disappointing.
If you are using it in a more targeted way then it can work well.
[1] https://artificialanalysis.ai/agents/coding-agents?agents=cl...
Yes, I would agree with this.
I think it works best when you're using the agent in a more hands-on way with a targeted prompt. If you're obsessive about code quality like I am (so you thoroughly review and, when needed, reprompt or even rewrite what the agent does) then you'll be fine, but if you like to just throw a prompt at the wall and expect it to plan and execute the whole thing perfectly then you'll be disappointed.
A middle-ground trick one can use is to have Opus (or Fable now) plan the whole thing and get something cheaper like Kimi execute on it.
Not only that, it's easy to let ethics steer my choice as well. And at this point I suspect OpenAI will never earn my respect.
Yeah, that's how I feel too. I am totally fine with xHigh GPT 5.5 when it comes to coding.
OTOH, using the best is a competitive advantage when time = money. It's like giving your engineers a slow laptop because it's cheaper. It may be cheaper but not worth the cost.
> OTOH, using the best is a competitive advantage when time = money. It's like giving your engineers a slow laptop because it's cheaper. It may be cheaper but not worth the cost.
That doesn't imply giving your devs the best laptop makes any difference.
How much more productive will your devs be if you upgrade them from a 32GB RAM, 8-core laptop to a 768GB RAM 96-core threadripper?
In your analogy, Kimi may not be the 4-core celeron with 4GB of RAM, it's more like the 8-core AMD with 32GB of RAM.
768GB seems oddly specific for Kimi
> I don’t fully understand why OpenAI lacks this focus, as clearly identifying a target market is one of the first things you do with a business strategy
Resource curse: https://en.wikipedia.org/wiki/Resource_curse
Questionable whether the enterprise market really is the most lucrative. The biggest of big tech all have significant revenue from the consumer market. Compare Apple, Google, Meta, to IBM, Salesforce, ServiceNow.
Enterprise market is paying by token and using a lot of tokens. Consumer market is paying a subscription that they can't raise too high or they'll lose users to competition. Seems to me that the enterprise market scales a lot higher.
it's really not much compared to the amount they are spending on training. 100 developers at $200 per month is just $20000
$240,000
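For what it's worth, the two figures above look like the same seat count at different time scales: $20,000 is the monthly total and $240,000 the annual one. A quick sketch of the arithmetic, using only the numbers from the comments (the variable names are mine):

```python
# Numbers taken from the two comments above; names are illustrative.
developers = 100
price_per_seat_per_month = 200  # USD

monthly_revenue = developers * price_per_seat_per_month
annual_revenue = monthly_revenue * 12

print(f"monthly: ${monthly_revenue:,}")
print(f"annual:  ${annual_revenue:,}")
```

Either way the point stands that a few hundred seats is small next to training budgets; the disagreement is only about month versus year.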
Have you seen how many corporations are complaining and capping usage at 20-200 USD per developer per month? I doubt that will change much. Many are considering on-premise now.
Apple, sure. But Google and Meta are really advertising companies, whose income stream comes from enterprises, big and small.
Not to mention Google Cloud.
OpenAI actually never had a focus. Their VC pitch was: once the AI is good enough, it will find our business model. They've raised money on that.
With that said you are right, it seems OpenAI got numbed by ChatGPT's initial success and tried to be the go-to brand for consumers... which is Google's playground.
Meanwhile, Anthropic led the B2B market with a clever segmented approach, and got well-paying customers.
Because they gained a HUGE amount of “normal” users and I think they feel desperate to monetise that. It’s their potential massive edge on competition, they just haven’t found any way to realise it and I suspect they won’t.
They keep asking chatgpt how to monetize and it keeps giving slop answers?
> If so the question becomes when can they do so, or is this possibly a turning point where Anthropic keeps the crown to themselves for the foreseeable future.
This specific crown (Best Performing Model) appears to be made out of thorns: pay 100x more for maybe a 10% improvement in capabilities.
Not sure what the goal is, here.
It's simple I think - over time the price will go down. According to some analyses the price for equal intelligence declined 10-1000x per year, depending on the domain.
It probably won't be the same again but I still think we can bet on radically cheaper Mythos level intelligence in the future.
The benchmark is not everything, the LLMs have their “personality” and GPT is annoying AF.
Also, I don’t know about others, but I personally strongly dislike OpenAI’s leadership’s hypocrisy. I find them losing the race highly satisfying.
I don’t think Mythos/Fable matter in attracting customers. The typical use is not going to be on the most expensive model, especially with all its frustrating gotchas like refusing harmless prompts and forcing companies to have their data retained.
If OpenAI can offer an alternative to Opus but with better pricing, it will boost their revenue at Anthropic’s cost, in time for the IPO.
I have a really strong suspicion that there is something different about OAI prepaid tokens in the API vs elsewhere. I've been able to get away with spending less than $150/m on average while many peers are hitting 10x that.
I am curious how many on HN have manually configured their copilot install with a custom OAI token for 5.4/5.5. In my experience, the performance difference over the built in subscription models is immense. This setup tends to solve the problem so quickly and reliably that any desire to have it run while I'm asleep seems absolutely ridiculous. The performance is constant throughout the day and week.
I think what might be happening is that we are chasing the cost optimization rabbit a little bit too hard. Capability is a weird dimension to quantify. A weaker model is not weaker in a linear way. It's usually this incredibly tall brick wall of a discrete go/no-go. If the model can't do the task, it doesn't matter how cheap the tokens are. Something approaching the inverse is also largely true.
Focus on the capability (is this giving my customer what they want) instead of the cost, and you will likely find that the cost never reaches a threshold where you even begin to worry about it. Starting from a position of cost optimization tends to spiral into a dark place.
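One way to make the go/no-go point concrete is to look at expected cost per successful task instead of cost per token. This is only a sketch under a simplifying assumption (each attempt succeeds independently with a fixed probability), and every number in it is hypothetical, not a measurement of any real model:

```python
import math


def expected_cost_per_success(cost_per_attempt: float, success_rate: float) -> float:
    """Expected spend to get one success, assuming independent retries.

    With success probability p, the expected number of attempts is 1/p,
    so the expected cost is cost_per_attempt / p.
    """
    if success_rate <= 0:
        # The "brick wall": no number of cheap retries ever finishes the task.
        return math.inf
    return cost_per_attempt / success_rate


# Hypothetical models, for illustration only.
strong = expected_cost_per_success(cost_per_attempt=2.00, success_rate=0.9)
weak = expected_cost_per_success(cost_per_attempt=0.10, success_rate=0.02)
stuck = expected_cost_per_success(cost_per_attempt=0.01, success_rate=0.0)

print(f"strong model: ${strong:.2f} per success")
print(f"weak model:   ${weak:.2f} per success")
print(f"stuck model:  {stuck} per success")
```

Under these made-up numbers the model that is 20x cheaper per attempt ends up more expensive per finished task, and the one that cannot do the task at all has no finite cost, which is roughly the shape of the argument above.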
> any desire to have it run while I'm asleep seems absolutely ridiculous.
could that be the difference from your peers? :p (real question b/c if you brought it up you're probably seeing others do it)
The point I'm trying to make is the reason a lot of people are resorting to the 24/7 Ralph loops is because they're using weaker models that need an incredible number of attempts to make any progress. The Death Star has different game theoretic implications. You probably don't need it to be lasering entire planets while you sleep, assuming the laser system actually works as advertised. I've never had a copilot run that took so long that I had to get up from my PC. Maybe 10 minutes. What the hell can run for 24 hours and still converge in a meaningful way?
Competition is lovely. And ironically, OpenAI will probably get and keep lots of enterprise customers like Microsoft [1] that won’t accept anything less than ZDR.
[1] https://www.theverge.com/report/947575/microsoft-claude-fabl...
ZDR = Zero Data Retention
It's all speculation.
The reality is that Fable is a 2x price increase over the previous model.
GPT5.5 is a 2x price increase over the previous one. And after last week's reset, Codex is hungry for your sub allowance.
Everybody can see that the massive raises are not matching the revenue, at all.
It's a surprising headline. Yes, it does make sense to cut the price to gain market share, but it also makes sense to keep it at a sustainable level, which seems to not have been reached yet.
Fable is twice the size of Opus from what I gathered. So I'm not sure if 2x price translates to 2x profits as well.
Not sure about GPT but it seems plausible they've also been increasing the model size with recent releases. (Progressively training a bigger model and easing into a profitable price range for that model scale?)
it looks like all AI is following the same pattern as GPT-3 again, building bigger models to achieve better results.
Which is actually good news if it still works
I haven't used OpenAI for months, since they officially started supporting the warmongers, and I have to say not only do I not miss them, I barely think about them except for the "please come back" emails to my account, which I unfortunately haven't deleted yet.
This pre-IPO battle is very entertaining. Curious how it all ends
Bankruptcy? How much longer can they sell a dollar for 50c?
How long did AMZN do it for?
Amazon lost a cumulative 2.8 billion over their first 17 quarters.
So if you're asking about time, then amazon stopped a lot faster. OpenAI is 40 quarters old.
If you are asking about money, then amazon... also stopped a lot faster. OpenAI is losing money comparable to amazon's lifetime losses every quarter.
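Taking the figures in this comment at face value, the gap in burn rate can be put as a rough ratio. This is a back-of-the-envelope sketch in nominal dollars (no inflation adjustment), and the OpenAI figure is just the comment's own claim restated as a number, not a reported result:

```python
# Figures from the comment above; nothing here is independently sourced.
amazon_cumulative_loss = 2.8e9  # USD over the first 17 quarters
amazon_quarters = 17

amazon_avg_quarterly_loss = amazon_cumulative_loss / amazon_quarters

# Assumption restated from the comment: OpenAI loses about Amazon's
# entire cumulative loss every quarter.
openai_quarterly_loss = amazon_cumulative_loss

ratio = openai_quarterly_loss / amazon_avg_quarterly_loss

print(f"Amazon average loss per quarter: ${amazon_avg_quarterly_loss / 1e6:.0f}M")
print(f"Implied OpenAI/Amazon quarterly burn ratio: {ratio:.0f}x")
```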
I don’t think Amazon did that…?
Amazon was pretty famous for never actually posting a profit for their first ~10 years of operation
> Amazon was pretty famous for never actually posting a profit for their first ~10 years of operation
They were spending the profit from each user, not making a loss on each user.
It's a big difference.
To turn a profit all AMZN had to do was stop spending (and the consumers would not have been affected by the halting of spending).
For the AI providers, to turn a profit they have to raise the price.
It’s the same here. Inference alone is profitable. It’s the R&D cost of making a new model that drives up expenses.
Because they were constantly reinvesting profits...
Amazon would absolutely take a loss on certain products in order to dominate the category, squeeze out competitors and then bring the price back up. It's one of the reasons they're so dominant in general now. Also one of the reasons why Amazon Basics has basically everything that exists and they're usually at or near the top of their respective categories -- third-party sellers simply can't compete.
Amazon wasn't competing against open and free models that are starting to be good enough running on existing laptops.
OpenAI and Anthropic's moat is filling with cement faster than they can dig.
I have 128 GB of unified memory (M4 Max) and the user experience with local inference is still pretty bad. I'm so glad something like llama.cpp exists so I don't have to wrangle Python (which I hate), but OpenCode is entirely disrespectful of the KV-cache so I had to switch to Pi (but Pi is going relatively well actually).
Even so, I can't really run at hundreds of tokens per second which is practically table stakes for my work. Even if I did manage to run that fast, the model would probably be completely braindead and stomp all over the task.
Wish I could afford an M5 Max but I've been between jobs for months without even a single interview. Sucks to be a developer these days.
Try Kilocode with deepseek v4 (via API directly to deepseek, much cheaper than via kilo).
I have had very good results and compared to others it just costs pennies.
I use a setup similar to this https://github.com/ScotterMonk/AgentAutoFlow and switch between DeepSeek v4 and Flash depending on the task.
DeepSeek Flash v4 actually runs on 128GB systems (about 14 tok/sec). Antirez created a fabulous 2-bit quant and a highly tuned LLM server:
https://github.com/antirez/ds4
I do use DeepSeek, it's exceptionally cheap! Inference is slow though, and it's not particularly intelligent but the experience is better than local inference.
To a certain extent, but not completely. OpenAI and Anthropic are taking losses on their entire offering—that is a huge difference. Amazon, for example, has pumped its profits back into R&D for decades. What AI companies are doing right now is running the Uber playbook on an epic scale. In the US, there isn't much competition, so they can maintain a duopoly. But look at what happened in China: Uber collapsed and pulled out. Now, the entire world is facing competition from DeepSeek and Qwen at a fraction of the cost. According to a reliable Shenzhen source, they will halve their prices again by the end of this year using newer Huawei GPUs. The current 7nm chips are already bleeding OpenAI dry. By the end of this year, they will upgrade to 5nm, and by June of next year, 3nm. They don't even have to be better—just 95% as good at 1/20th of the price. I don't see OpenAI and Anthropic surviving much outside of America; they are likely staring down Groupon’s fate. You can research it yourself: China has no issues with electricity because they have a massive power surplus. This is why OpenAI and Anthropic are so scared right now. They must IPO by 2027, because after that, they will suffer the exact same fate as Groupon.
I do love the DeepSeek models, they're so incredibly cheap and for functionality that nears Sonnet. Weeks of heavy usage still lands squarely under $10 for me.
Compare that with how I pay $200 a month for Claude and am still hitting the limits with any sort of sustained usage. They even have a special usage limit for Sonnet to prevent you from using too much of that either.
I'm super frustrated with how slow DeepSeek is though. And it's not nearly ready to be unsupervised for long periods of time like Claude is. Just this morning I left Fable 5 unsupervised for about eight hours straight. Single turn. DeepSeek often gets even much shorter turns wrong, so I wouldn't trust it with anywhere near that length of time alone. Not to mention it'd get so much less done because of how slow it is.
Also, did you use an LLM to correct your grammar after you posted? Lol
> I do love the DeepSeek models, they're so incredibly cheap and for functionality that nears Sonnet.
> I'm super frustrated with how slow DeepSeek is though. And it's not nearly ready to be unsupervised for long periods of time like Claude is.
Tradeoffs ;). One thing I'm doing is to make my flows properly available on my phone, so I can run and supervise things wherever I may be.
All of SV subsidizes things to kill the competition all the time.
However, Amazon was not racking up debt the way these companies are. Both their behavior and financials were miles apart from these AI companies.
Subsidizing is indeed different than burning cash, for sure.
That's easy.
More tokens and bigger models pre-ipo to attract attention, limit everything post-ipo.
They did it before, will do it after.
Increasingly it looks like it will end with a bubble bursting. LLMs and AI will survive, like the internet survived the dotcom bubble. But OpenAI and Anthropic could just be today's AOL and Yahoo.
I hope it will also crash hardware pricing so it becomes economically feasible to run your own local model. Currently I don’t like where we are heading with the sabotaging of models because it’s “too dangerous”.
> I hope it will also crash hardware pricing so it becomes economically feasible to run your own local model.
Even if you don't acquire hardware to host local models, a hardware crash means that I should be able to rent the crashed hardware at just above the cost of electricity + bandwidth.
Like the way I can now, for $7/m, rent a VPS that can run my B2B webapp for a company with 10k users, I look forward to buying a timeshare on GPUs that let me pay $12/m for all-you-can-eat GPU.
I used to think of a bubble too.
However, I think actually that while it won't give the results expected (AI agents run the company, build all features, etc.), it will nevertheless become a developer tool like IDEs, something "you have to have".
It's here to stay but probably with more realistic expectations than some CEO/CTO are pushing for (agents for everything, nobody writes 1 LOC, self healing systems, etc).
So the market expectations will probably be resized, but these tools are here to stay. Be it for cybersecurity (from CVEs to cyber warfare) alone, that's already worth all the money they are throwing at it.
If cutting prices is the definition of the race, then it's a race to the bottom.
These moves will only accelerate it.
By socializing the losses of course.
They cut prices and get more customers who are going to move to the next vendor that cuts prices even more or when they jack up the prices again.
I am not complaining, I like my investor-subsidised tokens, but I don't see what these companies see as their end goal when it's becoming more and more possible to run a competent LLM locally (even with today's RAM prices).
I am surprised that there is no Claude or ChatGPT machine that I could buy, I feel like they should be opening up that model, but I guess subscriptions look better on balance sheets.
OpenAI seems like such a better product these days. I was someone who jumped to anthropic early on for claude code, but I find myself jumping in the other direction these days.
I completely don’t understand Anthropic’s pricing where you have to pay a monthly fee to access their crappy models and pay per use for access to their top model. If you’re going to go pay per use it should be actually pay per use.
This is the race to the bottom setup that will tank these companies in their attempts to IPO. They’re burning cash at current pricing and if a true price war breaks out the only way that ends is if either OpenAI or Anthropic blows up and goes away.
Right now OpenAI is looking like the one set up to fail here. They have lost momentum big time and are looking incredibly vulnerable.
LLMs are quickly becoming a commodity. In a decade, the only reason anyone won't be running free models locally will be for corporate oversight or regulatory needs, so the successful providers won't be the ones that make the best product, but the ones that make the most compliant product.
How does OpenAI plan to be profitable?
Economies of scale, optimization of models, hardware, energy infrastructure, data center construction and operation, etc. Stuff is currently relatively inefficient and there's lots of room for optimization. All the usual stuff.
Last man standing monopoly.
> Economies of scale,
"We lose money on every customer, but we'll make it up in volume" :-)
Ads of course
Is that even the plan, or is the plan to make a huge IPO while there's still hype and run off with the money?
I’d upgrade ChatGPT for the family but I have them in an enterprise plan so I can distribute custom GPTs easily and that’s incompatible. Ah well.
No moat!
Think about where any of them will be in 20 years
On device AI?
Lots of comparisons to eg. Amazon, and how both were burning money for ages.
Maybe the better comparison is Uber? I.e. a commoditised product (taxis on an app), burning money to directly subsidise and gain market share. I always thought it was utterly insane and a waste of money... But you'd be hard pressed to have not made money on Uber.
This is my understanding anyway. An LLM-generated summary suggests that anyone who invested pre-IPO got at least 8-10% annually compounded. Even Series G investors made 2.3x since then. It's not an Eldorado and has to make up for all the losers in the VC portfolio but it's money made, not a smouldering crater of losses.
And after going public, return from IPO is 9.4% compounded. Price is 40% below all time high in October 25 but hey that's a harsh criterion for a long term investment.
The reason why I think it's a good point of comparison is that there's no moat, plenty of competition, heavily subsidised for years by literally burning cash, now seemingly profitable and a reasonably sane PE ratio of 17.
Of course one difference is that a major cost item for LLM companies is building genuinely new, cutting edge engineering/science products whereas for Uber, I never understood why they need the 1000s of technical staff to deliver a taxi app.
I don't know about the ins and outs of the business models of either LLM providers or Uber but keen to hear from people who have insights.
It’s all about the equity.
Not sure why people are talking about revenue and profits. Sam & co are about to make ridiculous bank.
i wonder what % of Anthropic's subscriptions is annual vs monthly
We run monthly. It seems like every few months there's a reason to swap in/out of a particular vendor. Specifically, I use all 3 Pro plans, then either ChatGPT or Claude gets a 5x Max depending on which is the good thing to be using during those 3 months.
Wasn't this move meant to be delayed until after the IPO?
Yes, please, do it, slash the prices, ideally to zero, immediately!
That way you will lose money even faster and we can finally get rid of this nonsense even sooner.
I think it is time to start shorting.
Too bad codex still sucks. Anthropic could double their prices tomorrow and I'd probably still pay it.
And OpenAI's discounts are not reinforcing the idea that it is a quality product.
You're not a corporation. Corporations are price-sensitive, and the decision-makers are typically not the end users.
Is OpenAI bankrupt yet?
You and I like Claude because it's the best for coding, but the normies love ChatGPT. ChatGPT has a huge target market.
Nothing to do with price.
Claude actually works - unless OpenAI can do that it would make no difference if it was free.
It works unbelievably well actually - it’s truly amazing.
Codex also works. Before Opus 4.8 and Fable it wasn't very clear who had the best agentic model.
OpenAI is full of Trump/MAGA supporters and actively encourages using AI to kill people.
More than happy to watch them lose the global consumer market while they compete with Palantir for DoD contracts.