If making money on inference alone was possible, there would be a dozen different smaller providers who'd be taking the open weights models and offering that as service. But it seems that every provider is anchored at $20/month, so you can bet that none of them can go any lower.
> If making money on inference alone was possible, there would be a dozen different smaller providers who'd be taking the open weights models and offering that as service.
There are! Look through the provider list for some open model on https://openrouter.ai . For instance, DeepSeek 3.1 has a dozen providers. It would not make any sense to offer those below cost because you have neither moat nor branding.
Maybe, but arguably a major reason you can't make money on inference right now is that the useful life of models is too short, so you can't amortize the development costs across much time because there is so much investment in the field that everyone is developing new models (shortening useful life in a competitive market) and everyone is simultaneously driving up the costs of inputs needed for developing models (increasing the costs that have to be amortized over the short useful life). Perversely, the AI bubble popping and resolving those issues may make profitability much easier for the survivors that have strong revenue streams.
You need a certain level of batch parallelism to make inference efficient, but you also need enough capacity to handle request floods. Being a small provider is not easy.
Why would the open weights providers need their own tools for agentic workflows when you can just plug their OpenAI-compatible API URL into existing tools?
> when you can just plug their OpenAI-compatible API URL into existing tools?
Only the self-hosting diehards will bother with that. Those that want to compete with Claude Code, Gemini CLI, Codex et caterva will have to provide the whole package and do it a price point that is competitive even with low volumes - which is hard to do because the big LLM providers are all subsidizing their offerings.
Of course they are.
> As long as the inference is not done at a loss.
If making money on inference alone was possible, there would be a dozen different smaller providers who'd be taking the open weights models and offering that as service. But it seems that every provider is anchored at $20/month, so you can bet that none of them can go any lower.