
> They're not paying me to use it.

Of course they are.

> As long as the inference is not done at a loss.

If making money on inference alone were possible, there would be a dozen smaller providers taking the open-weights models and offering them as a service. But it seems that every provider is anchored at $20/month, so you can bet that none of them can go any lower.

> If making money on inference alone was possible, there would be a dozen different smaller providers who'd be taking the open weights models and offering that as service.

There are! Look through the provider list for any open model on https://openrouter.ai . For instance, DeepSeek 3.1 has a dozen providers. It would make no sense for them to offer inference below cost, because they have neither a moat nor branding to gain from subsidizing it.


> If making money on inference alone was possible

Maybe, but arguably a major reason you can't make money on inference right now is that the useful life of a model is too short to amortize its development costs over much time. There is so much investment in the field that everyone is developing new models, which shortens useful life in a competitive market, and everyone is simultaneously driving up the price of the inputs needed for developing models, which increases the costs that have to be amortized over that short life. Perversely, the AI bubble popping and resolving those issues may make profitability much easier for survivors with strong revenue streams.
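To make the amortization point concrete, here's a toy calculation. Every number (training cost, token volume, useful life) is invented for illustration, not a real figure:

```python
# Toy model: amortized training cost per million tokens served.
# All numbers are hypothetical, for illustration only.

def amortized_cost_per_mtok(training_cost_usd, tokens_served_per_month, useful_life_months):
    """Spread the training cost over every token served during the model's useful life."""
    total_tokens = tokens_served_per_month * useful_life_months
    return training_cost_usd / (total_tokens / 1e6)

# Same hypothetical $100M training run, same demand, different useful lives.
short_life = amortized_cost_per_mtok(100e6, 1e12, 6)    # superseded in 6 months
long_life  = amortized_cost_per_mtok(100e6, 1e12, 36)   # superseded in 3 years

print(f"6-month life:  ${short_life:.2f} per 1M tokens")
print(f"36-month life: ${long_life:.2f} per 1M tokens")
```

Cutting the useful life from 36 months to 6 multiplies the amortized cost per token by six, before any inference (GPU) costs are even counted.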


You need a certain level of batch parallelism to make inference efficient, but you also need enough capacity to handle request floods. Being a small provider is not easy.
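A toy illustration of the batching point (the cost and latency numbers are invented): a forward pass costs roughly the same whether it serves one request or a full batch, so per-request cost falls with batch size, but only if you have enough concurrent traffic to fill batches.

```python
# Toy model: GPU cost per request as a function of batch size.
# Hypothetical numbers: a batched forward pass takes a fixed amount of
# GPU time regardless of how many requests share it (up to hardware limits).

GPU_COST_PER_SECOND = 0.001   # assumed $/s for the instance
PASS_LATENCY_S = 0.05         # assumed time for one batched forward pass

def cost_per_request(batch_size):
    # Fixed pass cost shared across every request in the batch.
    return GPU_COST_PER_SECOND * PASS_LATENCY_S / batch_size

solo = cost_per_request(1)
batched = cost_per_request(32)
print(f"batch=1:  ${solo:.6f}/request")
print(f"batch=32: ${batched:.6f}/request")
```

A small provider with thin traffic runs near batch=1 economics while still needing headroom for request floods, which is why the business is hard at low volume.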

The open models suck. AWS hosts them for less than closed models cost, but no one uses them, because they suck.

It's not the open models that suck, it's the infrastructure around them. None of the current "open weights" providers have:

   - good tools for agentic workflows
   - tools for context management
   - infrastructure for input token caching
These are solvable without having to pay anything to OpenAI/Anthropic/Google.

Why would the open weights providers need their own tools for agentic workflows when you can just plug their OpenAI-compatible API URL into existing tools?
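That's the usual route: most existing tools only need a base URL, an API key, and a model name. A minimal sketch of the request such a tool sends to any OpenAI-compatible endpoint (the URL, key, and model name below are placeholders, not real values):

```python
import json
import urllib.request

# Placeholders: swap in any OpenAI-compatible provider's values.
BASE_URL = "https://example-provider.invalid/v1"
API_KEY = "sk-placeholder"

def build_chat_request(model, messages):
    """Build the standard POST /chat/completions request an existing tool would send."""
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps({"model": model, "messages": messages}).encode(),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )

req = build_chat_request("deepseek-v3.1", [{"role": "user", "content": "hello"}])
print(req.full_url)
```

Because the wire format is identical across providers, switching backends is a config change, not a tooling change.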

Also, there are many providers of open source models with caching (Moonshot AI, Groq, DeepSeek, FireWorks AI, MiniMax): https://openrouter.ai/docs/guides/best-practices/prompt-cach...
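A rough sketch of why prefix caching matters so much for agentic workloads, where every turn resends the same long context (the prices and the 90% cache discount below are made up; real rates vary per provider):

```python
# Toy model: input-token cost for an agent session, with and without prefix caching.
# Hypothetical pricing: $1.00 per 1M uncached input tokens, 90% discount on cache hits.
PRICE_PER_MTOK = 1.00
CACHE_DISCOUNT = 0.9

def turn_cost(prefix_tokens, new_tokens, cached):
    # The shared prefix is the only part eligible for the cache discount.
    prefix_rate = PRICE_PER_MTOK * (1 - CACHE_DISCOUNT) if cached else PRICE_PER_MTOK
    return (prefix_tokens * prefix_rate + new_tokens * PRICE_PER_MTOK) / 1e6

# 50-turn session: 100k-token shared prefix, 1k fresh tokens per turn.
uncached = sum(turn_cost(100_000, 1_000, cached=False) for _ in range(50))
cached   = sum(turn_cost(100_000, 1_000, cached=True) for _ in range(50))
print(f"without caching: ${uncached:.2f}")
print(f"with caching:    ${cached:.2f}")
```

Under these assumed numbers the session is roughly 9x cheaper with caching, which is why a provider without it can't compete on agentic workloads regardless of model quality.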


> when you can just plug their OpenAI-compatible API URL into existing tools?

Only the self-hosting diehards will bother with that. Those who want to compete with Claude Code, Gemini CLI, Codex et caterva will have to provide the whole package, and do it at a price point that is competitive even at low volumes - which is hard, because the big LLM providers are all subsidizing their offerings.


They do make money on inference.


