
The agent orchestration point from vessenes is interesting - using faster, smaller models for routine tasks while reserving frontier models for complex reasoning.

In practice, I've found the economics work like this:

1. Code generation (boilerplate, tests, migrations) - smaller models are fine, and latency matters more than peak capability
2. Architecture decisions, debugging subtle issues - worth the cost of frontier models
3. Refactoring existing code - the model needs to "understand" before changing, so context and reasoning matter more

The 3B active parameters claim is the key unlock here. If this actually runs well on consumer hardware with reasonable context windows, it becomes the obvious choice for category 1 tasks. The question is whether the SWE-Bench numbers hold up for real-world "agent turn" scenarios where you're doing hundreds of small operations.
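The two-tier split above can be sketched as a simple task router. This is a minimal illustration, assuming a keyword heuristic and hypothetical model names ("small-3b-active", "frontier-large"); a real system would classify tasks far more robustly:

```python
# Sketch of a task router: cheap model for routine work, frontier
# model for reasoning-heavy tasks. Model names and the keyword
# heuristic are illustrative assumptions, not any real API.
ROUTINE = {"boilerplate", "tests", "migration", "scaffold"}
COMPLEX = {"architecture", "debug", "refactor", "design"}

def pick_model(task_description: str) -> str:
    words = set(task_description.lower().split())
    if words & COMPLEX:
        return "frontier-large"   # worth the cost for reasoning
    if words & ROUTINE:
        return "small-3b-active"  # latency matters more here
    return "frontier-large"       # default to capability when unsure

print(pick_model("generate boilerplate tests"))   # small-3b-active
print(pick_model("debug subtle race condition"))  # frontier-large
```

Defaulting the ambiguous case to the frontier model matches the economics described: misrouting a hard task to a weak model costs more in retries than the cheaper call saves.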





I find it really surprising that you’re fine with low end models for coding - I went through a lot of open-weights models, local and "local", and I consistently found the results underwhelming. The glm-4.7 was the smallest model I found to be somewhat reliable, but that’s a sizable 350b and stretches the definition of local-as-in-at-home.

You're replying to a bot, fyi :)

Nope! https://www.linkedin.com/in/philipsorensen

But as a non-native English speaker, I do use AI to help me formulate my thoughts more clearly. Maybe this is off-putting? :)


Yes, that's definitely a bad idea because the community picks up on it and dismisses the entire comment set as generated. Generated comments aren't allowed on HN, and readers are super-sensitive about this these days.

The non-native speaker point is understandable, of course, but you're much better off writing in your own voice, even if a few mistakes sneak in (who cares, that's fine!). Non-native speakers are more than welcome on HN.

https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...


Comment 1: https://news.ycombinator.com/item?id=46873799 2026-02-03T17:12:55 1770138775

Comment 2: https://news.ycombinator.com/item?id=46873809 2026-02-03T17:13:40 1770138820

Comment 3: https://news.ycombinator.com/item?id=46873820 2026-02-03T17:14:25 1770138865

All detailed comments in different threads posted exactly 45 seconds apart, unless the HN timestamps aren't accurate.
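The 45-second spacing can be checked directly from the Unix timestamps quoted above:

```python
# Unix timestamps from the three HN comments listed above
timestamps = [1770138775, 1770138820, 1770138865]

# Gap between each consecutive pair of posts, in seconds
gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
print(gaps)  # [45, 45]
```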

That's very impressive if the account isn't posting generated comments, even with speech-to-text via AI. I'll leave it at that.


Appreciate it! I should clarify that it's not just grammatical. I find that AI can sometimes help me articulate ideas based on my thoughts in ways that I hadn't even considered.

Ok, but please don't do it anymore. It's not what we want here, and it will lead to an increasingly hostile reception from HN users. The community here feels very strongly about reserving the space for human-to-human interaction, discussion, thought, language, etc.

If it weren't for the single em-dash (really an en-dash, used as if it were an em-dash), how would I know?

And at the end of the day, does it matter?


Some people reply for their own happiness, some reply to communicate with another person. The AI won't remember or care about the reply.

"Is they key unlock here"

Yeah, that hits different.


