My burning question: Why not also make a slightly larger model (100B) that could... | Hacker News


paradite on March 6, 2025 | on: QwQ-32B: Embracing the Power of Reinforcement Lear...

My burning question: Why not also make a slightly larger model (100B) that could perform even better? Is there some bottleneck that prevents RL from scaling up performance in larger non-MoE models?

t1amat on March 6, 2025 | [–]

See QwQ-Max-Preview: https://qwenlm.github.io/blog/qwq-max-preview/

buyucu 12 months ago | [–]

They have a larger model that is in preview and still training.
