
The Chatbot Arena leaderboard is a good test for vibes and style of response, but not much else. R1 did perform very well on objective benchmarks (coding, etc.), granted, but it was still inferior to the full o1 and o1-pro models.

It's still a very impressive feat, but it wasn't frontier-pushing.


