For trivial stuff, they might manage even now (though those can probably be solved algorithmically anyway). For more complex stuff, I believe they are not even scratching the surface: LLMs can't really sustain long, complex chains of thought and inference, which are essential for coming up with proofs, or even just for solving a sudoku. Which they can't do (and no, writing a program that was likely part of the training set and executing it doesn't count).
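To be concrete about the "solved algorithmically" part: sudoku falls to a dozen lines of plain backtracking, which is exactly why an LLM emitting such a program says nothing about its own reasoning. A minimal sketch (the names and structure here are my own, purely illustrative):

```python
# Classic backtracking sudoku solver. Grid is a 9x9 list of lists;
# 0 marks an empty cell.

def valid(grid, r, c, v):
    """Check whether placing v at (r, c) breaks a sudoku rule."""
    if v in grid[r]:                                  # row conflict
        return False
    if any(grid[i][c] == v for i in range(9)):        # column conflict
        return False
    br, bc = 3 * (r // 3), 3 * (c // 3)               # 3x3 box conflict
    return all(grid[br + i][bc + j] != v
               for i in range(3) for j in range(3))

def solve(grid):
    """Fill the grid in place; return True if a solution exists."""
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                for v in range(1, 10):
                    if valid(grid, r, c, v):
                        grid[r][c] = v
                        if solve(grid):
                            return True
                        grid[r][c] = 0                # backtrack
                return False                          # dead end
    return True                                       # no empty cells: solved
```

The point being: this brute-force search is mechanical, needs no insight, and runs in milliseconds; the hard part is holding a long chain of constraints in your head without it, which is precisely what the model can't do.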