
The paper is pretty vacuous IMO but there are at least a few reasons I think LLM testing is pretty nice:

* It’s actually easier to do TDD or black box testing with LLMs. Yes, the lazy approach is to feed it a function implementation and tell it to make a unit test. But you can instead feed it the function signature and a description of its behavior (which may be what you used to generate the implementation too!) and have it generate a unit test with no visibility into the implementation.

* Unit tests tend to have a lot of boilerplate sometimes, often not copy-pastable (e.g. Go table test cases), and LLMs can knock that out super quickly.

* Sometimes you do actually want to add a ton of unit tests even if they’re a little too implementation-focused. It’s a nice step towards later having actually-good tests, and some projects are so poorly tested and plagued with basic breakages/bugs that it’s worth slowing down feature development to keep things stable.

Personally I hate when people try to automate this stuff though, because it does trend towards junk. I find it better to treat writing tests with LLMs tactically, basically the same way you use them to write code.



>Unit tests tend to have a lot of boilerplate sometimes, often not copy-pastable

When people use LLMs to write code and find it helpful, it is invariably because the LLM is spewing boilerplate.

If you don't systematically eliminate boilerplate, the codebase eventually turns into an unmaintainable mess.

>Sometimes you do actually want to add a ton of unit tests even if they’re a little too implementation-focused.

Really? I'd consider this an antipattern.

>I find it better to treat writing tests with LLMs tactically

I find the prospect of using them to write production code / tests pretty depressing.

The best thing that can be said is that they will create lots of jobs with the mess they make.


Unit tests and implementation for something like "parse this well-defined file format" are perfect for AI: low scope, clear success criteria. Plenty of the production code I write is like "parse this well-defined file format".
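
As a minimal sketch of what "low scope, clear success criteria" can look like, assume a made-up line format: one `key=value` pair per line, with blank lines and `#` comments skipped. The format and the `ParseKV` name are hypothetical, chosen only to illustrate the point.

```go
package main

import (
	"fmt"
	"strings"
)

// ParseKV parses a hypothetical "key=value" line format: one pair per line,
// with blank lines and lines starting with '#' skipped.
func ParseKV(input string) (map[string]string, error) {
	out := make(map[string]string)
	for i, line := range strings.Split(input, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		// strings.Cut splits at the first '='; ok is false if there is none.
		key, value, ok := strings.Cut(line, "=")
		key = strings.TrimSpace(key)
		if !ok || key == "" {
			return nil, fmt.Errorf("line %d: expected key=value, got %q", i+1, line)
		}
		out[key] = strings.TrimSpace(value)
	}
	return out, nil
}

func main() {
	cfg, err := ParseKV("# comment\nname = demo\nport=8080\n")
	fmt.Println(cfg["name"], cfg["port"], err)
}
```

The success criteria fit in a few lines: valid input yields exactly the expected pairs, and a line with no `=` or an empty key yields an error that names the line.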



