
My first reaction: wow, incredible.

My second reaction: still incredible, but noting that a C compiler is one of the most rigorously specified pieces of software out there. The spec is precise, the expected behavior is well-defined, and test cases are unambiguous.

I'm curious how well this translates to the kind of work most of us do day-to-day, where requirements are fuzzy, many edge cases are discovered on the go, and what we want to build is a moving target.




> C compiler is one of the most rigorously specified pieces of software out there

/me Laughs in "unspecified behavior."


There's undefined behavior, which is quite well specified. What do you mean by unspecified behavior? Do you have an example?


Undefined is absolutely clear in the spec.

Unspecified is whatever you want it to mean. I am also laughing, having never heard "unspecified" before.


Unspecified behaviour is defined in the glossary at the start of the spec and the term "unspecified" appears over a hundred times...

The C spec is certainly not formal or precise.

https://www.ralfj.de/blog/2020/12/14/provenance.html

Another example is that it's unclear from the standard if you can write malloc() in C.


Sure, but the point OP is making is that it's still more spec'd than most real-world problems.

You're welcome to try writing a C compiler and standard library doing no research other than reading the spec.

> My second reaction:

This is the key: the more you constrain the LLM, the better it will perform. At least that's my experience with Claude. When working with existing code, the better the code to begin with, the better Claude performs, while if the code has issues then Claude can end up spinning its wheels.


Yes, I think any codegen with a lot of tests and verification is more about “fitting” to the tests, like fitting an ML model. It’s model training, not coding.

But in a lot of programming we discover correctness as we go, which is one reason humans don’t completely exit the loop. We need to see and build tests as we go, giving them particular care and attention to ensure they test what matters.


The agent can obviously do that.


