Link: How StrongDM’s AI team build serious software without even looking at the code
From Simon Willison's blog.
This is an interesting piece of work! A small team built a bunch of software in a short time by leveraging a whole lot of LLM assistance both for building their software and for validating it.
The main things that are new to me in this story are:
- Building digital twins of third-party services like Google Docs that you can test against without rate limits.
- Keeping validation scenarios away from the processes doing the building so that they can be “independently verified” by another agent (kind of like a holdout set in machine learning).
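The digital-twin idea from the first bullet can be sketched roughly like this: an in-memory stand-in that implements the same interface as the third-party service, so tests exercise it without network calls or rate limits. All names here are illustrative, since the post doesn't describe how StrongDM built theirs:

```python
# Hypothetical sketch of a "digital twin": an in-memory stand-in for a
# third-party service (here, a Google-Docs-like API) that tests can hit
# without network calls or rate limits.

class FakeDocsService:
    """In-memory twin of a document service's create/get/append calls."""

    def __init__(self):
        self._docs = {}
        self._next_id = 1

    def create(self, title):
        doc_id = str(self._next_id)
        self._next_id += 1
        self._docs[doc_id] = {"title": title, "body": ""}
        return doc_id

    def append(self, doc_id, text):
        if doc_id not in self._docs:
            raise KeyError(f"no such document: {doc_id}")
        self._docs[doc_id]["body"] += text

    def get(self, doc_id):
        return self._docs[doc_id]


# Code under test depends on the service interface, not the real API,
# so the twin can be swapped in during validation runs.
def publish_report(service, title, lines):
    doc_id = service.create(title)
    for line in lines:
        service.append(doc_id, line + "\n")
    return doc_id


twin = FakeDocsService()
doc_id = publish_report(twin, "Weekly report", ["all green"])
assert twin.get(doc_id)["body"] == "all green\n"
```

A real twin would presumably also need to mimic the service's failure modes (rate-limit errors, auth failures) to be useful for validation, which is one of the details I'd like to see more about.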
I think there are some interesting things in this writeup (though I'd love to see more details about how they build their digital twins), but I have some fundamental concerns as well.
Primarily: even if you keep your validation scenarios hidden, validation performed by an LLM is inherently probabilistic and could be hiding nasty surprises, especially security problems caused by weird adversarial inputs. Even though Claude can write pretty good code, is very fast, and sometimes catches things that I miss, I'm still not comfortable putting my name on software without a closely reviewed, deterministic validation step.
It also sounds like there are some easily avoidable code quality issues. I think bringing taste and good practices to a repo is an important part of your work as a senior engineer, and something that can pay dividends as agents follow the patterns that you've laid down.