So… what about AI?
Note: Everything I say here is my own opinion, not a statement of fact.
It’s now been a couple of months since I changed teams: I stopped working on the C++ analyzer and started working on something more AI-oriented (hardly shocking, seeing how all companies are positioning themselves nowadays).
The idea is to “help agents” do better work, not only by finding issues when scanning code in CI, but also by “shifting left” into the agent itself, similar to how SonarQube works in IDEs.
So it kind of makes sense to use AI daily (which I wasn’t doing on the CFamily analyzer) to identify gaps, common failure modes, etc.
I am not even going to try to predict what the future of the software developer’s role will look like; I have no idea where we are on the curve.
While I was on CFamily I was considerably more conservative about what this technology could do, maybe because models are not that good at C++ or maybe because I know the language better. But I have to admit I’ve been surprised by their capabilities in languages such as Python or JavaScript. Of course, I know considerably less about Python and even less about JavaScript, so that may explain it. They definitely helped me put together a report-generation tool combining Python, JavaScript, and HTML in a matter of minutes, something that, given my so-so knowledge of the latter two, would have taken me hours, probably days.
That said, I also have the impression that, left to their own devices, agents will create a maintainability nightmare. If you give them one task, they tend to be surprisingly good at it. When you need to iterate on the initial requirement, though, they can get lost in their own maze of code: I’ve seen them flail like headless chickens trying to fit in what should have been a trivial algorithmic change. I had to throw everything away and explain in more detail how to split the initial requirement so it could be more easily extended later. Sure enough, when I then “prompted” for the new requirement, the agent could make the change in seconds with little effort, because the foundation was more solid and easier to build upon.
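To make that concrete, here’s a minimal, hypothetical sketch of what I mean by “splitting the requirement” - this is not the actual code from that session, and all the names are made up - just the shape of a foundation that an agent (or a human) can extend cheaply:

```python
# Hypothetical sketch, not the real tool: parsing, aggregation, and
# rendering are kept apart so a new requirement touches one function.
from collections import defaultdict


def parse_rows(lines):
    """Turn raw 'category,amount' lines into (category, float) pairs."""
    for line in lines:
        category, amount = line.strip().split(",")
        yield category, float(amount)


def summarize(rows):
    """Aggregate amounts per category; knows nothing about input or output."""
    totals = defaultdict(float)
    for category, amount in rows:
        totals[category] += amount
    return dict(totals)


def render_text(totals):
    """Rendering is isolated here: a later 'emit HTML instead' request
    becomes a new render function, not a rewrite of the whole pipeline."""
    return "\n".join(f"{cat}: {total:.2f}" for cat, total in sorted(totals.items()))


if __name__ == "__main__":
    data = ["food,12.5", "rent,800", "food,3.2"]
    print(render_text(summarize(parse_rows(data))))
```

With the stages separated like this, the “trivial algorithmic change” really is trivial; when everything is fused into one function, the agent has to untangle the whole thing first, and that’s where it gets lost.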
I struggle to understand how people can claim to let agents run for hours. I hardly let them run for more than five minutes, because the code tends to be in very bad shape if I let them go longer. As I said, if I don’t lay the groundwork for a better design upfront, the result becomes a nightmare of branches and loops, usually hard even for an LLM to grasp when it has to be modified. Not to mention the logical duplicates sprinkled everywhere.
What I have found to work much better is using Plan mode, usually with a beefier model such as Opus, to define a plan of action that I can read, clarify, and iterate on. Once the plan looks good enough, I let another model - normally Sonnet, and maybe not even the thinking variant - follow the steps. Usually this results in a much more reasonable set of changes. And yes, I look at the diffs - maybe not in extreme detail, but I do - and I let the agent know when I disagree with a change.
I won’t deny there’s a productivity increase, since I don’t have to type the code line by line. But it’s hardly a 10x increase, not if you care even a little about what’s being done. Even less so if you need to wait for PR reviews or validate what you’re doing (e.g., run experiments). And don’t get me started on priority changes. Dumping lines of code, as many have said, was never the productivity bottleneck. Personally, I find the improvement to be maybe in the low double-digit percent range.
Honestly, though, for one-offs it is indeed amazing.
Lastly, I sometimes miss “typing” the code myself. I fear I’m letting my brain become slower and duller in this respect. I risk paying less attention, or even losing the ability to spot bad “decisions” made by the agent.
Is the percentage increase I’m seeing now sustainable? Will I just adapt, or will I become dumber and lose the productivity gains? Will the models become good enough that this won’t matter? Will I be unemployed and unemployable, or will software engineers still be needed? I have no clue.
I lean toward this being a useful tool - no arguing with that - and toward it not going away anytime soon. It allows you to talk to a computer in natural language, and I think its ability to deal with fuzzy tasks is quite valuable. I doubt it can do everything people claim it can, especially since those who tend to make such claims often have money riding on them.
But time will tell. For the moment, I’m in the middle camp: I don’t think it’s BS, nor do I think it’s all-powerful.
We’ll see how this post ages.