I've had a lot of bad experiences trying to use AIs to develop my codebase. My attempts so far have been full of failures. But this week I had a... sort-of successful time.

What I really discovered, though, is that programming with an AI "assistant" is like having a first-year, over-eager programming student from some alien world, armed with a huge library of templates and a lot more time and energy to stitch them together. Its process for producing code isn't based on intent, so what you get is a template you're expected to modify.

The problem is, so many vibe-coders think they can code without understanding what the code does. The problem is, AIs don't understand what the code does either.

The Failures

Let's start with the failures:

  • I tried to get it to convert a TypeScript experimental decorator into TC39 format. It failed, dropping the entire body of the original decorator and leaving me with code that did nothing at all.
  • I was configuring a TypeScript plugin, and ClaudeAI hallucinated configuration details that don't exist. It never gave me a straight answer.
  • I asked ClaudeAI to help me find an Emacs library so that I could review, save, and name any recent RegExp search-and-replaces I did. It hallucinated a library that doesn't exist. No such library does what I asked. (Onto the Big Pile: maybe I should write one!)
  • I asked it to help me write configuration files for style-dictionary. The answers were terribly bad, "doesn't pass lint, doesn't look right" bad. It gave me configuration details that didn't exist or were misspelled, and a suggested code hack that recursed and blew up the stack.
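To see what that first failure involves, here is a sketch of the two decorator formats side by side (simplified signatures; the real TC39 context type is ClassMethodDecoratorContext, and the names here are my own illustration, not the author's code). The crucial point: in both formats the wrapper must delegate to the original method body, which is exactly what ClaudeAI dropped.

```typescript
// Legacy (experimentalDecorators) shape: (target, key, descriptor).
function loggedLegacy(target: object, key: string, descriptor: PropertyDescriptor) {
  const original = descriptor.value;
  descriptor.value = function (...args: unknown[]) {
    console.log(`calling ${key}`);
    return original.apply(this, args); // the original body still runs
  };
  return descriptor;
}

// TC39 (stage 3) shape: (originalMethod, context). The body arrives as the
// first argument; lose it and the decorator decorates nothing at all.
function loggedTC39<Args extends unknown[], Ret>(
  original: (...args: Args) => Ret,
  context: { name: string | symbol },
) {
  return function (this: unknown, ...args: Args): Ret {
    console.log(`calling ${String(context.name)}`);
    return original.apply(this, args); // the original body still runs
  };
}

// Manual application (roughly what `@loggedTC39` on a class method desugars to):
const greet = loggedTC39(
  (name: string) => `hello, ${name}`,
  { name: "greet" },
);
```

The conversion is mechanical but not textual: the descriptor-mutating style has to become a value-returning style, and the body has to survive the move.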

You might notice a trend with all of these: they are obscure. Emacs? TC39 Decorators? TypeScript Plugins? Style Dictionary? These are all things currently used (and especially developed) by small minorities of developers. The base of blog entries, GitHub projects, and documentation from which ClaudeAI might derive some statistically valid sampling is very small.

The "Success"

The codebase at work is locked hard into Patternfly 4, which is in no way written to be friendly to Web Components. Web Components are notoriously annoying to style because they lock your organization's style into a component, and they also make it hard for multi-tag HTML components (things like <table> or <form>) to be styled correctly, because the shadowDOM wants to protect its internals. You either end up re-implementing all of the table's components yourself to protect your styling, or you put all the styling into the top-level component and only allow outsiders to add stuff programmatically, through a very non-HTML-like interface of properties rather than via attributes. (Which is what we do, and it's okay! That turns tables into data grids, and we're a very data-heavy product.)
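The "properties, not attributes" pattern can be sketched like this (a minimal illustration with hypothetical names, not the actual work codebase): the table markup lives entirely inside the shadowDOM, and callers feed it data through a property setter instead of nesting rows as child elements.

```typescript
// Pure rendering step, kept separate so it can be tested without a DOM.
function renderRows(rows: string[][]): string {
  return rows
    .map(row => `<tr>${row.map(cell => `<td>${cell}</td>`).join("")}</tr>`)
    .join("");
}

// Browser-only shell, guarded so the module also loads outside a browser.
if (typeof HTMLElement !== "undefined") {
  class SketchDataGrid extends HTMLElement {
    constructor() {
      super();
      this.attachShadow({ mode: "open" });
    }
    // Data goes in through a property, not attributes or child markup; the
    // shadowDOM boundary keeps outside CSS away from the <td>s.
    set rows(value: string[][]) {
      this.shadowRoot!.innerHTML =
        `<style>td { padding: 0.5rem; border: 1px solid gray; }</style>` +
        `<table>${renderRows(value)}</table>`;
    }
  }
  customElements.define("sketch-data-grid", SketchDataGrid);
}
```

The trade-off described above is visible in the setter: the component owns every tag and every style rule, and the outside world gets a programmatic API instead of HTML composition.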

Upgrading to Patternfly 5 has been, to quote another Web Components guy I've spoken with, "like trying to get ink out of water with a spoon." But I've got this sort of hybrid effort that's mostly working, and this week I made good progress on some of the basic controls: button, divider, icons, that sort of thing.

My codebase was a mess, as early drafts are meant to be. So I pushed it to ClaudeAI as a project and, component by component, said, "Review this component. Improve the developer documentation, suggest name changes to make the API more readable and comprehensible, provide or improve .test and .story files, and extract and document the CSS parts and custom properties."

ClaudeAI did... okay. Sometimes good. Often terrible.

When I write fiction, which I do sometimes, I consider myself a miner. I write a ton of stuff, then figure out what the plot and story should be, and re-write it, mining the story as a seam in a big pile of chaff. And when we code, often the correct step is to take an existing codebase and modify it until it does what we want.

ClaudeAI provided the raw ore remarkably well. The first few revisions were laughably bad, but after several iterations of dialogue it started giving me web components that matched the template I wanted. I had to fix a lot of things, disagreed with it strongly on some name changes (just because a naming convention is popular doesn't mean it's the best one), and still had to go 'round time and again with functions that were never used and CSS properties that were never defined.

When I was working on the <divider>, an up-rated version of <hr> that allows for things like placing an icon in the middle of the bar, ClaudeAI "improved" the code base by stealing the entire thing wholesale from Patternfly Elements. And even though I didn't ask for it, ClaudeAI rewrote my Divider class to handle vertical writing systems like Japanese and Chinese.

On the other hand, it did recognize my excessively clever "only put a margin for the central decoration if it's present in the <slot>" trick and replaced it with a plain if statement. Which, fair. My version was excessively clever, and it revealed a gap in my knowledge about how lightDOM and shadowDOM slots interact. That gap has now been closed with a little education.
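The plain-if version looks something like this (a hedged reconstruction with hypothetical names, not the author's actual Divider): listen for the slot's slotchange event, and only apply the margin class when lightDOM content was actually assigned to the center slot.

```typescript
// The testable kernel: a slot "has a decoration" when at least one
// lightDOM node has been assigned to it.
function slotHasContent(assigned: ArrayLike<unknown>): boolean {
  return assigned.length > 0;
}

// Browser-only shell, guarded so the module also loads outside a browser.
if (typeof HTMLElement !== "undefined") {
  class SketchDivider extends HTMLElement {
    constructor() {
      super();
      const root = this.attachShadow({ mode: "open" });
      root.innerHTML = `
        <style>
          .rule { display: flex; align-items: center; }
          .rule hr { flex: 1; }
          .rule.decorated .center { margin: 0 0.5rem; }
        </style>
        <div class="rule"><hr><span class="center"><slot name="center"></slot></span><hr></div>`;
      const slot = root.querySelector("slot") as HTMLSlotElement;
      // slotchange fires whenever lightDOM nodes are assigned to or removed
      // from the slot -- exactly the lightDOM/shadowDOM interaction at issue.
      slot.addEventListener("slotchange", () => {
        root.querySelector(".rule")!.classList.toggle(
          "decorated",
          slotHasContent(slot.assignedNodes()), // the plain if, as a toggle
        );
      });
    }
  }
  customElements.define("sketch-divider", SketchDivider);
}
```

Nothing clever: the shadowDOM template never changes, and the only conditional lives in one event listener.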

Claude doesn't know anything

In the end, the only reason I have a coherent, consistent, and comprehensive code base is because I intend it to be coherent, consistent, and comprehensive. To make use of ClaudeAI to produce a code base other people might want to use, I had to be deeply knowledgeable about HTML, CSS, TypeScript, and Web Components. ClaudeAI's default attempts were florid, excessive, and broken. I had to edit a lot, but I was grateful to have a lot to edit.

But this is why ClaudeAI and "helpers" like it should be kept as far away from junior developers as possible. ClaudeAI cannot help you if you don't understand what you're trying to do. If you don't have experience and knowledge, you are not qualified to know if ClaudeAI's output is any good. And if you ask ClaudeAI to write the tests but don't know how to audit those, you're basically in a state of test collapse: the tests turn green because they agree with what the AI wrote, not because they match your intent.

And if you can't read the tests, well, then you have no idea if the code base accurately matches your intent.

Claude doesn't intend.

When you're handed a piece of code to review, you think, "Does this do what the other person intends it to do? If not, what mistakes did they make?"

Intent is what gives code conceptual coherence, even if it's broken. The other person had intent, an idea about how the task should be completed, how the code should work. The other person tried.

AIs have no intent. Not even in the video game sense of "That guy intends to shoot me!" Developers give game avatars apparent coherence by putting intent behind those behaviors. The best thing you can say about an AI is that its code creates conversations that seem to have intent, but the developer doesn't even intend that. All the code does, down deep, is stitch together elements of text that statistically match what you type in against a host of prepared contextual statistics that seem to lead to a conclusion.

But the AI didn't intend for that. Nobody did. It's an emergent property of statistics on how humans write stuff.

When you're reviewing AI-provided code, you cannot ask, "What was the intent?" Because there was no intent. Without intent, there's no coherence. There are just patterns gathered statistically, with a few random numbers tossed in to give the responses a "lifelike" quality and to give users a chance to say "Try again," roll the dice, and maybe come up with a better response the second time.

ClaudeAI Sonnet (the code version of their AI) has fed on a steady diet of GitHub projects. And lots of those projects are half-assed, or tutorials that leave out a lot, or student projects that aren't intended to be production-ready. "Vibe coding" is feeding off of those projects with only minimal intervention (and often minimal experience and education!). To get utility out of an AI you need years of experience in the thing you're asking the AI to help you with. Experience you simply will not get if you depend on AI.

Code-reviewing AI code is infuriating to experienced developers because that intent is lacking. You have to puzzle out "What the hell is this part about?" and sometimes, sometimes, AI can explain what that piece does, but it cannot tell you why it put it there in the first place.

After all, the statistics just said, "Yeah, that might look convincing. But it's not intended to."