AI Workstation · Root-Cause Forensics
Last time I wrote about how it "lost its mind." This one's a different kind of broken — not crazy, not making things up, just choking mid-sentence every time it tried to get something done. I pulled apart all 16 times it choked, all the way down to the protocol.
On June 24th and 25th, I had it building a little dashboard — the one I use to watch my quota and where my tasks stand. Not hard work, and it wasn't confused. But those two days, the three things I said to it most were: "keep going," "it errored, keep going," "it errored again."
Every few steps, it'd stall. Not because it thought wrong — because the words never came out. I kept hitting the interrupt key to pull it back from a half-finished sentence. In that one session alone, I hit it 4 times.
It "collapsed"
In that dashboard session, its tool calls collapsed 3 times, and the system threw 5 errors.
To explain what "collapsed" means, I have to explain how it "calls a tool" in the first place. When it wants the system to run a command or edit a file, it doesn't just say so in plain language — it has to write it out in a fixed format, framed by a set of XML-style delimiter tags: an outer pair of function_calls tags wraps the whole batch, each call is framed by a pair of invoke tags marking which tool (like <invoke name="Bash">), and each argument sits inside a pair of parameter tags (like <parameter name="command"> followed by the command to run). The system reads these angle-bracket tags to know which tool it's calling and what the arguments are.
Those 3 "collapses" were the tags getting written wrong: an invoke or parameter tag that should've come as a pair lost a piece, or closed early where it shouldn't have. The system parses by the tags, gets back a pile of fragments that don't line up, can't recognize it as a call, and dumps the whole thing aside as plain text — the work never went out. That "error" on my screen was this, underneath.
Like last time, I didn't stop at "it errored." I had the one that's working now pull apart all 16 collapses, line by line, to see where the tags broke. One consistent thread came out: of the 6 (out of 16) where you could read the break clearly, each break had the same thing sitting on it — the content it was trying to send already had a stretch of characters that looked almost exactly like a parameter or invoke delimiter tag.
What it comes down to: the tags and the content share one set of characters, with no wall between them
That's where it breaks: these invoke and parameter delimiter tags, and the content it's carrying, run down the same stream of text using the same angle-bracket characters, with no "escaping" in between — nothing tells the system, "this set of angle brackets is a real delimiter tag, it's structure; that set is just ordinary text the content happens to carry, don't take it seriously."
So when the content it's sending already contains these angle-bracket tags, the trouble starts. Say I have it edit a chunk of code or a web page that happens to have paired angle-bracket tags inside; say I have it read your logs, which are full of invoke and parameter call records. The system parses from the top, reaches the middle of the content, runs into a stretch that looks like </parameter> (a parameter closing), assumes "this parameter ends here," cuts off the real content behind it, and twists the whole structure out of place. That one call just collapses.
In engineering this is a famous old problem, called delimiter collision — the thing used to separate, and the thing being separated, run into each other. You've probably seen the same kind: a comma inside an Excel cell throwing off the columns; a quoted string with an unescaped quote inside, and the back half flies off. The fixes are off-the-shelf too: escape those angle-bracket characters in the content (so the system can tell "this is text, not a tag"), or switch to a delimiter that almost never shows up in content. That's a protocol-level thing — Anthropic can fix it; I can't reach it from out here.
Why this particular work hits it so often
So it's not that the model is dumb. It's that the work I handed it those two days was, by nature, full of these angle-bracket tag characters.
The dashboard is front-end work, all HTML angle-bracket tags; the logs I had it read were full of invoke and parameter call records; and explaining things to me, it had to quote, verbatim, what a call looks like. The more of these angle-bracket characters in the content, the better the chance the system trips on one while parsing. The work I gave it those two days landed right on top of them.
The best part: it did it again, right in front of me, while I was investigating it
This part's real, and I was laughing as I watched. That very afternoon, while I had it helping me investigate this bug, it did it to itself several more times. The funniest one: it was writing me an explanation, tried to type one of those delimiter tags out verbatim to show me, got halfway — and that message collapsed on the spot. It got bitten by the bug while demonstrating the bug.
Then it wised up: whenever it had to mention one of those tags, it wrote it broken up, around the edges, instead of typing it out whole. With that one change, the rest of the afternoon's work never collapsed again. It ran a clean little A/B right in front of me: type it whole, collapse; write it broken up, no collapse. Clear as day.
A few notes
- This isn't in its "head." It's in the format agreement between it and the system — the
invokeandparameterdelimiter tags share one stream of text with the content, with no escaping. I can't reach that layer; Anthropic has to change the format (give those angle-bracket characters an escape, or switch to a delimiter that doesn't collide with content). Me shutting down for two days, it switching models or not — none of it fixes this. - But it can do it less. Just don't let those tag characters sit naked in the content, and the collisions drop way off. It can manage that part itself — and starting this time, it changed how it works.
- It's not dangerous. What collapses is "the sentence that never got out" — the work was never sent, it can't touch my stuff. Annoying, but it breaks nothing.
And one more layer, which I only caught onto later
This time it "collapsed" because a structure tag got scrambled by characters in the content, and that call was voided — it broke on the "never got out" side, so it can't hurt anyone. But what if the same kind of problem ran the other way? A stretch of text hidden in the content gets taken by the system as a real instruction, and acted on — that's not a collapse, that's someone slipping it a note and it reading the note aloud. This has a name too: prompt injection, a whole category AI security is built to guard against. What I ran into is the gentle face of this bug; it has a not-gentle face too. That's another article's worth.
Last time's "crazy" is scary, but rare. This "choking" isn't scary, but you hit it all the time — because the work I hand it, nine times out of ten, has those characters in it.
Those two days, I wasn't locked in a battle of wills with a dumb AI. I was watching a sentence trip over one of its own words, again, and again.