Twelve OpenClaw Copies Later: When Paths and Root Directories Become Risk


OpenClaw path governance notes

One afternoon, with nothing better to do, I typed out a find command just to see how many root directories in my home folder carried the openclaw keyword. A long list slowly scrolled up the screen. I counted: a full twelve.

It wasn't surprise in that moment — more like a daze. I knew clearly which ones were actually running in production — one hosting the virtual company, one hosting Jiyanran (the voice workbench agent) — two in total. The remaining ten each still carried the openclaw name, still had what looked like a complete directory structure, still had config files, still had .env, still had tokens, still had a README I'd written by hand at some point in the past. They were all still there, quietly taking up disk space, not broken, and not in use.

The problem isn't "I accumulated 10 piles of garbage." The problem is that of those 10 copies, several used to be the authoritative root. The moment they were replaced by a new version, they didn't automatically vanish from disk; and I never went back to delete them, because deleting each one requires first confirming "no caller currently points at it," and that confirmation itself is a hassle. So they stayed, and over time the pile grew to twelve.

I used to think that the biggest risk in an AI project was new features failing to run, the model going haywire, or a botched prompt. After doing this long enough, I realized that for a project that's lived past a certain age, the biggest risk is that old copies are still running: you don't know which one is real, the AI doesn't know, and the callers know even less. Every copy looks like the authoritative root, every copy's config looks usable, every token inside every copy is still within its validity window. This isn't a garbage problem; it's a canonical-source problem: nothing on disk says which copy is the authoritative source of truth.

Of 12 root directories, only 2 are authoritative roots; the other 10 are historical remnants

How the copies accumulated

I later sorted the twelve by type and realized they hadn't piled up in one shot; I'd been digging them out shovel by shovel over the past year. Each shovelful felt necessary at the time, and I never bothered to fill the hole back in. That's how I ended up where I am today.

The first 2 are the authoritative roots: one is the production root of the virtual company, hosting approvals, scheduling, and agent configuration; the other is Jiyanran's production root, hosting her own MCP (Model Context Protocol) service, voice workbench, and local state. These two are what's actually running, and all valid traffic lands here. They themselves are fine.

On a side note — the "OpenClaw" keyword on my machine covers far more than these 12 roots. If I broaden the search from root directories to all .openclaw* / .clawdbot / .clawai series prefixes, plus all sorts of subdirectories carrying the openclaw keyword, config caches, runtime logs, agent identity files — the whole machine surfaces 13 independent directories and several hundred related files. This time I'm only looking at the root-directory layer, because in governance terms the root is the "foundation" — everything else attaches to it. Once the foundation is sorted, the rest follows.

The problem lies with the 10 below.

The 3rd is a legacy main-config remnant: it used to be the authoritative root, and after being replaced by a new version it just sat there untouched. Its directory name is almost identical to the current production root, just missing a suffix; its .env still holds a token grabbed at some point last year; its scheduler config file is still alive, just unused. This is the most dangerous class of copy, because it most closely resembles the authoritative root.

The 4th is a typo remnant: one day my hand slipped and I typed openclaw as openclag. Hitting enter created an empty directory, and later I casually stuffed a few test files inside. This kind of remnant is the easiest to identify because the name itself is wrong; but it's also the easiest to overlook, because I myself had forgotten it existed.

The 5th is an experiment copy, sitting under some third-party platform's projects directory. I ran extension experiments on that platform for a while and copied the entire openclaw directory over for adaptation testing. When the experiment ended, the whole copy stayed put, along with its own config carrying real tokens.

The 6th is an audit copy. During one audit, to avoid polluting the production root, I copied the entire directory to an isolated folder on the Desktop to run audit scripts. The scripts ran, the audit report came out, but nobody reminded me to delete that copied-out copy, so it stayed too.

The 7th through 10th are 4 historical versions, each named with a version number — the v3_openclaw_agents format. They're "in case of rollback" backups I left behind during major version switches; after the switches stabilized, I never went back to clean them up. Each is still in place, with clear directory names, complete contents, and never opened again.

The 11th is an archive copy, buried deep inside an annual archive directory. During one disk cleanup I moved the entire openclaw main directory over wholesale. The point was to free up space, but after moving it I went and rebuilt a fresh set in the original location — because I "didn't trust the archive directory" — and so the archive zone also ended up with a complete copy.

The 12th is a compressed backup, a tar.gz file buried inside some cleanup-backup directory. I can't even remember when I packed this one; the filename carries a date, and opening it reveals a complete snapshot of some version from last year. The most annoying thing about tar.gz files is that, unlike directories, they don't show up in a find that only looks for directories; you have to deliberately add a *.tar.gz pattern to scan for them. In other words, halfway through governing a project, the easiest thing to miss is this kind of compressed copy. It has no directory structure on disk, its contents are invisible to ls and grep until it is unpacked, and it only gets remembered during some disk cleanup.
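A scan that covers both shapes of copy, directories and tarballs, can be sketched roughly as follows. This is a minimal illustration, not the command I actually ran: the function names are hypothetical, and it assumes gzip-compressed tarballs with a .tar.gz or .tgz suffix.

```python
import tarfile
from pathlib import Path

KEYWORD = "openclaw"  # the keyword used in the scan described above


def archive_mentions(archive: Path, keyword: str) -> bool:
    """True if any member name inside the tarball contains the keyword."""
    try:
        with tarfile.open(archive, "r:gz") as tar:
            return any(keyword in name.lower() for name in tar.getnames())
    except (tarfile.TarError, OSError, EOFError):
        return False  # unreadable, truncated, or not really a gzip tarball


def find_copies(base: Path, keyword: str = KEYWORD) -> dict[str, list[Path]]:
    """Return keyword-matching directories and tarballs found under base."""
    found: dict[str, list[Path]] = {"directories": [], "archives": []}
    for path in sorted(base.rglob("*")):
        if path.is_symlink():
            continue  # report real locations only, not links to them
        if path.is_dir() and keyword in path.name.lower():
            found["directories"].append(path)
        elif path.is_file() and path.name.endswith((".tar.gz", ".tgz")):
            # A tarball counts if its own name or any member name matches.
            if keyword in path.name.lower() or archive_mentions(path, keyword):
                found["archives"].append(path)
    return found
```

Two caveats: nested matches (a keyword directory inside another keyword directory) are listed as well, so the result still needs a human pass to pick out the roots; and a name-based scan will never find the typo remnant, because openclag does not contain the keyword.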

Counting through these 10 one by one, I realized: every class of copy had a reasonable rationale at the time. Backup, audit, experiment, rollback, archive, fat-finger — none of the rationales is wrong. What was wrong is that every time I left a copy behind, I never bothered to do one thing — tag it, stating what kind it is, when it expires, and who's responsible for cleaning it. So they stayed in the posture of "I'll deal with this later," and "later" never came.

The copy isn't dangerous — the copy used as authoritative is

If these 10 copies were just quietly occupying disk, there wouldn't be much of a problem. Ten old directories accumulated over a year is nothing for a 4TB workstation.

The real risk is on the other side: the copy is being used as the authoritative root.

I just ran into two such cases recently. One is Jiyanran's MCP path — the config still points at the legacy .openclaw, that is, the 3rd one, the legacy main-config remnant. That path has been deprecated for a while, but because it was inherited from an old config and nobody re-verified it when the new production root went live, the MCP calls were still routing through the old path. On the surface everything looked normal, because the config file in the old directory was still there and still loadable; in reality what it was reading was expired agent settings. This is the classic shape of "a copy being mistakenly used by a new caller."

The other is the virtual company's scheduler. It does cross-root scheduling across Jiyanran and shared tools, meaning the scheduler in the company's root directory will reach into another root directory to trigger tasks. That's neutral by itself — cross-root scheduling isn't necessarily wrong — but the problem is that several of the paths the scheduler walks mix in paths from old copies. In other words, the company-side scheduler might, in some cases, be calling scripts inside a copy rather than the latest scripts in the authoritative root. This is the classic shape of "copies cross-referencing each other."

Neither incident caused an outage, but both let me see one thing: a copy being used as authoritative is a silent failure mode. It doesn't blow up one morning; it surfaces slowly as small puzzles: "I changed the setting, why didn't it take effect"; "I fixed the bug, why is it still reproducing"; "I rotated the token, why is the old token still working." Each puzzle in isolation isn't fatal; strung together, they trace a curve of governance slipping out of control.

Worse still, the AI can't tell either. When I send an AI agent off to do something, it follows the paths in the config file — it doesn't know which path is the authoritative root and which is a copy. As long as the path is valid, the directory exists, and the file is readable, it treats the material as real. At that point, the more obedient the AI, the wider the copy contamination spreads. The AI isn't the source of the problem, but it will faithfully amplify it.
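The check the agent cannot do for itself is cheap to do on its behalf, provided someone has written down which roots are authoritative. A minimal sketch, assuming two hand-maintained lists (the function name and the three labels are illustrative, not part of OpenClaw):

```python
from pathlib import Path


def classify(target: str, authoritative_roots: list[Path], known_copies: list[Path]) -> str:
    """Label a configured path as 'authoritative', 'copy', or 'unknown'."""
    resolved = Path(target).expanduser().resolve()
    for root in authoritative_roots:
        # is_relative_to compares whole path components (Python 3.9+),
        # so "openclaw" does not accidentally match "openclaw-prod".
        if resolved.is_relative_to(root.resolve()):
            return "authoritative"
    for copy in known_copies:
        if resolved.is_relative_to(copy.resolve()):
            return "copy"
    return "unknown"
```

Comparing by path component rather than by string prefix matters for the most dangerous case described earlier, where the legacy directory name differs from the production root only by a missing suffix. Treating "unknown" as a failure rather than a pass is the safer default.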

There's an even more insidious shape: copies calling each other. When the home directory simultaneously holds multiple sets of openclaw, each of which was the authoritative root at some point, each carries the dependency graph of its own era. Today I open a script in the 5th copy and it might import a tool from the 3rd copy; I open a config in the 7th copy and it might point at a template file in the 11th copy. These cross-references aren't something I designed; they're a side effect of historical layers stacking up. Every time I switched to a new version, I tried to cut it cleanly from the old, but "tried" isn't "completely": there are always a few edges left dangling. Looked at again a year later, those few dangling edges form a relationship diagram that makes your skin crawl.
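That relationship diagram can be drawn mechanically, at least for references written as absolute paths. The following is a rough sketch under that assumption; the suffix list and function names are illustrative, and references written with ~ or relative paths would be missed.

```python
import re
from pathlib import Path

TEXT_SUFFIXES = {".py", ".sh", ".json", ".yaml", ".yml", ".toml", ".md", ".txt"}


def is_text_candidate(file: Path) -> bool:
    # Path(".env").suffix is empty, so dotenv files are matched by name.
    return file.is_file() and (file.suffix in TEXT_SUFFIXES or file.name == ".env")


def cross_references(copies: dict[str, Path]) -> set[tuple[str, str]]:
    """Return (source, target) edges where a file in one copy names another copy's path."""
    # The lookahead stops "/x/openclaw" from matching inside "/x/openclaw-prod".
    patterns = {
        name: re.compile(re.escape(str(root)) + r"(?![\w.\-])")
        for name, root in copies.items()
    }
    edges: set[tuple[str, str]] = set()
    for source, root in copies.items():
        for file in root.rglob("*"):
            if not is_text_candidate(file):
                continue
            text = file.read_text(encoding="utf-8", errors="ignore")
            for target, pattern in patterns.items():
                if target != source and pattern.search(text):
                    edges.add((source, target))
    return edges
```

An edge such as ("copy5", "copy3") means some file under the 5th copy mentions the 3rd copy's path. It does not prove the reference is live, only that it needs a look.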

Five typical risks copies bring

Spreading out all the trouble I've had with these 12 copies over the past year, it groups into five categories of risk.

The first is the authoritative source being unclear. Which one is the authoritative root? A human can't always remember, the AI doesn't know, and callers only look at the path, not the semantics. When the home directory simultaneously holds 12 roots carrying the same keyword, "real" stops being the default state and requires active reconfirmation every time. The heaviest cognitive load on a project that's lived this long isn't new features — it's "is what I'm touching right now the authoritative root."

The second is credential proliferation. Every copy might have carried tokens, keys, .env files. They were once legitimate, once granted permissions, once actually usable. When the copy stays behind, the credentials stay with it. You think you only have 2 sets of production credentials to manage; in reality you have 12. The day some old token gets leaked, the trace will point back to some archive directory you've long forgotten — and the difficulty of fixing that kind of incident far exceeds "a live service has a bug."
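A first pass at counting how many credential sets really exist can be sketched as below. It is only an illustration: it assumes secrets live in files named .env with plain KEY=VALUE lines, and it records a short SHA-256 fingerprint rather than the secret itself, so the inventory can be shared without leaking anything.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def read_env(path: Path) -> dict[str, str]:
    """Parse simple KEY=VALUE lines; comments and blank lines are skipped."""
    pairs: dict[str, str] = {}
    for line in path.read_text(encoding="utf-8", errors="ignore").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        pairs[key.strip()] = value.strip().strip("'\"")
    return pairs


def credential_inventory(roots: list[Path]) -> dict[str, list[tuple[Path, str]]]:
    """Map a fingerprint of each secret value to every (file, key) where it appears."""
    seen: dict[str, list[tuple[Path, str]]] = defaultdict(list)
    for root in roots:
        for env_file in sorted(root.rglob(".env")):
            for key, value in read_env(env_file).items():
                if value:
                    digest = hashlib.sha256(value.encode()).hexdigest()[:12]
                    seen[digest].append((env_file, key))
    return dict(seen)
```

Any fingerprint that appears in more than one root is a token that has to be rotated once but retired in several places. A fingerprint that appears only in a copy is an old credential that may still be valid.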

The third is legacy dependency creep. The code inside copies still references some services that have already gone dead — an internal bridge port that was decommissioned long ago, a local SearXNG instance that no longer exists. These references don't normally appear on production paths, but the moment any single call drifts into a copy, it hits the dead dependency immediately. Error logs surface port numbers you can no longer remember the purpose of, and the debug chain has to be traced back half a year.

The fourth is audit invalidation. This is something I only recently worked out clearly. If the audit was run against a copy, the conclusion can't represent production; but if the audit didn't make clear which set it was running against, the conclusion produced still looks like a production conclusion. Audits exist so I can have more confidence in the system, but if the audit's starting point is a copy, it actually makes me more confident in the wrong state. That is the worst direction feedback can run in governance: the check that was supposed to expose a wrong state ends up endorsing it.

The fifth is a maintenance burden that grows faster than the number of copies. The more copies there are, the more each governance action (rotating a key, upgrading a dependency, changing a path convention) has to be multiplied by the number of copies. An upgrade on the 2 authoritative roots gets done in one afternoon; running through all 12 takes a full week, and you also have to judge one by one "should this set follow, and if not, will it leave a hidden risk." Maintenance burden doesn't grow linearly; it grows closer to quadratically, because any copy may hold an implicit reference to any of the others, and each such pair has to be checked.
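One way to put a rough number on this is to count the pairs of roots that could reference each other. This is a simplified model that assumes any root may point at any other; it says nothing about how many of those references actually exist.

```python
from math import comb


def pairs_to_check(roots: int) -> int:
    """Number of unordered pairs of roots that could reference each other."""
    return comb(roots, 2)


for n in (2, 12):
    print(f"{n} roots -> {pairs_to_check(n)} possible pairwise relationships")
# 2 roots -> 1 possible pairwise relationships
# 12 roots -> 66 possible pairwise relationships
```

Going from 2 roots to 12 multiplies the per-root work by 6 and the number of possible pairwise relationships by 66, which is closer to how the week actually felt.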

These five categories share one trait: none of them is technical risk; they're governance risk. Technical risk can be dissolved by writing better code; governance risk can only be solved by discipline — assigning every piece of material responsibility, boundaries, and a lifecycle. I used to spend ten times more time on the former than on the latter. Only this year did I realize that the hidden cost of governance far exceeds any single point of technical debt.

My handling: identify, stop the bleed, migrate, archive, delete

The easiest reaction is "fine, just delete them all in one go." I started out wanting to do exactly that, but quickly talked myself out of it — because I knew that once I really deleted them, I'd never have another chance to trace anything. Deleting everything outright converts a governance problem into a data-loss problem; it burns the ledger before the books are balanced.

So what I'm doing now is five steps: identify, stop the bleed, migrate, archive, delete. Each step has its own boundary; don't skip.

Step one is identify. Scan out every copy and tag each one: authoritative, legacy, experiment, audit-copy, archive. Tags aren't decoration; they're responsibility — once you slap on legacy, it means "this set is being phased out; new code may no longer reference it." Once you slap on archive, it means "this set is read-only; nobody writes into it anymore."
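A tag only carries responsibility if it is written down next to the thing it describes. One possible shape is a small marker file at the top of each root, holding the kind, the owner, and an expiry date. The file name .copy-tag.json and the field names are my own illustration here, not an OpenClaw convention.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass
from datetime import date
from pathlib import Path

TAGS = {"authoritative", "legacy", "experiment", "audit-copy", "archive"}
TAG_FILE = ".copy-tag.json"  # hypothetical marker file name


@dataclass
class CopyTag:
    kind: str       # one of TAGS
    owner: str      # who is responsible for cleaning this copy up
    expires: str    # ISO date after which the copy must be reviewed
    note: str = ""

    def __post_init__(self) -> None:
        if self.kind not in TAGS:
            raise ValueError(f"unknown tag: {self.kind!r}")
        date.fromisoformat(self.expires)  # raises ValueError on a malformed date


def write_tag(root: Path, tag: CopyTag) -> Path:
    """Write the tag as JSON at the top of the root and return the marker path."""
    marker = root / TAG_FILE
    marker.write_text(json.dumps(asdict(tag), indent=2) + "\n", encoding="utf-8")
    return marker


def read_tag(root: Path) -> CopyTag | None:
    """Return the tag for a root, or None if the root was never tagged."""
    marker = root / TAG_FILE
    if not marker.exists():
        return None
    return CopyTag(**json.loads(marker.read_text(encoding="utf-8")))
```

A root for which read_tag returns None is exactly the "I'll deal with this later" state, so an untagged root is worth reporting as a finding in its own right.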

Step two is stop the bleed. Confirm that no new code still references copy paths. This step means grepping through the codebase, the config files, the LaunchAgent plist files (the definitions of macOS background services), and cron. Anywhere still referencing a copy needs to be listed. Migrating without stopping the bleed is like rerouting a road while traffic is still running on the old one: you finish the reroute only to discover that half the convoy never made it onto the new road.
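The listing itself is a grep with a worklist as output. A minimal sketch, assuming references are written as absolute paths and the files to search are readable as UTF-8 text (the function name is hypothetical):

```python
import re
from pathlib import Path


def find_references(search_dirs: list[Path], copy_roots: list[Path]) -> list[tuple[Path, int, str]]:
    """Return (file, line number, line) for each line that names a copy path."""
    # The lookahead keeps "/x/openclaw" from matching "/x/openclaw-prod".
    patterns = [re.compile(re.escape(str(root)) + r"(?![\w.\-])") for root in copy_roots]
    hits: list[tuple[Path, int, str]] = []
    for directory in search_dirs:
        for file in sorted(directory.rglob("*")):
            if not file.is_file():
                continue
            try:
                lines = file.read_text(encoding="utf-8").splitlines()
            except (UnicodeDecodeError, OSError):
                continue  # binary or unreadable file
            for number, line in enumerate(lines, start=1):
                if any(p.search(line) for p in patterns):
                    hits.append((file, number, line.strip()))
    return hits
```

The bleed counts as stopped only when this list is empty for every place callers live, not when it is merely shorter. Cron is not a directory, so its entries would need to be dumped to a file first and searched the same way.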

Step three is migrate. Change every reference still pointing at a copy to point at the authoritative root. This step needs careful editing — run a regression after each change, especially for the core entry points: MCP paths, scheduler config, bridge calls. Once done, run a smoke test: trigger a full flow from the top-level entry and check that every call correctly lands on the authoritative root.

Step four is archive. After confirming zero references, move the copy to the archive zone. The archive zone is read-only, timestamped, and completely isolated from production paths. After moving, leave a README in the original location stating where this set went, when, and why. You can't just delete the original location — deleting outright will leave both the AI and me without leads.
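The move-and-leave-a-note part of this step can be sketched as follows. The layout (a timestamp prefix on the archived folder, a README.txt in the original location) is one plausible arrangement, not a description of an archive zone that already exists.

```python
import shutil
import stat
from datetime import datetime, timezone
from pathlib import Path


def archive_copy(copy_root: Path, archive_zone: Path, reason: str) -> Path:
    """Move a copy into a timestamped archive folder and leave a README behind."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    destination = archive_zone / f"{stamp}_{copy_root.name}"
    archive_zone.mkdir(parents=True, exist_ok=True)
    shutil.move(str(copy_root), str(destination))

    # Strip write permission from everything that was moved.
    no_write = ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH)
    for path in [destination, *destination.rglob("*")]:
        if not path.is_symlink():
            path.chmod(path.stat().st_mode & no_write)

    # Recreate the original location with only a pointer to the archive.
    copy_root.mkdir()
    (copy_root / "README.txt").write_text(
        f"Moved to: {destination}\nWhen: {stamp}\nWhy: {reason}\n", encoding="utf-8"
    )
    return destination
```

Removing write permission is a guard against accidents, not a security boundary: the owner can always chmod it back, and will have to before the final delete.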

Step five is delete. After 3 months in the archive zone, with no person and no caller having come looking for it, then really delete. The 3 months isn't a guess — it's something I observed from my own work patterns: a dead reference will, at the latest, be triggered within 3 months by some regression test, some audit, or some "huh, where did that thing go" question. Three months without incident is basically proof that it's truly no longer needed.
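The observation clock is simple enough to write down so that nobody has to remember it. In this sketch, 90 days stands in for "3 months", and a request for the copy restarts the clock; both are assumptions of mine, not rules stated above.

```python
from __future__ import annotations

from datetime import date, timedelta

OBSERVATION = timedelta(days=90)  # approximation of the 3-month window


def earliest_delete_date(archived: date, last_request: date | None = None) -> date:
    """Date on which the copy becomes eligible for deletion."""
    # Assumption: anyone coming to look for the copy restarts the clock.
    clock_start = archived if last_request is None else max(archived, last_request)
    return clock_start + OBSERVATION


def may_delete(archived: date, today: date, last_request: date | None = None) -> bool:
    """True once the full observation window has passed without a request."""
    return today >= earliest_delete_date(archived, last_request)
```

For a copy archived on 2025-01-01 with no requests, may_delete first returns True on 2025-04-01. A single request on 2025-02-01 pushes that to 2025-05-02.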

The five steps are slow, but the irreversible action sits at the very end. The first four steps are all reversible: tags can be changed, references can be rolled back, archives can be moved back. Only step five, delete, is irreversible, so it comes only after the previous four steps are complete and time has proven them out.

These five steps also have a hidden design: they force me to separate "governance intent" from "governance action." Identify and tag is intent — I first declare "this is how I plan to dispose of this set," then use the next four steps to turn intent into action. The benefit of separating intent and action is that the AI can also come in and help. I can have the AI grep references, have the AI run regressions, have the AI move things to archive — but only after the tags are set. Tagging is a human responsibility, not the AI's. The AI can't decide for me "does this count as legacy or archive," but once I've decided, the AI can handle most of the execution.

Why I don't just delete everything: copies are evidence, not garbage

I later realized that my attitude toward these 12 copies determines how I understand the entire project.

If I see them as garbage, the answer is simple: one rm -rf wraps it up, frees up tens of GB of disk space, makes the desktop tidy. But if I see them as evidence, things look completely different. Every copy is the material evidence of a stretch of history: the 3rd tells me "this is what the production root used to look like"; the 5th tells me "I once ran this kind of experiment on a third-party platform"; the 7th through 10th tell me "I once did 4 major version switches, each leaving behind a complete rollback snapshot"; the 12th tells me "one day last year I was uneasy enough about the system state to pack a complete tarball."

What is this evidence good for? It's useful in three scenarios.

The first is tracing. The day you find a strange token being used in the wild and need to go back to what project that token was provisioned for, which version it was generated in — the copy is the only source that can answer that question. The production root rotated this token long ago; it has no memory of its past.

The second is rollback. After a new version launches, some agent's behavior gets weird. Was it introduced by the new version? The complete old-version state preserved in the copy can be pulled out for A/B comparison, and the cause can be localized in minutes. If all the copies had been deleted, the same comparison would require rebuilding the runtime environment from git history, which is days of work.

The third is audit credibility. External audits often ask "when did you start doing it this way," "what did the previous version's design look like." This kind of question can't be answered from memory; it requires material evidence. A timestamped archive copy is the cleanest answer in audit terms.

So deleting everything outright is essentially trading evidence for space: a tidy disk now, in exchange for being struck mute in future tracing, rollback, and audits. That trade looks worthwhile when the home directory is small; it stops looking worthwhile once the project has lived past a certain age.

Categorized retention is what real governance looks like. It requires me to admit one thing: copies and production roots are two different kinds of entity, and they need different treatment. The production root needs to keep traffic flowing, needs ongoing maintenance, needs strict control over who can change it; the copy needs to be tagged, cut off from its callers, frozen, and retired at the right moment. Lumped together, copies are garbage; examined one by one, they are the project's own archaeological layers.

Where I am now: half tagged, not a single one really deleted

Writing all these principles down is easy; doing them is slow.

As of today, the status of these 12 roots is this: of the 10 outside the 2 production roots, I've tagged 5 as legacy (the legacy main-config remnant, the typo remnant, 3 of the historical versions), 3 as archive (the annual archive copy, the compressed backup, and the 1 historical version worth retaining because of the size of its changes), and 2 are still under review (the experiment copy and the audit copy, which need confirmation that all their tokens have been fully rotated).

Stopping the bleed is only half done. I've grep'd through the codebase and grep'd through config files, but the LaunchAgent and plist layer hasn't been cleanly scanned yet — that's the easiest layer to miss on macOS, because they hide under both user and system directories, and the naming conventions aren't unified.
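For the plist layer, a text grep is not quite enough, because plists can be stored in binary form. Python's plistlib reads both the XML and binary formats, so a scan can be sketched like this. The directories to pass in are an assumption on my part: on macOS the usual candidates are ~/Library/LaunchAgents, /Library/LaunchAgents, and /Library/LaunchDaemons.

```python
import plistlib
import re
from pathlib import Path


def strings_in(value) -> list[str]:
    """Flatten every string found in a nested plist value."""
    if isinstance(value, str):
        return [value]
    if isinstance(value, dict):
        return [s for v in value.values() for s in strings_in(v)]
    if isinstance(value, (list, tuple)):
        return [s for v in value for s in strings_in(v)]
    return []


def plists_referencing(plist_dirs: list[Path], copy_roots: list[Path]) -> list[tuple[Path, str]]:
    """Return (plist file, offending string) for every plist that names a copy path."""
    patterns = [re.compile(re.escape(str(root)) + r"(?![\w.\-])") for root in copy_roots]
    hits: list[tuple[Path, str]] = []
    for directory in plist_dirs:
        if not directory.is_dir():
            continue
        for plist in sorted(directory.glob("*.plist")):
            try:
                with plist.open("rb") as fh:
                    data = plistlib.load(fh)
            except Exception:
                continue  # skip unreadable or malformed plists in this sketch
            for text in strings_in(data):
                if any(p.search(text) for p in patterns):
                    hits.append((plist, text))
    return hits
```

Typical places for a stale path to hide are the ProgramArguments, WorkingDirectory, and log path entries of a job definition. The sketch does not single those keys out; it checks every string in the file.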

The migration effort is still queued. I already know about Jiyanran's MCP path pointing at the old .openclaw, but the owner hasn't cleared the change — because this change requires restarting the MCP service, redoing a regression, and first taking a complete snapshot of MCP's current runtime state before the change. The cross-root scheduling issue with the company scheduler is more complex: I first need to list every cross-root call, then decide which to keep and which to pull back. Neither of these is something an afternoon can resolve; they need to be scheduled into the engineering rhythm of the next few weeks.

The archive zone isn't built yet. Right now I've only tagged the few copies judged archive-only, but haven't actually moved them into an independent, read-only, timestamped archive directory. That step has to wait until both stopping the bleed and migration are done — because once the move is finished, only a README remains in the original location, and if any caller is still pointing at the original location, that caller will fail outright.

Really delete: not a single one done. The earliest copy that could enter the "really delete" workflow, on a 3-month observation clock, won't come up until autumn.

I'm not anxious about this pace. 12 copies accumulated over a year can't possibly be cleaned in a week — and shouldn't be. If I cleaned them in a week, it would mean I skipped some steps; and skipped steps will eventually come back to find me in some other form.

Through this whole process I kept coming back to one line: a project lives past a certain age, and paths become risk.

A new feature failing to run is visible risk; old copies still running is invisible risk. Visible risk forces you to solve it; invisible risk indulges your procrastination. I'm no longer chasing a clean home directory — what I'm chasing is a home directory where I can articulate the nature of every copy, the ownership of every reference, and the lifecycle of every credential. The former is just tidy; the latter is real governance. These 12 copies, I'm still slowly closing them out.