OpenClaw Studio stage retrospective
In one day I pushed OpenClaw Studio from stage one to stage seven and shipped RC1. It sounds like a sprint. It was closer to defusing mines.
OpenClaw Studio is the local AI working system I'm building. Locally it handles three things: morning check, task routing, and content production. All the heavy external pieces — DeerFlow (a deep-research pipeline), Mission Control Web (the web frontend for Mission Control), Gateway (the external gateway), LLM access, Search — are managed by gates. If a gate isn't unlocked, that piece isn't wired in.
RC1 is not just a version number. To me it's a boundary — inside the boundary, things can run on their own; outside the boundary, they have to honestly write down "not passed." This retrospective is about how that boundary got drawn, step by step, and where it still leaks.
Why I forced all seven stages into one day
I didn't set out to compress all seven stages into one day. I kept doing it and kept noticing: the dependencies between stages are only fresh within a single day. Once a night passes, the state memory of the previous stage gets fuzzy, and the next stage is forced to re-verify. What should have been one confirmation step becomes three.
So I concentrated my fire and chained all the stages together: a PASS in one stage lets the next begin, and any gate failing in between is an immediate STOP. It sounds like walking a tightrope, but it ran more stably than dragging the work out across several days. My attention didn't scatter, the decision path stayed hot, and when something went wrong I could go back and fix the previous step right away.
The price of single-day execution is density — seven stages, more than a dozen reviews, more than twenty dry-runs. The moment you get tired mid-way, it's easy to let things slide, and letting things slide is when accidents happen. So I set myself three rules I would not break. The three below are the actual spine of RC1.
Method one: dry-run first, real run second
Before every stage began, I made the system dry-run first — no file writes, no messages sent, no external state touched. Just walk the flow end to end, and tell me "if this were real, here's what it would do."
The first time dry-run showed up, I treated it as a "confirmation step." Later I realized it's far more than that — it's "making the AI expose its own understanding." After one dry-run, the AI writes out the paths it plans to access, the commands it plans to run, the files it plans to modify, the external interfaces it plans to call. Eight times out of ten, that's where I catch a misunderstanding: it's treating a state file like a draft, it's about to access an external piece that hasn't been unlocked, it's about to write a "passed" conclusion into a position that hasn't been confirmed yet.
If you don't catch these misunderstandings in the dry-run, they become incidents in the real run. So my rule now is — anything that can change external state must pass dry-run first; only then am I allowed to do the real run.
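A minimal sketch of the dry-run rule, not the actual OpenClaw Studio code. The `Runner` class, the `stage_task` function, and the `state/stage1.md` path are all hypothetical names made up for illustration. The idea it shows is that every state-changing action is recorded as a plan line first, and only touches the disk when dry-run is switched off.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Runner:
    """Records every intended write; performs it only when dry_run is False."""
    dry_run: bool = True
    plan: list[str] = field(default_factory=list)

    def write_file(self, path: Path, text: str) -> None:
        # Always record the intent so a human can review the plan.
        self.plan.append(f"WRITE {path} ({len(text)} chars)")
        if not self.dry_run:
            path.parent.mkdir(parents=True, exist_ok=True)
            path.write_text(text, encoding="utf-8")


def stage_task(runner: Runner, workdir: Path) -> None:
    """Hypothetical stage step that updates a state file."""
    runner.write_file(workdir / "state" / "stage1.md", "status: frozen\n")
```

Running `stage_task(Runner(dry_run=True), workdir)` leaves the directory untouched and fills `plan` with what would have happened. The same call with `dry_run=False` performs the write. Reviewing `plan` before flipping the flag is the point where a misunderstanding gets caught.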
Method two: gate isolation — if it doesn't pass, STOP
The hard part of OpenClaw Studio isn't writing code, it's controlling the boundary. There's plenty that can run locally, but once an external dependency isn't unlocked, it can't be touched — can't pretend to work, can't be skipped and patched later, and definitely can't be released because "it should be fine."
So every external dependency gets a gate. A gate has exactly three states: not passed, passed, pending review. For a not-passed dependency, the whole system treats it as nonexistent; the moment any stage task touches that dependency, it stops and waits for human confirmation.
The gate matrix (the master table of all gate states) is the single most important document in RC1 — more important than any architecture diagram. It's not decoration; it's a real runtime constraint. This time, across seven stages, DeerFlow, Mission Control Web, Gateway, LLM, and Search were all not-passed; the entire external chain was dark. But the local chain, because of gate isolation, could run all the way through.
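To make "a real runtime constraint" concrete, here is a rough sketch of a gate matrix used as a check rather than as documentation. The names `GateState`, `GateStop`, `GATE_MATRIX`, and `require_gate` are hypothetical. Two behaviors are my assumptions rather than something stated above: unknown dependencies fail closed, and "pending review" stops a task the same way "not passed" does.

```python
from enum import Enum


class GateState(Enum):
    NOT_PASSED = "not passed"
    PASSED = "passed"
    PENDING_REVIEW = "pending review"


class GateStop(Exception):
    """Raised when a task touches a dependency whose gate is not passed."""


# Gate states as described for RC1: the whole external chain is dark.
GATE_MATRIX = {
    "DeerFlow": GateState.NOT_PASSED,
    "Mission Control Web": GateState.NOT_PASSED,
    "Gateway": GateState.NOT_PASSED,
    "LLM": GateState.NOT_PASSED,
    "Search": GateState.NOT_PASSED,
}


def require_gate(dependency: str) -> None:
    """Stop unless the gate for `dependency` is explicitly passed."""
    # Assumption: a dependency missing from the matrix is treated as not passed.
    state = GATE_MATRIX.get(dependency, GateState.NOT_PASSED)
    if state is not GateState.PASSED:
        raise GateStop(
            f"STOP: gate for {dependency!r} is {state.value}; "
            "waiting for human confirmation"
        )
```

A task that needs Search would call `require_gate("Search")` before doing anything else. With the matrix above, that call raises `GateStop`, so the task halts instead of pretending to work.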
Method three: zero out-of-bounds writes
The more proactive an Agent is, the more likely it causes trouble. Ask it to read something, and it edits it on the side. Ask it to analyze something, and it starts refactoring. Ask it to check something, and it writes the check result somewhere it shouldn't.
So I signed a "contract" with each agent — explicitly spelling out where it can read, where it can write, and what it absolutely cannot touch. RC1 has 7 such contracts in total: one per agent, three columns (Read / Write / Forbidden), no gray area at all.
The contract itself isn't complicated. The key is that once it's signed, it's actually used as an interceptor. Any write that isn't in the "Write" list is rejected directly by the tool layer, with no room for the agent to maneuver. This move raised RC1's sense of safety a notch above earlier versions — I no longer worry that some agent will, on a whim, modify my reference directory.
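The interceptor idea can be sketched roughly like this. It is an illustration under assumptions, not the real tool layer: `Contract`, `check_write`, and the example directory names are hypothetical, and I assume the Forbidden column overrides the Write column when they overlap. Paths are compared as plain relative paths, so a real interceptor would need to normalize them (resolve `..` and symlinks) before checking.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Contract:
    """One agent's three columns: Read / Write / Forbidden."""
    agent: str
    read: tuple[PurePosixPath, ...]
    write: tuple[PurePosixPath, ...]
    forbidden: tuple[PurePosixPath, ...]


def _inside(path: PurePosixPath, roots: tuple[PurePosixPath, ...]) -> bool:
    # True when `path` is one of `roots` or sits somewhere below one of them.
    return any(path == root or root in path.parents for root in roots)


def check_write(contract: Contract, target: PurePosixPath) -> None:
    """Reject any write outside the Write column; Forbidden always wins."""
    if _inside(target, contract.forbidden) or not _inside(target, contract.write):
        raise PermissionError(
            f"{contract.agent}: write to {target} is outside the contract"
        )
```

With a contract whose Write column is `workspace/drafts`, a write to `workspace/drafts/post.md` passes silently, while a write to `workspace/reference/notes.md` raises `PermissionError` before any file is touched.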
Seven stages, each closing a specific risk
The seven stages aren't split by feature; they're split by risk. Each stage solves one class of risk, and only when it fully passes does the next stage begin. The benefit is — whenever something goes wrong, you can pin it precisely to the previous gate, instead of debugging from scratch.
- Stage one: freeze the environment — confirm the local working directory, state files, and tool versions, all under dry-run.
- Stage two: write the agent contracts clearly — where to read, where to write, when it must stop.
- Stage three: lay out all external gates — no external dependency can be triggered in an unconfirmed state.
- Stage four: get the content factory running — five content templates, the registry (the ledger for content), and the review pipeline, all closed-loop locally.
- Stage five: hook up the markdown version of Mission Control — auto-update, auto-backup.
- Stage six: master integration — run a full task using the outputs of the previous five stages together, and look at the coupling points.
- Stage seven: deliver RC1 — freeze the docs, mark the first release candidate.
The audit trail for each stage (including all rerun sub-versions, sub-task nodes, and closeout nodes) lives in the local audit directory. No need to paste it into a public article. What matters is the staging philosophy itself — cut by risk, not by feature.
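The chaining rule (a PASS lets the next stage begin, anything else is a STOP) is simple enough to sketch in a few lines. This is a hypothetical illustration: `run_chain` and the idea of each stage being a function that returns `True` on PASS are my assumptions, not the actual runner.

```python
from typing import Callable

# A stage is a name plus a check that returns True on PASS.
Stage = tuple[str, Callable[[], bool]]


def run_chain(stages: list[Stage]) -> list[str]:
    """Run stages in order; the first one that does not PASS stops the chain."""
    log: list[str] = []
    for name, check in stages:
        if check():
            log.append(f"{name}: PASS")
        else:
            log.append(f"{name}: STOP")
            break  # later stages never start
    return log
```

Because the loop breaks at the first failure, the last line of the log always names the stage to go back to, which is what makes it possible to pin a problem to the previous gate.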
What RC1 actually delivers
At the end of the day, what's usable on RC1 isn't much — but every item has been verified:
- The local working system runs on its own — morning check, task routing, content production, none of it depends on any external interface.
- The content factory works end to end: registry + five template types + review pipeline. A piece of content can go from draft to review-passed with evidence at every step.
- The markdown version of Mission Control is hooked up — task state updates automatically, backs up automatically, no manual sync needed.
- The 7 agent contracts are in place — every agent's read/write boundary is explicit; no contract, no work.
- The gate matrix is documented — any not-passed external dependency is automatically stopped, no possibility of being quietly let through.
- The role shuffle also landed: primary control and write authority moved from "Ying Zheng" (the previous lead agent) to "Zhao Zilong" (the new one), and the other roles were demoted to work under that single lead. Once that was written into the contracts, all writes funneled through the single Zhao Zilong agent, and problem tracing got much faster.
What RC1 does not deliver
I want this section to be even clearer — the essence of RC1 is "the part that runs locally," not "the whole system is mature." The following are still unresolved:
- The external systems are all held back by gates: DeerFlow, Mission Control Web, Gateway, LLM access, and Search are all still outside the boundary. RC1 therefore can't take any task that requires an external capability.
- The sync mechanism for the 4 anchors (the key files used for long-term state sync) is still not observable — when local state changes, whether those anchors are actually synced, and when, has no automatic verification today. Drift risk is hanging there.
- 26 secrets are still pending cleanup, and they continue to block reinstalling another machine. RC1 solved the local working system, not the hygiene of the whole machine.
- Multiple control planes coexist — local RC1, Mission Control Web, and the future external gateway overlap in responsibility, and who governs whom is not really settled.
- The history entry draft (HISTORY_ENTRY_DRAFT, the draft file that merges this round of engineering output into the long-term history ledger) hasn't been closed. This seven-stage run has its audit trail, but that trail hasn't been merged into the long-term history ledger, and a few months from now I'll need to come back and close it.
Writing out "what wasn't delivered" matters more than writing out "what was delivered" — it stops me, months later when I look back at RC1, from treating it as "already done."
The real state of RC1: I'm still working on it
For me, RC1 isn't an endpoint. It's a starting point I can keep walking from.
The local loop working means I have a working substrate I can maintain without depending on the outside — but what it can do is still narrow. Unlocking external gates is a long road, not a one-or-two-week thing; anchor sync, secret cleanup, control-plane closeout, each has to queue up and be done on its own.
My rhythm now: each week, pick one gate and try to push it forward one step. If it can be unlocked, unlock it; if it can't, write "why not" into the audit. Each anchor drift I resolve gets a "verified" mark; each batch of secrets I clean up gets the corresponding position marked "stale" pending the next review. I'm not chasing the RC2 or RC3 version number, but I want RC1's boundary to stay clear: what can run, what can't, and why are three questions that should always have answers.
So this is a stage retrospective, not a release celebration. RC1 keeps getting modified offline; even as I finish writing this piece, the next round of gate tests is already queued. The next OpenClaw retrospective will probably start from either "a gate finally unlocked" or "a gate I thought would unlock got pushed back."
RC1 is the version of OpenClaw Studio I'm most satisfied with so far — satisfied not because it's done, but because for the first time it tells me clearly what I can do, what I can't, and where to push next.
Before reaching this point, every "it runs" was an illusion. Real "it runs" comes with gates — local runs, external is governed, writes have contracts, state has audits. If this setup keeps getting polished, RC1 will one day become obsolete, replaced by a version without the RC suffix.
Until then, I'm still working on it.