My Knowledge Model

Watercolor sketch: notebooks scattered beside construction scaffolding, with a folded 'knowledge map' next to them

Notes on the Knowledge Model

What I eventually figured out: the dangerous thing in an AI project isn't forgetting — it's remembering wrong and continuing to build on it.

I used to think of long-term knowledge as something close to "archives." Save the chats, save the terminal output, save the reports, save the screenshots — and the next AI would be able to pick it up.

The real situation isn't like that. An AI project's working ground isn't a library; it's more like a construction site. Today Claude is discussing direction in one window, Codex is editing files locally, Kimi is digesting a long document, and the browser has a deploy page, a preview page, and a ChatGPT conversation open. Every one of these spots is producing material: a sentence of judgment, a chunk of logs, a report, a screenshot, a status that just barely passed.

On the day itself, I obviously know which item is a draft, which is an in-between state, and which PASS came with a warning. The problem is that a few days later the site has dispersed and only the materials remain. When the next AI walks in, what it sees isn't a "site" but a pile of fragments that all look like evidence.

Without compression at this point, the system doesn't get smarter — it just gets dirtier.

Compressing chat, outputs, and reports from the AI construction site into long-term knowledge

The real conflict: more material, less stable judgment

I first assumed the problem was "the AI forgot." Later I found the more common problem was "the AI remembered wrong."

Say a line of work first gets scored 95, then an independent audit knocks it back to 77, and later the gaps get closed and it comes back up to 96. All three numbers are real, but they can't be lumped together and used directly. If the next AI only sees 95, it might push toward launch; if it only sees 77, it might re-fix things that were already closed; if it only sees 96, it might forget that the 96 is still a local controlled candidate, not a production release.

This isn't a memory-quantity problem. It's a judgment-structure problem. When material hasn't been compressed into "currently-usable conclusions," it becomes a contamination source in the next round of collaboration.
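One way to picture "judgment structure" is to store each score together with where it came from, what it covers, and whether something later replaced it. This is only a minimal sketch: the `Assessment` class, its field names, the `current_conclusion` helper, and the source labels on the first and third entries are all hypothetical, made up for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Assessment:
    """One scored judgment plus the context needed to use it safely (hypothetical schema)."""
    score: int
    source: str                          # who or what produced the score
    scope: str                           # what the score actually covers
    superseded_by: Optional[int] = None  # index of the later entry that replaced this one


def current_conclusion(history: list[Assessment]) -> Assessment:
    """Return the only assessment that nothing has superseded."""
    live = [a for a in history if a.superseded_by is None]
    if len(live) != 1:
        raise ValueError(f"expected exactly one live assessment, found {len(live)}")
    return live[0]


# The three scores from the example above. All are real, but only one is current.
history = [
    Assessment(95, "initial scoring", "unspecified", superseded_by=1),
    Assessment(77, "independent audit", "unspecified", superseded_by=2),
    Assessment(96, "re-check after gaps closed", "local controlled candidate"),
]

live = current_conclusion(history)
print(f"{live.score} ({live.source}); scope: {live.scope}")
```

The older entries stay in the history as evidence, but a reader who asks for the current conclusion gets only the 96, and gets it with its scope attached.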

The impact is concrete. Every time the window changes, the AI changes, or the day changes, I have to re-explain: that report is stale, the other one is just an owner report — not an independent audit; this PASS is local-only, not a launch authorization; that conclusion was just to keep the discussion moving and got overturned afterward.

Wasted time is just the surface. The more serious problem is that old judgments revive, in-between states get misread as completed, and the AI keeps writing plans on top of wrong premises. The more fluently it writes, the more easily it pulls me along.

A wrong fix I tried: save everything

The most natural reaction is to add memory. Afraid of losing things — save them all. Save chat, save logs, save reports, ideally make every sentence searchable afterward.

But this approach backfires quickly. Saving everything answers "do we have the material," not "is this material still usable now." A vector store can recall similar content, but it doesn't automatically know which sentence was only exploratory, which report has been overturned by an audit, or which conclusion applies only to a local preview.

I stopped chasing "let the AI remember more." I started caring about something else: let it misuse less.

That's where my Knowledge Model came from. It's not a new term, and it's not another knowledge-base tool. It's a set of end-of-day rules for the site: of the materials produced each day, which stay as evidence only, which enter the current state, which become long-term rules, and which should be thrown out outright.

My method: four questions decide whether a piece of material gets promoted

Now I don't put a piece of material into long-term memory just because it "looks useful." It has to pass four questions:

One: what is the problem? Not a vague "lacked context," but a precise statement of which judgment was wrong, which boundary got blurred, which state got misread.
Two: what impact did it cause? Did it waste a handoff, revive an old conclusion, or almost let a local-only PASS be treated as a launch authorization?
Three: how do I view this problem now? What changed in the way I judge, and why can't the old way be used anymore?
Four: what do I do next time? Should it become a rule, a checklist, a skill, or a task contract, or should it only be kept as raw evidence?

If these four can't be answered, don't rush to call it knowledge. At best it's a record.
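The four-question gate can be sketched as a small check. This assumes each answer is written down as free text. The `Candidate` class, its field names, and the two return labels are hypothetical, chosen only to mirror the questions above.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A piece of material being considered for long-term memory (hypothetical schema)."""
    problem: str = ""          # which judgment was wrong, which boundary got blurred
    impact: str = ""           # what it cost or nearly caused
    judgment_change: str = ""  # how I view the problem now, and why the old way fails
    next_action: str = ""      # rule, checklist, skill, task contract, or evidence only


def promotion_decision(candidate: Candidate) -> str:
    """Return "promote" only when all four questions have a non-blank answer."""
    answers = [
        candidate.problem,
        candidate.impact,
        candidate.judgment_change,
        candidate.next_action,
    ]
    if all(answer.strip() for answer in answers):
        return "promote"
    return "keep as record"


complete = Candidate(
    problem="A local-only PASS was read as a launch authorization.",
    impact="The next window started planning a release.",
    judgment_change="A PASS means nothing without its scope.",
    next_action="Add a checklist item: every PASS must state its scope.",
)
partial = Candidate(problem="Lacked context.")

print(promotion_decision(complete))
print(promotion_decision(partial))
```

A check like this only confirms that the answers exist. Whether they are specific enough is still a human judgment.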

The Knowledge Model compresses from raw material, work fragments, and judgments down to rules and long-term assets

I split knowledge into four layers, and don't let them impersonate each other

Now I put every piece of material into one of four places.

Raw evidence layer: chats, logs, screenshots, test output, diffs. Their job is to prove something happened — not to directly guide the next step.
Current state layer: task README, notes, handoff, state files. Their job is to answer "where exactly are we right now."
Long-term rules layer: AGENTS.md, skills, checklists, memory entries. Their job is to change AI behavior going forward.
Public expression layer: articles, retrospectives, methodology. Their job is to rewrite internal experience into experience others can understand too.

These four can't mix. Evidence can't impersonate state, state can't impersonate rules, and rules shouldn't be directly exposed as a public article. A lot of past confusion came from these layers mixing: a chat treated as a rule, a report treated as a verdict, a memory summary treated as current state.

The value of the Knowledge Model is right here: it doesn't help me remember everything; it forces me to put things in the right place.
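A rough sketch of "don't let layers impersonate each other," assuming each artifact is tagged with a kind. The `Layer` enum, the `KIND_TO_LAYER` table, and the `check_use` helper are hypothetical names. The kinds themselves are taken from the four lists above.

```python
from enum import Enum


class Layer(Enum):
    RAW_EVIDENCE = "proves something happened"
    CURRENT_STATE = "answers where we are right now"
    LONG_TERM_RULES = "changes AI behavior going forward"
    PUBLIC_EXPRESSION = "explains the experience to others"


# Which layer each kind of artifact belongs to, following the lists above.
KIND_TO_LAYER = {
    "chat": Layer.RAW_EVIDENCE,
    "log": Layer.RAW_EVIDENCE,
    "screenshot": Layer.RAW_EVIDENCE,
    "test output": Layer.RAW_EVIDENCE,
    "diff": Layer.RAW_EVIDENCE,
    "task README": Layer.CURRENT_STATE,
    "notes": Layer.CURRENT_STATE,
    "handoff": Layer.CURRENT_STATE,
    "state file": Layer.CURRENT_STATE,
    "AGENTS.md": Layer.LONG_TERM_RULES,
    "skill": Layer.LONG_TERM_RULES,
    "checklist": Layer.LONG_TERM_RULES,
    "memory entry": Layer.LONG_TERM_RULES,
    "article": Layer.PUBLIC_EXPRESSION,
    "retrospective": Layer.PUBLIC_EXPRESSION,
    "methodology": Layer.PUBLIC_EXPRESSION,
}


def check_use(kind: str, used_as: Layer) -> None:
    """Raise ValueError if an artifact is being used as a layer it does not belong to."""
    actual = KIND_TO_LAYER[kind]
    if actual is not used_as:
        raise ValueError(
            f"{kind!r} belongs to {actual.name} but is being used as {used_as.name}"
        )


check_use("handoff", Layer.CURRENT_STATE)  # fine: a handoff describes current state

try:
    check_use("chat", Layer.LONG_TERM_RULES)  # a chat treated as a rule
except ValueError as err:
    print(err)
```

The table is the whole point: the lookup does nothing clever, it just refuses to let a chat stand in for a rule.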

The result: memory doesn't grow, handoff gets lighter

What this method produces isn't "the AI suddenly understands everything" — it's that handoff cost drops.

A new window shouldn't have to read through full chat history. It should look at current state first, then recent notes, then handoff, then chase raw evidence as needed. It shouldn't infer current conclusions from a pretty report, and certainly shouldn't decide the next step straight from a memory summary.
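The reading order for a new window can be written down as a tiny planner. This is a sketch under assumptions: the source names (`"current state"`, `"recent notes"`, `"handoff"`, `"raw evidence"`) and the `reading_plan` function are hypothetical labels, not a real tool's interface.

```python
from __future__ import annotations

# Fixed order for a fresh window: state first, then notes, then handoff.
READ_ORDER = ["current state", "recent notes", "handoff"]


def reading_plan(available: set[str], need_evidence: bool = False) -> list[str]:
    """List what a new window should read, in order.

    Raw evidence is appended only when explicitly requested. Anything else,
    such as a memory summary or a polished report, is never part of the plan.
    """
    plan = [source for source in READ_ORDER if source in available]
    if need_evidence and "raw evidence" in available:
        plan.append("raw evidence")
    return plan


available = {"handoff", "raw evidence", "current state", "memory summary"}
print(reading_plan(available))
print(reading_plan(available, need_evidence=True))
```

Note that the memory summary is available in this example but never makes it into the plan, which matches the rule that the next step isn't decided from a summary.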

For me, long-term knowledge isn't a warehouse of "what I once said." It's a tool for "what I don't want to repeat next time." A construction site without an end-of-day closeout is just a pile of materials. Only after the day is closed out can it become an asset usable at the next start of work.

So my Knowledge Model isn't about making the AI remember more — it's about making the AI misuse old material less.

At the end of each task, what I actually want to leave behind isn't the full chat but four things: the problem, the impact, the change in judgment, and the next action. Answer these four and the experience earns its place in long-term knowledge. Can't answer them? Keep it in the evidence layer, and don't let it pollute future judgment.