‹ Back to notes

Field Note

Claude Code Was Marking the Prompt

Claude Code 提示词里多了标记

A Chinese security graphic claimed Claude Code had a hidden backdoor for detecting Chinese users. I checked the Chinese mirrors, GitHub issue, technical write-up, official changelog, npm package, and my own machine. The mechanism exists and is reproducible in the package. That is not the same as proving local backdoor behavior.

Claude Codeprompt-markerforensicsprivacyAI-securitypublic-safe

AI Tooling · Local Forensics

A FreeBuf-style graphic came across my desk today claiming Claude Code had a hidden backdoor for detecting Chinese users. I did not start by believing it, and I did not start by dunking on it. I checked the public claims, the official changelog, the npm package, and my own machine.

My conclusion: the mechanism exists. Under specific conditions, Claude Code reads ANTHROPIC_BASE_URL, checks whether the system timezone is Asia/Shanghai or Asia/Urumqi, compares the host against an obfuscated domain list, and then changes one boring-looking date sentence in the system prompt. But I would not write this as “local backdoor proven.” What I can verify is an undisclosed prompt marker, or environment fingerprint. Bad. Weird. Not the same as proven remote code execution, file theft, or local directory exfiltration.

YunLab local verification screenshot showing Claude Code package SHA256, decoded domain count, timezone, and ANTHROPIC_BASE_URL status.
My local verification of the 2.1.197 macOS arm64 package: SHA256, decoded domain count, local timezone, and whether the current shell has the proxy variable set.

The Claim

The Chinese spread is real. A mirror of the FreeBuf article says the mechanism dates back to 2.1.91 and claims it uses proxy state, timezone, and domain matching to hide environment signals inside prompt characters. That style of article has a problem: it collapses “the mechanism exists” and “this is a backdoor” into one sentence. The useful question is narrower: what triggers it, what does it encode, and what evidence exists for anything beyond prompt marking?

Screenshot of a FreeBuf mirrored Chinese page about the Claude Code hidden backdoor claim.
The Chinese mirror proves the claim is circulating. It does not prove every word in the headline.

How I Checked It Online

The GitHub issue, #72518, puts the accusation in a concrete form: the code appears to detect proxy usage, Chinese timezone settings, and proxy URL characteristics, then encode the result into subtle system-prompt differences.

Screenshot of GitHub issue 72518 in anthropics/claude-code.
The issue is useful because it turns the outrage into a checkable technical question.

The most useful public write-up I found was thereallo.dev's technical post. It lays out the function shape, trigger, timezone checks, XOR-91 domain list, and the final prompt string change.

Screenshot of thereallo.dev technical post about Claude Code prompt steganography.
The valuable part is the function logic and encoding path, not the label people attach to it.

The official changelog matters too. Version 2.1.196 explicitly says Remote Control is disabled when ANTHROPIC_BASE_URL points at a non-Anthropic host. That proves the custom base URL path was an active product boundary. I did not find a public changelog note for the prompt-marker behavior itself.

Screenshot of official Claude Code changelog around version 2.1.196.
The changelog documents proxy-related behavior, but not this prompt-marker mechanism.

My Own Local Evidence

I pulled the 2.1.197 macOS arm64 package and inspected the binary locally. This is the difference between “someone online said it” and “I can reproduce the key evidence from the package on this machine.”

  • Package evidence: the 2.1.197 macOS arm64 binary hash is 8cc0c4d1e4eb1dca3b0cc92ab02ee3505de764e023f8c901761c167b72041fb8.
  • String evidence: the package contains ANTHROPIC_BASE_URL, Asia/Shanghai, Asia/Urumqi, and Today${n}s date is.
  • Domain evidence: I decoded 147 domains locally. Samples include cn, baidu.com, alibaba-inc.com, bytedance.net, moonshot.ai, aliyuncs.com, and openclaude.me.
  • Machine evidence: my PATH currently runs claude 2.1.150; the system timezone is Asia/Shanghai; the current shell does not set ANTHROPIC_BASE_URL, so this shell does not trigger that path.

Reduced into a decision path, the package does roughly four things: check whether ANTHROPIC_BASE_URL exists; extract the host if it does; compare timezone and host against the China timezone and decoded domain list; then decide whether the date uses slashes and which Unicode apostrophe appears in Today's. The marker is not random. It is built from those environment classifications.

The boundary matters: China timezone alone is not enough. This path becomes interesting when Claude Code is routed through a custom base URL and the host lands in a domain or keyword category the client cares about.

What It Actually Sends

It does not spin up a separate network call. It changes the system context that is already going to the model.

Today's date is 2026-07-01.

Under the marked condition, two tiny things can change: the date separator becomes a slash for the China timezone case, and the apostrophe in Today's becomes one of several Unicode lookalikes depending on host classification.

To a person, the sentence still looks boring. To a backend that knows the rule, it can encode whether this looks like a China timezone, a known host, or a lab-keyword host.

Why It Still Matters

I understand why an API provider wants to detect resellers, unauthorized gateways, and model-distillation pipelines. A custom ANTHROPIC_BASE_URL is a useful signal.

The implementation is the problem. Claude Code is not a normal web app. It is a developer tool that reads files, edits code, runs shell commands, touches git, and sometimes talks to a browser. That kind of client should be boring, visible, and easy to audit. If it wants to detect custom gateways, it should say so in docs, release notes, telemetry fields, or policy settings.

Hiding classification bits in prompt punctuation makes every privacy claim harder to trust.

What I Will Not Claim

I will not claim this proves Claude Code steals local files. I do not have that evidence.

I will not claim every Chinese user is being tagged. From this path, timezone alone is not enough.

The careful version is enough: Claude Code contains an undisclosed prompt-steganography marker that combines custom base URL, timezone, domain matching, and Unicode punctuation to encode environment classification into the system prompt.

The word I remember is not “backdoor.”

It is that a shell-capable developer tool hid environment classification inside a harmless-looking date sentence. That design should not require reverse engineering to be seen.

Turn this note into a route

After reading, ask a follow-up, return to the curated archive, or use the tag index to follow the same thread.

Ask about this Open archive Browse tags