The Unicode apostrophe in the Claude Code system prompt and an AI security audit scenario

Claude Code 2026: steganography and Anthropic's hidden Unicode fingerprint

Late June 2026: a reverse-engineering report on thereallo.dev finds that the Claude Code binary (not web Claude), when ANTHROPIC_BASE_URL points to an unofficial proxy, modifies the Today's date is... line in the system prompt via text steganography. The date separator and a visually identical Unicode apostrophe encode classification bits (TZ, proxy domain, keyword match) that reach the backend with every request. Anthropic removed the code in 2.1.197 (2026-07-01; the changelog is silent). Likely purpose: anti-distillation / anti-reselling; the dispute is over obfuscation and zero disclosure. Not to be confused with the silent Claude Desktop injection (April): events A and B are kept strictly separate below.

1. Three technical pain points: why audit the Claude toolchain now

  1. Trust boundary crossed without consent: per The Register and Alexander Hanff, Claude Desktop writes Native Messaging manifests into Chrome/Edge/Brave/Arc/Vivaldi/Opera/Chromium without any UI consent; Noah Kenney (Digital 520) reproduced it; Antiy Labs published a threat analysis.
  2. Covert classification channel in every HTTP request: Claude Code 2.1.193, 2.1.195, and 2.1.196 carry identical stego logic; domain/keyword list of ~147 entries, obfuscated with base64 + XOR (key=91, 0x5B); keywords: deepseek, moonshot, zhipu, minimax, baichuan, stepfun, 01ai, and others.
  3. Disclosure only after community RE: Anthropic acknowledged the code and shipped 2.1.197 on 2026-07-01 without a changelog entry; HN thread with 350+ points and 100+ comments, split between «legit anti-distillation» and «borderline malicious for devtools».

2. Two incidents: A vs B table

The HN/Reddit/security audience demands fact-level precision, and conflating the two events undermines credibility (E-E-A-T). Two independent but related incidents:

Parameter | Event A: silent browser injection | Event B: system prompt steganography
Artifact | Claude Desktop (macOS client binary) | Claude Code (CLI, npm/bundled binary)
Source | Alexander Hanff (privacy advisor, The Register) | Static/dynamic RE, thereallo.dev; amplification Reddit → HN
Timestamp | 2026-04 (~2026-04-18) | 2026-06-30
Payload behavior | Writes com.anthropic.claude_browser_extension.json to NativeMessagingHosts; pre-authorizes 3 extension IDs → chrome-native-host outside the sandbox; dirs created even without the extension installed; delete → recreated on Desktop restart | When ANTHROPIC_BASE_URL ≠ api.anthropic.com, mutates the Today's date is... line in the system prompt
Community label | spyware / backdoor | prompt steganography / covert channel
Trigger predicate | Desktop install/launch (unconditional) | Non-official Base URL only; not every session guaranteed
Vendor response | No formal public statement; later opt-in toggle, privilege model intact | Code acknowledged; removed in 2.1.197 (2026-07-01); silent changelog

3. Timeline and binary versions

Date / version | Milestone
2026-04 | Hanff / The Register: Claude Desktop Native Messaging injection; Malwarebytes, gHacks, YOOTA coverage
2026-06-30 | thereallo.dev publishes Claude Code prompt stego RE; HN frontpage
2.1.193, 2.1.195, 2.1.196 | RE confirms identical stego logic across all three builds
2026-07-01 | Claude Code 2.1.197 shipped: stego code stripped (no changelog note)

4. Mechanism: Unicode mapping and decode pipeline

Precondition (per RE report): ANTHROPIC_BASE_URL ≠ api.anthropic.com, i.e. a proxy, API gateway, third-party reseller, or OpenAI-compat shim. Official endpoint: no mutation.

On trigger, two bit-encoding ops hide inside benign Today's date is ... system string:

  1. Date separator bit: if system TZ ∈ {Asia/Shanghai, Asia/Urumqi}, rewrite 2026-06-30 → 2026/06/30 (hyphen → slash).
  2. Apostrophe codepoint swap: replace the apostrophe in Today's with a homoglyph Unicode variant, invisible to the human eye but machine-readable on the backend:
Match condition | Date format | Glyph | Codepoint
Default (no rule hit) | 2026-06-30 | ' (APOSTROPHE) | U+0027
CN institution/reseller domain match | 2026-06-30 | ’ (RIGHT SINGLE QUOTATION MARK) | U+2019
CN AI lab keyword match | 2026-06-30 | ʼ (MODIFIER LETTER APOSTROPHE) | U+02BC
Domain + keyword both hit | 2026-06-30 | ʹ (MODIFIER LETTER PRIME) | U+02B9
+ CN timezone overlay | 2026/06/30 | (apostrophe rules as above) | (as above)
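
A minimal sketch of how someone operating a proxy might read both signals back out of a captured system prompt, assuming the line has the shape shown in the table above. The regular expression, the function name classify_date_line, and the label strings are illustrative assumptions; none of them come from the Claude Code binary.

```python
import re

# Codepoint -> meaning, following the mapping table (per the RE report).
APOSTROPHE_SIGNALS = {
    "\u0027": "default (no rule hit)",
    "\u2019": "domain match",
    "\u02bc": "keyword match",
    "\u02b9": "domain + keyword match",
}

# Assumed shape of the line: "Today<glyph>s date is YYYY-MM-DD" or "YYYY/MM/DD".
DATE_LINE = re.compile(r"Today(.)s date is \d{4}([-/])\d{2}[-/]\d{2}")


def classify_date_line(line: str) -> dict:
    """Report which apostrophe codepoint and date separator a line carries."""
    match = DATE_LINE.search(line)
    if match is None:
        raise ValueError("no 'Today's date is' line found")
    glyph, separator = match.group(1), match.group(2)
    return {
        "codepoint": f"U+{ord(glyph):04X}",
        "apostrophe_signal": APOSTROPHE_SIGNALS.get(glyph, "unknown glyph"),
        "cn_timezone_flag": separator == "/",
    }
```

Calling classify_date_line on a line captured from a proxy log is enough to tell whether a given request carried a non-default glyph; a plain ASCII apostrophe with hyphens corresponds to the default row.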

Decode pipeline for the rule list: embedded blob → Base64 decode → byte-wise XOR with constant 91 (0x5B) → plaintext domain strings + lab keywords. About 147 rules in total. The backend parses codepoint + separator into a user classification tag without any explicit metadata header: a textbook covert channel carried in look-alike Unicode punctuation.
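
The decode step can be sketched in a few lines. This assumes the decoded plaintext is newline-separated UTF-8, which the report summary here does not specify; the function names and the sample entries are hypothetical, and encode_rules exists only so the example can build its own input instead of reproducing the real blob.

```python
import base64
from typing import List

XOR_KEY = 0x5B  # 91, the constant named in the RE report


def decode_rules(blob_b64: str) -> List[str]:
    """Base64-decode, XOR every byte with the key, split into entries."""
    raw = base64.b64decode(blob_b64)
    plain = bytes(b ^ XOR_KEY for b in raw)
    # Assumption: entries are newline-separated; the real layout may differ.
    return plain.decode("utf-8").split("\n")


def encode_rules(rules: List[str]) -> str:
    """Inverse of decode_rules, used here only to fabricate a test blob."""
    data = "\n".join(rules).encode("utf-8")
    return base64.b64encode(bytes(b ^ XOR_KEY for b in data)).decode("ascii")


# Placeholder entries, not taken from the actual rule list.
sample_blob = encode_rules(["proxy.example.invalid", "examplelab"])
decoded = decode_rules(sample_blob)
```

Single-byte XOR is its own inverse, which is why the same loop serves both directions; it hides strings from a casual `strings` pass but offers no real secrecy.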

Threat model note: the signal is exfiltrated inside the prompt body and slips past typical API logging, which does not hex-dump codepoints, so the homoglyphs read as ordinary apostrophes.
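
One way to close that logging gap, sketched under the assumption that you control the proxy's log formatter: render every non-ASCII character as an explicit codepoint tag before writing the prompt to the log. The function name reveal_non_ascii and the <U+XXXX> tag format are illustrative choices.

```python
def reveal_non_ascii(text: str) -> str:
    """Replace each non-ASCII character with a visible <U+XXXX> tag."""
    return "".join(
        ch if ord(ch) < 128 else f"<U+{ord(ch):04X}>"
        for ch in text
    )


# A right single quotation mark becomes visible in the log line.
logged = reveal_non_ascii("Today\u2019s date is 2026-06-30")
```

This makes legitimate non-ASCII text noisier in logs as well, so it fits an audit or debug log better than a user-facing one.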

5. Anthropic's motive: anti-distillation and the HN flame war

Community consensus (cautious): detect unauthorized distillation + API reselling. Anthropic/OpenAI/Google have publicly warned about bulk harvesting of API output for training smaller models. CN-linked proxies/resellers/labs are the primary suspicion vector, hence the deployed fingerprint logic.

Goal defensible, implementation toxic for a devtool: classification bits stego-encoded, code obfuscated, zero user-facing toggle, embedded per request. HN split:

  • Pro-defense camp: «Legitimate anti-distillation; technical countermeasure against API abuse required.»
  • Anti camp: «For a dev CLI this is borderline malicious — undeclared, non-disableable, hidden in punctuation.»

Intent (anti-distillation) ≠ proven fact; label it «per RE report» / «alleged». The dispute is over the means (covert, obfuscated, undisclosed), not the stated goal.

6. Spyware or covert channel: a precise taxonomy

«Spyware» — loaded term. Engineering taxonomy:

  • Event A: unauthorized third-party software modification + dormant high-privilege attack surface — pre-wires chrome-native-host bridge outside browser sandbox. Anthropic disclosed Claude for Chrome prompt injection success: 23.6% unmitigated, 11.2% mitigated.
  • Event B: undeclared covert telemetry / user bucketing — not file-stealing keylogger malware, but textbook covert channel in application-layer protocol (system prompt).

Common denominator: no informed consent + intentional concealment.

7. Decision matrix: risk × scenario

Scenario | Event A risk | Event B risk | Mitigation
Official api.anthropic.com only | Medium (Desktop may inject) | None | Audit Desktop Native Messaging; upgrade Code normally
Third-party proxy/gateway | Medium | High (≤ 2.1.196) | Force upgrade 2.1.197+; review proxy ToS
CN TZ + proxy | Medium | High (date + apostrophe dual signal) | Assume historical classification; migrate to auditable node
Enterprise CI/CD + Claude Code | High | High | Isolated runner, least privilege, ban Desktop agent on CI host
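
The Event B column of the matrix can be expressed as a small function, as a sketch. It assumes that the trigger is a plain hostname comparison against api.anthropic.com and that every version below 2.1.197 is affected, following the matrix; the report itself confirms only three builds. All names are hypothetical.

```python
from typing import Optional, Tuple
from urllib.parse import urlparse

OFFICIAL_HOST = "api.anthropic.com"
PATCHED = (2, 1, 197)  # first release without the stego code, per the report
CN_TIMEZONES = {"Asia/Shanghai", "Asia/Urumqi"}


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '2.1.196' (optionally followed by other text) into (2, 1, 196)."""
    first_token = version.strip().split()[0]
    return tuple(int(part) for part in first_token.split("."))


def event_b_risk(base_url: Optional[str], version: str, tz: str) -> str:
    """Rough Event B risk label for one machine, following the matrix."""
    if not base_url or urlparse(base_url).hostname == OFFICIAL_HOST:
        return "none"  # trigger predicate is false
    if parse_version(version) >= PATCHED:
        return "none (patched)"
    if tz in CN_TIMEZONES:
        return "high (date + apostrophe dual signal)"
    return "high"
```

A base URL written without a scheme parses to no hostname and is therefore treated as non-official here, which errs on the side of flagging.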

8. Five-step verification runbook

  1. Inspect ANTHROPIC_BASE_URL: echo $ANTHROPIC_BASE_URL. Empty or official → event B predicate false. Proxy URL → high-risk cohort.
  2. Upgrade Claude Code ≥ 2.1.197: stego stripped 2026-07-01. Verify: claude --version.
  3. Scan Native Messaging manifests (event A):
    find ~/Library/Application\ Support -name "com.anthropic.claude_browser_extension.json" 2>/dev/null
    Paths: ~/Library/Application Support/Google/Chrome/NativeMessagingHosts/ + Chromium forks. Desktop restart may recreate.
  4. TZ + proxy domain cross-check: sudo systemsetup -gettimezone (or readlink /etc/localtime); if Asia/Shanghai or Asia/Urumqi + proxy domain in the 147-rule set → dual signal likely fired historically.
  5. Enterprise isolation: decouple Claude Code from build secrets/prod repos; no Desktop agent on CI runner shared UID; enforce explicit consent + audit logs; consider dedicated remote Mac node.
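
Step 3 of the runbook can also be scripted, for example to run it across several machines. This is a sketch equivalent to the find command above; the function name find_manifests is hypothetical, and the root directory is a parameter so the same code can be pointed at any browser profile tree.

```python
from pathlib import Path
from typing import List

# File name reported for the Claude Desktop Native Messaging manifest.
MANIFEST_NAME = "com.anthropic.claude_browser_extension.json"


def find_manifests(root: Path) -> List[Path]:
    """Return every manifest file with the reported name under root."""
    if not root.is_dir():
        return []
    return sorted(p for p in root.rglob(MANIFEST_NAME) if p.is_file())
```

On macOS, find_manifests(Path.home() / "Library" / "Application Support") covers the same tree as the find command; an empty list means no manifest was found at scan time, not that none will be recreated after a Desktop restart.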

9. Vendor overreach: what an engineer should do

The core issue is not the apostrophe glyph but the gap between exploding model capability and the maturity of security boundaries, consent, and audit. Vendors cross trust boundaries in the name of «UX» or «abuse prevention»; the desktop AI agent phase is repeating the early security failures of PCs and smartphones.

Engineering response stack:

  1. Default deny trust; require reproducible evidence.
  2. Demand disclosure over steganography — anti-distillation can be explicit with opt-out.
  3. Least privilege + blast-radius isolation for any desktop agent binary.
  4. Market + regulatory pressure as ultimate bound on boundless tech.

Capability ↑ ⇒ self-constraint ↑, practiced openly rather than kept as a secret the community has to uncover through binary RE.

10. FAQ

Q: Claude Code — spyware?
Not classic spyware; per RE — undeclared obfuscated fingerprints in system prompt; removed 2.1.197. Accurate: undeclared covert channel.

Q: Timezone detection?
Only with non-official ANTHROPIC_BASE_URL: Asia/Shanghai or Asia/Urumqi → separator - to /.

Q: Unicode apostrophe trick?
U+0027 / U+2019 / U+02BC / U+02B9 encode, respectively: default, domain hit, lab keyword, both.

Q: Why Anthropic added this?
Community: distillation + unauthorized reselling detection — valid goal, unacceptable covert implementation.

Q: Same as Desktop spyware?
No. Desktop injection = event A (April); Code stego = event B (June 30).

Q: Web Claude users affected?
Event B: Claude Code + non-official Base URL only.

Q: Remove Desktop injection file?
Delete com.anthropic.claude_browser_extension.json from NativeMessagingHosts; restart may recreate.

Q: Trust Anthropic?
Risk tolerance + compliance driven; evidence-based decision; enterprise → independent audit.

11. Sources and disclaimer

  • The Register: Claude Desktop changes software permissions without consent (2026-04)
  • Malwarebytes / gHacks / YOOTA: Claude Desktop native messaging
  • thereallo.dev: Claude Code prompt steganography (original RE)
  • Tech Startups / TMC Insight / Developers Digest / TechTimes: event B + 2.1.197 fix
  • Antiy Labs: high-privilege browser channel analysis
  • Hacker News thread (350+ points, ~2026-06-30)

Compiled from public reports + RE; motivation (anti-distillation) and means (stego) evaluated separately; Anthropic intent marked «per community / per RE»; not legal advice. Last updated: 2026-07-03.

12. Remote Mac and the SFTPMAC bridge

Claude Code + Claude Desktop privilege problem reduces to: where does high-privilege agent run, and what data shares its UID namespace? Laptop or CI-mixed host = build secrets + SSH keys + prod repos in same user context — uncontrollable blast radius when Native Messaging injects or covert classification fires.

Harder mitigation: isolate Claude Code/agent workflows on a dedicated always-on macOS node, physically separated from the browser, the Desktop client, and the daily driver; sync the workspace via SFTP/rsync for rollback + audit trail. This aligns with the AI coding assistants comparison and the Claude Fable 5 export control remote Mac 7×24 agent node guidance.

If evaluating Claude toolchain trust boundaries: decouple agent from sensitive assets → isolated auditable Apple Silicon remote node. SFTPMAC remote Mac rental — Claude Code / OpenClaw always-on env: native launchd, SSH/SFTP dir isolation, CI/CD artifact sync baseline; beats «home Mac as agent + daily browser» for least-privilege teams.